
About this role
Data sits at the heart of the company. This role ensures that Awin uses data across the group to build reporting that enables better commercial decisions and best-in-class campaign management in support of client-facing departments. Design, build, and optimize scalable data pipelines, curated datasets, and analytical data models in Azure, AWS, and Databricks environments.
Work with large-scale datasets to improve performance and reliability, and translate business logic into well-structured tables, metrics, and transformation rules. Build and maintain ETL/ELT pipelines using Databricks, PySpark, Spark SQL, and Delta Lake. Develop production-ready notebooks, workflows, and data lake integrations while applying Spark optimization best practices.
Design curated datasets, semantic layers, and data marts that power analytics and reporting. Partner with business stakeholders, product owners, and analysts to align datasets with business processes and decision-making needs. Document data models, lineage, logic, and dataset behavior clearly.
Optimize transformations and storage for scale and cost; tune PostgreSQL databases and implement robust data validation and quality checks. Collaborate with engineers, analysts, and stakeholders to deliver reliable solutions. Communicate effectively with technical and non-technical audiences in Agile environments.
Requirements
- Bachelor’s degree in Computer Science, Data Engineering, or related field
- Hands-on experience with Databricks (PySpark, Spark SQL, Delta Lake, Lakebase) and PostgreSQL
- Background working with large, distributed datasets
- Proficiency in Python, PySpark, and SQL
- Experience with data modeling, curated datasets, semantic layers, and medallion architecture
- Experience with AWS (Lambda, CloudWatch, and Step Functions)
- Competence in using Datadog or similar observability/monitoring platforms
- Strong debugging and problem-solving skills, comfort working in Agile environments, and a commitment to thorough documentation
Responsibilities
- Build and maintain ETL/ELT pipelines using Databricks, PySpark, Spark SQL, and Delta Lake
- Develop production-ready notebooks, workflows, and data lake integrations
- Apply best practices for Spark optimization, including partitioning, caching, minimizing shuffles, and file compaction
- Design curated datasets, semantic layers, and data marts that power analytics and reporting
- Partner with business stakeholders to understand requirements and align datasets with business processes
- Convert business requirements into data models defining tables, metrics, KPIs, and transformation rules
- Work with large datasets, optimizing transformations and storage; tune PostgreSQL databases
- Implement robust data validation, schema enforcement, and quality checks across pipelines
Benefits
- Flexi-Week: a flexible four-day week at full pay to support work-life balance