- Design, develop, and maintain scalable batch and streaming data pipelines using Python (strong OOP design), Apache Spark (PySpark, Spark SQL), and Azure Databricks.
- Design and implement real-time and near-real-time streaming platforms using Apache Kafka or Apache Flink, with stream processing powered by Apache Flink or Spark Structured Streaming, supporting event-time processing, windowing, stateful transformations, checkpointing, and exactly-once semantics.
- Build and manage Medallion architecture (Bronze, Silver, Gold) on Azure Data Lake Storage (ADLS Gen2) using Delta Lake.
- Implement data governance, access control, and lineage using Databricks Unity Catalog.
- Develop and integrate data services and streaming consumers/producers within a microservices-based architecture.
- Deploy and operate data and streaming workloads on Azure Kubernetes Service (AKS) using Dockerized applications.
- Optimize Spark and streaming workloads through partitioning strategies, Z-Ordering, caching, broadcast joins, state tuning, and query optimization.
- Implement data quality, validation, deduplication, late-arriving data handling, and schema evolution for both batch and streaming pipelines.
- Collaborate with data analysts, data scientists, backend engineers, and DevOps teams to deliver end-to-end, production-grade data platforms.
- Design secure data solutions using Azure IAM, RBAC, Key Vault, private endpoints, and network isolation.
- Build and maintain CI/CD pipelines using GitLab for Databricks notebooks, Spark jobs, streaming applications, microservices, and infrastructure deployments.
- Automate infrastructure and platform provisioning using Infrastructure as Code (Terraform and Pulumi).
- Implement monitoring, logging, and alerting using Azure Monitor, Log Analytics, Databricks metrics, Kubernetes metrics, and streaming platform monitoring.
- Deliver curated, analytics-ready datasets to support reporting and dashboards in Power BI.
- Participate in Agile/Scrum ceremonies, perform code reviews, and mentor junior data engineers.
- Ensure adherence to coding standards, performance best practices, data governance, and cloud-native architecture principles.
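The event-time processing, windowing, and late-arriving-data responsibilities above hinge on watermark semantics. As a minimal plain-Python sketch (not production code, and not the Spark API itself), the following loosely mimics what Spark Structured Streaming's `withWatermark` plus a tumbling-window aggregation does: the watermark trails the maximum observed event time, and events for windows already closed under the watermark are dropped. The function name and signature are illustrative assumptions.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size, allowed_lateness):
    """Illustrative event-time tumbling-window counter with a watermark.

    `events` is an iterable of (event_time, key) pairs in arrival order.
    The watermark trails the max event time seen by `allowed_lateness`;
    an event whose window has already closed under the watermark is
    dropped rather than counted.
    """
    windows = defaultdict(int)   # (window_start, key) -> count
    max_event_time = float("-inf")
    dropped = []                 # too-late events, discarded

    for event_time, key in events:
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - allowed_lateness
        window_start = (event_time // window_size) * window_size
        # A window [start, start + size) closes once the watermark
        # passes its end; later arrivals for it are dropped.
        if window_start + window_size <= watermark:
            dropped.append((event_time, key))
            continue
        windows[(window_start, key)] += 1

    return dict(windows), dropped
```

Note how the fourth event below (event time 2, arriving after time 12) is still counted because the watermark has only reached 7, while the same event arriving after time 18 (watermark 13) is dropped; this is the trade-off between completeness and state size that watermark tuning controls.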
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 3–8 years of experience in Data Engineering, with hands-on Azure experience.
- Strong proficiency in Python with solid OOP concepts.
- Extensive experience with Apache Spark (PySpark, Spark SQL) and Azure Databricks.
- Hands-on experience with streaming platforms, such as:
  - Apache Kafka or Apache Flink
  - Spark Structured Streaming
- Familiarity with microservices-based architectures, including service communication patterns and event-driven designs.
- Hands-on or working experience with Azure Kubernetes Service (AKS) for deploying and managing containerized workloads.
- Deep understanding of Delta Lake (ACID transactions, schema enforcement, time travel).
- Strong SQL skills for analytics and performance tuning.
- Experience with Azure services:
  - ADLS Gen2, Azure Databricks, Azure Functions, Service Bus, Key Vault
- Strong experience with Git and GitLab CI/CD.
- Experience with Infrastructure as Code using Terraform and Pulumi.
- Knowledge of data modeling techniques (Star Schema, Data Vault, SCD Type 1/2).
- Solid understanding of distributed systems, fault tolerance, and streaming guarantees.
- Experience with both Kafka and Flink in production environments.
- Exposure to Kafka Schema Registry, event versioning, or CDC pipelines.
- Hands-on experience with Databricks Unity Catalog.
- Exposure to DevSecOps practices and secure data platform design.
- Experience or exposure to Generative AI / LLM-based data applications (RAG, embeddings, vector stores, Azure OpenAI).
- Experience with large-scale enterprise or regulatory data platforms.
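Among the data modeling techniques listed above, SCD Type 2 is the one most often probed hands-on. The sketch below shows the core merge logic in plain Python (in practice this would be a Delta Lake `MERGE`); row shape, column names, and the helper name are assumptions for illustration.

```python
from datetime import date

def apply_scd2(dimension, incoming, key, tracked, today):
    """Minimal SCD Type 2 merge sketch (plain Python, no Spark).

    `dimension` rows are dicts carrying `key`, the `tracked` attributes,
    plus 'valid_from', 'valid_to' (None = still open), and 'is_current'.
    If a current row exists and a tracked attribute changed, the old row
    is closed and a new version opened; unseen keys are inserted.
    """
    current = {row[key]: row for row in dimension if row["is_current"]}
    for rec in incoming:
        old = current.get(rec[key])
        if old is not None and all(old[c] == rec[c] for c in tracked):
            continue  # no change: keep the existing current row
        if old is not None:
            old["valid_to"] = today      # expire the superseded version
            old["is_current"] = False
        new_row = {key: rec[key], **{c: rec[c] for c in tracked},
                   "valid_from": today, "valid_to": None, "is_current": True}
        dimension.append(new_row)
        current[rec[key]] = new_row
    return dimension
```

The same compare-close-insert pattern maps directly onto a Delta `MERGE` with `whenMatchedUpdate` and `whenNotMatchedInsert` clauses, which is where schema enforcement and ACID guarantees from Delta Lake come in.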
Key Responsibilities
· Design and build multi-page, interactive Power BI dashboards featuring drill-throughs, bookmarks, custom visuals, and role-based filters to deliver tailored, actionable insights for diverse business users.
· Develop complex DAX measures, calculated tables, and time-intelligence functions to support dynamic KPI reporting and scenario modeling directly within the Power BI data model.
· Leverage Power Query (M-Query) to perform advanced data transformations—such as unpivoting, merging, conditional logic, and parameterization—ensuring source data is cleansed and shaped for optimal dashboard performance.
· Configure and maintain on-premises or cloud Power BI Gateways, automate incremental and full refresh schedules, and troubleshoot connectivity issues to guarantee up-to-date reporting.
· Optimize report and data model performance through query folding, aggregation tables, star-schema design patterns, and best-practice visuals to minimize load times and enhance user experience.
· Partner closely with stakeholders to elicit requirements, design storyboard prototypes, and iterate on layouts and visuals based on user feedback and adoption metrics.
· Architect and implement end-to-end ETL pipelines in Azure Data Factory, orchestrating data ingestion, transformation, delivery, error handling, and monitoring for reliable data flows.
· Develop and maintain PySpark notebooks on Azure Databricks to execute complex KPI logic, data cleansing, and aggregation workflows, supplemented by SQL scripts to validate ETL outputs and reconcile key metrics.
· Provision, secure, and govern data in Azure Data Lake Storage Gen2 via Unity Catalog, managing access controls, data lineage, schema evolution, and audit trails in compliance with organizational policies.
· Operate within an Agile/Scrum framework—participating in sprint planning, daily stand-ups, backlog grooming, and retrospectives—and perform root cause analyses to identify data quality and performance issues, implementing preventive measures.
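Of the Power Query transformations named above, unpivoting is the one that most affects model shape. As a plain-Python sketch of what `Table.UnpivotOtherColumns` does in M (column names here are hypothetical), it turns wide rows, one column per period, into the long attribute/value shape that star-schema fact tables and most visuals prefer:

```python
def unpivot(rows, id_cols, var_name="attribute", value_name="value"):
    """Plain-Python sketch of an 'unpivot other columns' step.

    Every column not in `id_cols` becomes one output row holding the
    repeated id columns plus an (attribute, value) pair.
    """
    out = []
    for row in rows:
        for col, val in row.items():
            if col in id_cols:
                continue
            rec = {c: row[c] for c in id_cols}  # carry the id columns
            rec[var_name] = col
            rec[value_name] = val
            out.append(rec)
    return out
```

Unpivoting at the Power Query stage (rather than in DAX) keeps the model narrow, which also helps query folding and aggregation-table design mentioned in the performance responsibilities.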
Required Qualifications
Technical Skills required:
- Power BI Desktop and Service, DAX, Power Query (M)
- Python, PySpark (Apache Spark), SQL (T-SQL, ANSI)
- Azure Data Factory, Databricks, ADLS Gen2, Unity Catalog, CI/CD tools (Git, Azure DevOps), data modeling, performance tuning
Soft Skills required:
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration in cross-functional teams.
- Adaptability to changing priorities.
- Time management and a continuous-learning attitude.
