Senior Data Engineer
Tasks
Key Responsibilities
  • Design, develop, and maintain scalable batch and streaming data pipelines using Python (strong OOP design), Apache Spark (PySpark, Spark SQL), and Azure Databricks.
  • Design and implement real-time and near-real-time streaming platforms using Apache Kafka or Apache Flink, with stream processing powered by Apache Flink or Spark Structured Streaming, supporting event-time processing, windowing, stateful transformations, checkpointing, and exactly-once semantics.
  • Build and manage Medallion architecture (Bronze, Silver, Gold) on Azure Data Lake Storage (ADLS Gen2) using Delta Lake.
  • Implement data governance, access control, and lineage using Databricks Unity Catalog.
  • Develop and integrate data services and streaming consumers/producers within a microservices-based architecture.
  • Deploy and operate data and streaming workloads on Azure Kubernetes Service (AKS) using Dockerized applications.
  • Optimize Spark and streaming workloads through partitioning strategies, Z-Ordering, caching, broadcast joins, state tuning, and query optimization.
  • Implement data quality, validation, deduplication, late-arriving data handling, and schema evolution for both batch and streaming pipelines.
  • Collaborate with data analysts, data scientists, backend engineers, and DevOps teams to deliver end-to-end, production-grade data platforms.
  • Design secure data solutions using Azure IAM, RBAC, Key Vault, private endpoints, and network isolation.
  • Build and maintain CI/CD pipelines using GitLab for Databricks notebooks, Spark jobs, streaming applications, microservices, and infrastructure deployments.
  • Automate infrastructure and platform provisioning using Infrastructure as Code (Terraform and Pulumi).
  • Implement monitoring, logging, and alerting using Azure Monitor, Log Analytics, Databricks metrics, Kubernetes metrics, and streaming platform monitoring.
  • Deliver curated, analytics-ready datasets to support reporting and dashboards in Power BI.
  • Participate in Agile/Scrum ceremonies, perform code reviews, and mentor junior data engineers.
  • Ensure adherence to coding standards, performance best practices, data governance, and cloud-native architecture principles.
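The streaming responsibilities above (event-time processing, windowing, and late-arriving data handling) can be illustrated with a minimal sketch in plain Python. This is a conceptual model of watermark semantics only, not the actual Spark Structured Streaming or Flink API; the function name and event shape are assumptions for illustration.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_sec, watermark_sec):
    """Count events per (window, key) in tumbling event-time windows.

    Events older than `max_event_time - watermark_sec` are treated as
    too-late arrivals and dropped, mirroring the watermark semantics
    described in the streaming bullets above.
    """
    counts = defaultdict(int)      # (window_start, key) -> count
    dropped = []                   # late events discarded by the watermark
    max_event_time = float("-inf")
    for event_time, key in events:  # events arrive in processing order
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - watermark_sec
        if event_time < watermark:
            dropped.append((event_time, key))
            continue
        window_start = (event_time // window_sec) * window_sec
        counts[(window_start, key)] += 1
    return dict(counts), dropped
```

For example, with 10-second windows and a 5-second watermark, an event timestamped at 3 s that arrives after one timestamped at 12 s falls behind the watermark and is dropped rather than mutating an already-emitted window.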
Required Skills & Qualifications
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • 3–8 years of experience in Data Engineering, with hands-on Azure experience.
  • Strong proficiency in Python with solid OOP concepts.
  • Extensive experience with Apache Spark (PySpark, Spark SQL) and Azure Databricks.
  • Hands-on experience with streaming platforms, such as:
    • Apache Kafka OR Apache Flink
    • Spark Structured Streaming
  • Familiarity with microservices-based architectures, including service communication patterns and event-driven designs.
  • Hands-on or working experience with Azure Kubernetes Service (AKS) for deploying and managing containerized workloads.
  • Deep understanding of Delta Lake (ACID transactions, schema enforcement, time travel).
  • Strong SQL skills for analytics and performance tuning.
  • Experience with Azure services:
    • ADLS Gen2, Azure Databricks, Azure Functions, Service Bus, Key Vault
  • Strong experience with Git and GitLab CI/CD.
  • Experience with Infrastructure as Code using Terraform and Pulumi.
  • Knowledge of data modeling techniques (Star Schema, Data Vault, SCD Type 1/2).
  • Solid understanding of distributed systems, fault tolerance, and streaming guarantees.
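As a rough sketch of the SCD Type 2 pattern named under data modeling above: when a tracked attribute changes, the current dimension row is closed out and a new versioned row is appended. The plain-Python function below illustrates the idea (in practice this would be a Delta Lake MERGE); row and field names are assumptions for illustration.

```python
def scd2_apply(dim_rows, changes, today):
    """Apply SCD Type 2 logic: close the current row and append a new
    version when attributes change; insert rows for brand-new keys.

    dim_rows: list of dicts with keys: key, attrs, valid_from, valid_to, current
    changes:  dict mapping key -> latest attrs from the source
    """
    out, seen = [], set()
    for row in dim_rows:
        key = row["key"]
        if row["current"] and key in changes and changes[key] != row["attrs"]:
            # Close the old version, then open a new current version.
            out.append(dict(row, valid_to=today, current=False))
            out.append({"key": key, "attrs": changes[key],
                        "valid_from": today, "valid_to": None, "current": True})
            seen.add(key)
        else:
            out.append(row)
            if row["current"]:
                seen.add(key)
    for key, attrs in changes.items():
        if key not in seen:  # brand-new dimension member
            out.append({"key": key, "attrs": attrs,
                        "valid_from": today, "valid_to": None, "current": True})
    return out
```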
Nice to Have
  • Experience with both Kafka and Flink in production environments.
  • Exposure to Kafka Schema Registry, event versioning, or CDC pipelines.
  • Hands-on experience with Databricks Unity Catalog.
  • Exposure to DevSecOps practices and secure data platform design.
  • Experience or exposure to Generative AI / LLM-based data applications (RAG, embeddings, vector stores, Azure OpenAI).
  • Experience with large-scale enterprise or regulatory data platforms.
Key Responsibilities

  • Design and build multi-page, interactive Power BI dashboards featuring drill-throughs, bookmarks, custom visuals, and role-based filters to deliver tailored, actionable insights for diverse business users.
  • Develop complex DAX measures, calculated tables, and time-intelligence functions to support dynamic KPI reporting and scenario modeling directly within the Power BI data model.
  • Leverage Power Query (M) to perform advanced data transformations, such as unpivoting, merging, conditional logic, and parameterization, ensuring source data is cleansed and shaped for optimal dashboard performance.
  • Configure and maintain on-premises or cloud Power BI gateways, automate incremental and full refresh schedules, and troubleshoot connectivity issues to guarantee up-to-date reporting.
  • Optimize report and data model performance through query folding, aggregation tables, star-schema design patterns, and best-practice visuals to minimize load times and enhance user experience.
  • Partner closely with stakeholders to elicit requirements, design storyboard prototypes, and iterate on layouts and visuals based on user feedback and adoption metrics.
  • Architect and implement end-to-end ETL pipelines in Azure Data Factory, orchestrating data ingestion, transformation, delivery, error handling, and monitoring for reliable data flows.
  • Develop and maintain PySpark notebooks on Azure Databricks to execute complex KPI logic, data cleansing, and aggregation workflows, supplemented by SQL scripts to validate ETL outputs and reconcile key metrics.
  • Provision, secure, and govern data in Azure Data Lake Storage Gen2 via Unity Catalog, managing access controls, data lineage, schema evolution, and audit trails in compliance with organizational policies.
  • Operate within an Agile/Scrum framework (sprint planning, daily stand-ups, backlog grooming, retrospectives) and perform root-cause analyses to identify data quality and performance issues, implementing preventive measures.
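The validate-and-reconcile step above amounts to comparing a measure aggregated per key between the source and the curated (gold) dataset. A plain-Python sketch of that check follows, standing in for the SQL reconciliation scripts; the function and field names are illustrative assumptions.

```python
def reconcile(source_rows, gold_rows, key, measure, tol=1e-6):
    """Total `measure` per `key` in both datasets and return
    {key: (source_total, gold_total)} for any key whose totals
    disagree by more than `tol`."""
    def totals(rows):
        agg = {}
        for row in rows:
            agg[row[key]] = agg.get(row[key], 0.0) + row[measure]
        return agg

    src, gold = totals(source_rows), totals(gold_rows)
    return {
        k: (src.get(k, 0.0), gold.get(k, 0.0))
        for k in set(src) | set(gold)
        if abs(src.get(k, 0.0) - gold.get(k, 0.0)) > tol
    }
```

An empty result means the ETL output reconciles; any returned key pinpoints where the pipeline dropped, duplicated, or mis-aggregated rows.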

Required Qualifications

Technical Skills required:

  • Power BI Desktop and Service, DAX, Power Query (M)
  • Python, Apache PySpark, SQL (T-SQL, ANSI)
  • Azure Data Factory, Databricks, ADLS Gen2, Unity Catalog, CI/CD tools (Git, Azure DevOps), data modeling, performance tuning

Soft Skills required:

  • Strong problem-solving skills and attention to detail.
  • Excellent communication and collaboration skills.
  • Adaptability to changing priorities.
  • Collaboration in cross-functional teams.
  • Time management and a continuous-learning attitude.

Benefits
Discounts for Employees Possible
Health Benefits
Mobile Phone for Employees Possible
Meal-Discounts
Company Retirement
Hybrid Work Possible
Mobility Offers
Events for Employees
Coaching
Flextime Possible
Parking
Inhouse Doctor
Good Public Transport
Barrier-Free Workplace
Near-Site Childcare
Canteen, Café
Contact
Mercedes-Benz Research and Development India Private Limited
Brigade Tech Gardens, Katha No. 119560037 Bengaluru
Apply