Data Infrastructure & Operations
Offer flexible and secure data ingestion, streaming, transformation, analytics, and data lake storage, paired with self-service compute & ML workspaces, so that in-house data teams can spin up services and create pipelines to meet their business requirements
Job description
· 4-6 years of hands-on experience with production-level development and operations on AWS or Azure Cloud
· Develop and maintain infrastructure-as-code using Terraform to deploy and manage Kubernetes clusters (AKS) and Databricks environments
· Hands-on experience with data pipeline orchestration tools such as Azure Data Factory, AWS Data Pipeline, Apache Spark, and Databricks
· Hands-on experience with one or more stream & batch processing systems: Kafka (Confluent Cloud or open source), Apache Storm, Spark Streaming, Apache Flink, Kappa architecture
· Experience architecting the right storage strategy per use case, weighing data processing, data accessibility, data availability, and cloud cost
· Proficiency in data transformation using Kafka Streams applications, KSQL, and the Processor API
· Data ingestion and distribution integration experience using managed connectors for Event Hubs, Kafka topics, ADLS Gen2, and REST APIs
· Proficiency in setting up and managing an open-source stack, including Airflow, Druid, Kafka (open source), OpenSearch, and Superset
· Proficiency in Python scripting for automation and integration tasks
· Utilize FastAPI for building and deploying high-performance APIs
· Handle requirements for managed services: IAM, auto-scaling, high availability, elasticity, and networking options
· Handle federated access to a cloud computing resource (or set of resources) based on a user's role within the organization
· Proficiency with Git, including branching/merging strategies, pull requests, and basic command-line usage
· Proficiency in DevSecOps practices throughout the product lifecycle including fully managed Day 2 Ops leveraging Datadog
· Implement shared access controls to support multi-tenancy and self-service tooling for customers
· Manage a data catalog per topic or domain, based on the services and use cases offered
· Research, investigate, and adopt new technologies to continually evolve data platform capabilities
· Experience working under Agile/Scrum methodologies
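To make the data-transformation expectation concrete: Kafka Streams and KSQL are JVM/SQL tools, but the stateless filter-and-map pattern they express can be sketched in plain Python. This is an illustration only; the event schema ("amount", "currency") is hypothetical.

```python
import json

def transform(raw_events):
    """Parse a stream of JSON event strings, drop invalid records (filter),
    and reshape the survivors (map) -- the core Kafka Streams pattern."""
    for raw in raw_events:
        event = json.loads(raw)
        if event.get("amount", 0) <= 0:   # filter: skip non-positive amounts
            continue
        yield {                           # map: normalize currency, convert to cents
            "currency": event["currency"].upper(),
            "amount_cents": int(event["amount"] * 100),
        }

events = ['{"amount": 12.5, "currency": "usd"}', '{"amount": -1, "currency": "eur"}']
print(list(transform(events)))  # [{'currency': 'USD', 'amount_cents': 1250}]
```

In a real deployment the same logic would run inside a Kafka Streams topology or a KSQL `CREATE STREAM ... AS SELECT` statement rather than a Python generator.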
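The role-based federated access bullet can be sketched as a simple role-to-permissions lookup. The role, resource, and action names below are hypothetical and not tied to any specific IAM provider; a production setup would delegate this to Azure AD / AWS IAM role mappings.

```python
# Map organizational roles to the actions they may perform per resource.
ROLE_POLICY = {
    "data-engineer": {"databricks": {"read", "write"}, "adls": {"read", "write"}},
    "analyst":       {"databricks": {"read"},          "adls": {"read"}},
}

def is_allowed(role: str, resource: str, action: str) -> bool:
    """Return True if the given role grants `action` on `resource`."""
    return action in ROLE_POLICY.get(role, {}).get(resource, set())

print(is_allowed("analyst", "adls", "write"))  # False: analysts are read-only
```

Unknown roles or resources deny by default, which is the safe posture for multi-tenant access control.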
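As a small example of the Python automation work described above, here is a sketch that validates a pipeline configuration before deployment. The required keys are hypothetical, chosen only to illustrate the pattern.

```python
REQUIRED_KEYS = {"name", "source", "sink", "schedule"}

def validate_pipeline_config(config: dict) -> list:
    """Return a list of validation errors; an empty list means the config is valid."""
    errors = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - config.keys())]
    if config.get("schedule") == "":
        errors.append("schedule must not be empty")
    return errors

cfg = {"name": "orders-ingest", "source": "eventhub", "sink": "adls"}
print(validate_pipeline_config(cfg))  # ['missing key: schedule']
```

Scripts like this typically run in CI before Terraform or Data Factory deployments, failing the build early on malformed configs.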