Role Description
Data modeling, Data warehousing, ETL pipelines, Flink, Kafka, Spark, Kinesis, Airflow, Python, Java, Scala, Monitoring
Years of Experience: 6+
Education: Bachelor's degree in Computer Science or a related field.
Key Responsibilities:
* Design, develop, and maintain scalable data pipelines and ETL processes
* Optimize data flow and collection for cross-functional teams
* Build the infrastructure required for optimal extraction, transformation, and loading of data
* Ensure data quality, reliability, and integrity across all data systems
* Collaborate with data scientists and analysts to help implement models and algorithms
* Identify, design, and implement internal process improvements, such as automating manual processes and optimizing data delivery
* Create and maintain comprehensive technical documentation
* Evaluate and integrate new data management technologies and tools
* Monitor systems using tools like Datadog or Grafana
Mandatory Skills:
* Extensive experience with big data technologies (e.g., Spark, Flink, Hadoop), Terraform, CloudFormation, Kubernetes, Datadog/Grafana, and CI/CD
* Experience with at least one monitoring tool (e.g., Datadog, Grafana) is mandatory
* Experience with containerization and orchestration tools (Kubernetes)
* Experience with data modeling, data warehousing, and building ETL pipelines
* Experience with cloud platforms (AWS, Azure, or GCP) and their data services; AWS preferred
* Experience building streaming pipelines with Flink, Kafka, or Kinesis; Flink preferred
* Strong knowledge of data pipeline and workflow management tools (e.g., Airflow, Luigi, NiFi)
* Expert knowledge of SQL and experience with relational databases (e.g., PostgreSQL, Redshift, TiDB, MySQL, Oracle, Teradata)
* Proficiency in at least one programming language such as Python, Java, or Scala
* Understanding of data governance and data security principles
* Experience with version control systems (e.g., Git) and CI/CD practices
Preferred Skills:
* Basic knowledge of machine learning workflows and MLOps
* Experience with NoSQL databases (e.g., MongoDB, Cassandra)
* Familiarity with data visualization tools (e.g., Tableau, Power BI)
* Experience with real-time data processing
* Knowledge of data governance frameworks and compliance requirements (e.g., GDPR, CCPA)
Skills
Data Engineering, ETL, Terraform, Azure Data Factory