Data Engineer
Role details
Job location
Tech stack
Job description
Design and build robust data pipelines to ingest structured and unstructured data from multiple sources.
Build and optimize data storage solutions (relational/NoSQL databases, data lakes) to handle scale and performance.
Implement validation checks, automated testing, and data monitoring to ensure accuracy and compliance.
Partner with Data Scientists, Product Managers, and Software Engineers to build infrastructure that supports machine learning models and BI dashboards.
Negotiate features and their priorities, and help the team and its customers reach consensus.
Develop, or lead the development of, prototypes.
Identify problem causality, business impact, and root causes.
Devise precise solutions to problems related to object identity and error handling.
Requirements
5+ years of overall experience.
3+ years of experience building data pipelines using PySpark and Django.
High proficiency in Python or Java/Scala.
2+ years of experience with advanced SQL and database modeling.
Hands-on experience with technologies like Apache Spark and Kafka.
Familiarity with deploying data platforms in cloud environments such as AWS, Google Cloud, or Azure.
Hands-on experience with Python and related packages (such as NumPy and pandas) to load and scrape data.
Experience scheduling data workflows using tools like Apache Airflow or dbt.
Hands-on experience with MLOps.
Working experience with relational and non-relational databases, and familiarity with data modeling concepts.
Experience working as part of a larger Scrum team, and an understanding of Scrum ceremonies.
Working knowledge of Unix/Linux.