Data Engineer - Hybrid
Job description
We are looking for a Data Engineer who is passionate about building scalable data systems and enabling high-quality analytics, machine learning, and business intelligence. In this role, you will design modern data pipelines, optimize storage solutions, and collaborate closely with cross-functional teams to deliver reliable, high-impact data infrastructure. If you enjoy solving complex data challenges and want to contribute to a fast-moving engineering environment, this role offers the opportunity to make a meaningful impact.
- Design and build robust, scalable data pipelines to ingest structured and unstructured data from diverse sources.
- Develop and optimize data storage solutions, including relational databases, NoSQL systems, and data lakes.
- Implement data validation, automated testing, and monitoring to ensure accuracy, reliability, and compliance.
- Partner with Data Scientists, Product Managers, and Software Engineers to build infrastructure that supports ML models, analytics, and BI dashboards.
- Participate in feature discussions, help prioritize work, and guide teams toward consensus.
- Develop prototypes to validate concepts and accelerate solution design.
- Identify root causes of data issues, assess business impact, and propose effective solutions.
- Solve complex problems related to data quality, object identity, and error handling.
- Support MLOps workflows and contribute to model deployment and data integration processes.
- Work as part of an Agile/Scrum team and participate in related ceremonies.
- Maintain clear documentation for data models, pipelines, and operational processes.
Requirements
- 5+ years of overall experience in data engineering or related fields.
- 3+ years of experience building data pipelines using PySpark, Django, or similar frameworks, with strong proficiency in Python or Java/Scala.
- 2+ years of advanced SQL experience, including database modeling and query optimization.
- Hands-on experience with Apache Spark, Kafka, and distributed data processing technologies.
- Experience deploying data platforms in cloud environments such as AWS, Google Cloud, or Azure.
- Strong Python experience, including libraries such as NumPy and pandas for data loading and transformation.
- Experience scheduling and orchestrating workflows using tools like Apache Airflow or dbt.
- Practical experience with MLOps concepts and supporting ML model lifecycle workflows.
- Working knowledge of relational and non-relational databases, including data modeling principles.
- Experience collaborating within Agile/Scrum teams and participating in sprint ceremonies.
- Working knowledge of Unix/Linux environments.
Benefits & conditions
Where required by law, NTT DATA provides a reasonable range of compensation for specific roles. The starting pay range for this remote role is $75,168-$130,500. This range reflects the minimum and maximum target compensation for the position across all US locations. Actual compensation will depend on a number of factors, including the candidate's actual work location, relevant experience, technical skills, and other qualifications.
This position is eligible for company benefits including medical, dental, and vision insurance with an employer contribution, flexible spending or health savings account, life and AD&D insurance, short- and long-term disability coverage, paid time off, employee assistance, participation in a 401(k) program with company match, and additional voluntary or legally required benefits.