Lead Data Engineer

Nityo Infotech Corporation
Los Angeles, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Los Angeles, United States of America

Tech stack

Airflow
Amazon Web Services (AWS)
Big Data
Information Engineering
ETL
Data Warehousing
DevOps
Python
PostgreSQL
Workflow Management Systems
Data Processing
Database Optimization
Containerization
PySpark
Information Technology
Cloud Integration
Cloudwatch
Data Pipelines
Docker

Job description

  • Design and Develop Data Pipelines: Create and maintain scalable data pipelines using Python and PySpark to process large volumes of data efficiently.
  • Cloud Integration: Utilize AWS services (such as S3, CloudWatch, ECS, ECR, Lambda) to build and manage cloud-based data solutions.
  • Database Management: Design, implement, and optimize PostgreSQL databases to ensure high performance and reliability.
  • Workflow Orchestration: Use Apache Airflow to schedule and monitor complex data workflows.
  • Containerization: Implement and manage Docker containers to ensure consistent and reproducible environments for data processing tasks.
  • Data Quality and Governance: Ensure data quality, integrity, and security across all data pipelines and storage solutions.
  • Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business needs.
  • Mentorship: Provide guidance and mentorship to junior data engineers and contribute to the continuous improvement of the team's skills and processes.

Requirements

We are seeking a highly skilled and experienced Lead Data Engineer to join our dynamic team. The ideal candidate will have a strong background in data engineering, with extensive experience in Python, PySpark, AWS services, PostgreSQL, Apache Airflow and Docker.

This role requires a professional with a proven track record of designing, implementing, and maintaining robust data pipelines and architectures.

Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Experience: 10+ years of overall experience in data engineering or related fields.

Technical Skills:

  • Proficiency in Python and PySpark.
  • Extensive experience with AWS services (S3, CloudWatch, ECS, ECR, Secrets Manager, Cloud9 IDE).
  • Strong knowledge of PostgreSQL and database optimization techniques.
  • Hands-on experience with Apache Airflow for workflow orchestration.
  • Proficiency in Docker for containerization.

Soft Skills:

  • Excellent problem-solving and analytical skills.
  • Strong communication and collaboration abilities.
  • Ability to work in a fast-paced, dynamic environment.

Preferred Qualifications:

  • Familiarity with CI/CD pipelines and DevOps practices.
  • Knowledge of data warehousing concepts and ETL processes.
