Data & Software Engineer

Quantum Science Solutions
McLean, United States of America
19 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior

Job location

McLean, United States of America

Tech stack

Geographic Information Systems
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Apache HTTP Server
Bash
Big Data
Information Engineering
Data Governance
Data Infrastructure
ETL
Data Visualization
Relational Databases
Database Queries
DevOps
Amazon DynamoDB
Python
PostgreSQL
Machine Learning
Metadata Repositories
MySQL
NoSQL
NumPy
PostGIS
Query Optimization
Software Deployment
Software Engineering
Data Streaming
Systems Integration
Data Processing
Cloud Platform System
Spark
Git
CloudFormation
Pandas
Containerization
Data Lake
PySpark
Data Lineage
Terraform
Software Version Control
Data Pipelines
Docker

Job description

QSSHire is seeking a Data & Software Engineer to support a project focused on building complex data flows and scalable data solutions for a custom application.

  • Design, build, and maintain end-to-end data pipelines using Python
  • Develop and deploy data workflows using orchestration tools (e.g., Airflow, Spark job orchestration)
  • Containerize and deploy applications in AWS cloud environments
  • Configure and optimize Spark and PySpark jobs for large-scale data processing
  • Work with stakeholders to understand requirements and design scalable data solutions with minimal oversight
  • Troubleshoot data quality issues, pipeline failures, and performance bottlenecks
  • Support large-scale data migration and platform modernization efforts
  • Optimize relational databases (MySQL, PostgreSQL) for analytical workloads, including schema design and query tuning
  • Implement and maintain data lineage, cataloging, and governance solutions
  • Work with geospatial data formats and tools
  • Integrate AI/ML models and services into data pipelines
  • Develop automation scripts using Bash for data processing and system tasks
  • Contribute to data engineering documentation, standards, and best practices

Requirements

This role requires a highly skilled engineer with strong Python expertise, experience building production-grade ETL pipelines, and deep knowledge of data governance, security, and compliance principles. The ideal candidate will bring experience working with modern data platforms, cloud-native technologies, and large-scale data processing frameworks.

  • 5+ years of experience in data engineering or software engineering roles
  • Strong experience with Apache Spark & PySpark
  • Advanced proficiency in Python (Pandas, NumPy)
  • Experience building scalable ETL/data pipelines in production environments
  • Hands-on experience with AWS services (S3, Lambda, Step Functions)
  • Experience with containerization tools (Docker, Podman)
  • Strong SQL skills, including experience with Trino
  • Experience with NoSQL databases (DynamoDB)
  • Familiarity with data lake technologies (Apache Iceberg)
  • Experience with data orchestration tools (Airflow or similar)
  • Experience using Terraform or CloudFormation for infrastructure as code
  • Experience with data lineage and governance tools (OpenLineage, Unity Catalog OSS, Apache Polaris)
  • Experience with Apache Superset for data visualization
  • Experience with geospatial technologies (H3, PostGIS)
  • Strong understanding of version control and DevOps practices (Git, IaC workflows)
  • Experience working with data catalogs and diverse data formats
  • Experience integrating AI/ML models into data workflows
  • Experience supporting data platform modernization initiatives
  • Strong background in data governance, privacy, and compliance frameworks
  • Experience working in agile, fast-paced environments
