Data & Software Engineer

Quantum Science Solutions

McLean, United States of America

2 months ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

McLean, United States of America

Tech stack

Geographic Information Systems

Artificial Intelligence

Airflow

Amazon Web Services (AWS)

Apache HTTP Server

Bash

Big Data

Information Engineering

Data Governance

Data Infrastructure

ETL

Data Visualization

Relational Databases

Database Queries

DevOps

Amazon DynamoDB

Python

PostgreSQL

Machine Learning

Metadata Repositories

MySQL

NoSQL

NumPy

PostGIS

Query Optimization

Software Deployment

Software Engineering

Data Streaming

Systems Integration

Data Processing

Cloud Platform System

Spark

Git

CloudFormation

Pandas

Containerization

Data Lake

PySpark

Data Lineage

Terraform

Software Version Control

Data Pipelines

Docker

Job description

QSSHire is seeking a Data & Software Engineer to support a project focused on building complex data flows and scalable data solutions for a custom application.

Design, build, and maintain end-to-end data pipelines using Python
Develop and deploy data workflows using orchestration tools (e.g., Airflow, Spark job orchestration)
Containerize and deploy applications in AWS cloud environments
Configure and optimize Spark and PySpark jobs for large-scale data processing
Work with stakeholders to understand requirements and design scalable data solutions with minimal oversight
Troubleshoot data quality issues, pipeline failures, and performance bottlenecks
Support large-scale data migration and platform modernization efforts
Optimize relational databases (MySQL, PostgreSQL) for analytical workloads, including schema design and query tuning
Implement and maintain data lineage, cataloging, and governance solutions
Work with geospatial data formats and tools
Integrate AI/ML models and services into data pipelines
Develop automation scripts using Bash for data processing and system tasks
Contribute to data engineering documentation, standards, and best practices

Requirements

This role requires a highly skilled engineer with strong Python expertise, experience building production-grade ETL pipelines, and deep knowledge of data governance, security, and compliance principles. The ideal candidate will bring experience working with modern data platforms, cloud-native technologies, and large-scale data processing frameworks.

5+ years of experience in data engineering or software engineering roles
Strong experience with Apache Spark & PySpark
Advanced proficiency in Python (Pandas, NumPy)
Experience building scalable ETL/data pipelines in production environments
Hands-on experience with AWS services (S3, Lambda, Step Functions)
Experience with containerization tools (Docker, Podman)
Strong SQL skills, including experience with Trino
Experience with NoSQL databases (DynamoDB)
Familiarity with data lake technologies (Apache Iceberg)
Experience with data orchestration tools (Airflow or similar)
Experience using Terraform or CloudFormation for infrastructure as code
Experience with data lineage and governance tools (OpenLineage, Unity Catalog OSS, Apache Polaris)
Experience with Apache Superset for data visualization
Experience with geospatial technologies (H3, PostGIS)
Strong understanding of version control and DevOps practices (Git, IaC workflows)
Experience working with data catalogs and diverse data formats

Experience integrating AI/ML models into data workflows
Experience supporting data platform modernization initiatives
Strong background in data governance, privacy, and compliance frameworks
Experience working in agile, fast-paced environments
