Data & Software Engineer

GRVTY, LLC
McLean, United States of America
27 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
McLean, United States of America

Tech stack

Java
Airflow
Amazon Web Services (AWS)
Apache HTTP Server
Bash
Big Data
Computer Programming
System Configuration
Information Engineering
ETL
Data Security
Software Debugging
Software Design Patterns
Amazon DynamoDB
Python
PostgreSQL
Metadata Repositories
MySQL
NoSQL
NumPy
Operational Databases
Performance Tuning
PostGIS
Query Optimization
Azure
Software Deployment
SQL Databases
Data Streaming
Systems Integration
Data Processing
Cloud Platform System
Spark
Git
CloudFormation
Pandas
Containerization
PySpark
Data Lineage
Terraform
Data Pipelines
Docker

Job description

  • Work with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight
  • Leverage strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks (a minimal data-quality sketch follows this list)
  • Apply experience from large-scale data migration or platform modernization efforts
  • Contribute to data engineering documentation, best practices, and design patterns
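
As context for the data-quality responsibility above, here is a minimal sketch of the kind of defensive quality gate a pipeline step might apply before loading. The column names and schema are illustrative assumptions for the example, not details from this posting.

    # Hedged sketch: quarantine rows with null keys instead of failing the batch.
    import logging

    import pandas as pd

    logger = logging.getLogger("pipeline.quality")

    REQUIRED_COLUMNS = {"event_id", "event_time", "payload"}  # hypothetical schema


    def validate_batch(df: pd.DataFrame) -> pd.DataFrame:
        """Fail fast on structural problems; drop and log rows with null keys."""
        missing = REQUIRED_COLUMNS - set(df.columns)
        if missing:
            raise ValueError(f"Batch is missing required columns: {sorted(missing)}")
        bad = df["event_id"].isna() | df["event_time"].isna()
        if bad.any():
            logger.warning("Quarantining %d rows with null keys", int(bad.sum()))
        return df[~bad]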

Requirements

GRVTY is seeking a Data & Software Engineer with a TS/SCI + Poly clearance (applicable to this customer) to join one of our top projects in McLean, VA. The Data & Software Engineer works with a small team to build complex data flows for a custom application. The successful candidate will have advanced Python programming skills, familiarity with Java, an understanding of data security, privacy, governance, and compliance principles, and a demonstrated history of building production data pipelines and ETL workflows at scale.

Candidate must have experience:

  • Building end-to-end data pipelines leveraging Python (see the PySpark sketch after this list)
  • Using orchestration tools to deploy data pipelines, including configuring and updating Spark jobs (see the Airflow sketch after this list)
  • Containerizing and deploying applications in cloud environments like AWS
  • Working with MySQL and PostgreSQL, including performance tuning, schema design, and query optimization for complex analytical workloads
  • Leveraging industry-standard tools for version control and infrastructure as code (Git, IaC tooling, etc.)
  • Working with data catalogs, tracking data lineage, and handling a variety of data formats, including geospatial
  • Using Bash scripting for automation and data processing tasks
  • Integrating AI/ML services and models
  • Active TS/SCI with Polygraph clearance
  • Minimum of 5 years' experience with:
  • Apache Spark & PySpark
  • Advanced Python skills (including Pandas & NumPy)
  • Docker, Podman
  • AWS S3, Lambda & Step Functions
  • Apache Iceberg, Airflow, etc.
  • SQL (with Trino)
  • NoSQL, DynamoDB
  • Unity Catalog OSS, Apache Polaris
  • Apache Superset
  • Terraform or CloudFormation
  • OpenLineage
  • H3, PostGIS
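
For illustration, a minimal PySpark sketch of the kind of end-to-end transform step the first bullet describes: read raw data, clean it, and write partitioned output. The bucket, paths, and column names are assumptions for the example, not project specifics.

    # Hedged sketch of a batch transform step; paths and columns are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("example-transform").getOrCreate()

    # Read raw JSON events (hypothetical location).
    raw = spark.read.json("s3://example-bucket/raw/events/")

    # Drop records without a key and derive a partition column.
    cleaned = (
        raw.filter(F.col("event_id").isNotNull())
        .withColumn("event_date", F.to_date("event_time"))
    )

    # Write partitioned Parquet for downstream analytical queries.
    (
        cleaned.write.mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3://example-bucket/curated/events/")
    )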
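
And a minimal Airflow-style orchestration sketch showing how such a Spark job might be scheduled and followed by a validation step. The DAG id, schedule, and script paths are assumptions; the posting does not specify the project's actual orchestration layout.

    # Hedged sketch: Airflow 2.x DAG that submits a Spark job, then validates output.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_ingest",          # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",              # Airflow 2.4+ argument name
        catchup=False,
    ) as dag:
        run_spark_job = BashOperator(
            task_id="run_spark_job",
            # Assumes spark-submit is on PATH and the job script exists.
            bash_command="spark-submit --master yarn /opt/jobs/transform.py",
        )
        validate_output = BashOperator(
            task_id="validate_output",
            bash_command="python /opt/jobs/validate.py",  # hypothetical check
        )
        run_spark_job >> validate_output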

About the company

GRVTY's team provides tactical data engineering solutions. We embed skilled Data Engineers, Data Scientists, and ETL Developers directly into intelligence analyst groups to be their go-to data wranglers. We develop new tools, code, and services to execute data engineering activities. Our engineers work to collect, process, and feed analytic tools, turning data into intelligence in response to immediate mission needs, with direct impact on real-world situations.

You will see your work used here on a daily basis, and you'll have the opportunity to support a variety of Sponsor mission organizations and mission partner organizations. This is a time of development and growth on the program, with an increasing number of missions being supported. The work is high impact and important, and the customer moves quickly. The environment is fast-paced, flexible, and open to innovation - you'll have more latitude here in choosing how to achieve results than on many other projects. The customer cares more about what you can do than about your years of experience, and work hours are typically quite flexible - roll up your sleeves, get things done, and no one cares much about the specific hours that you work. The work space itself is also quite nice, and there is an excellent cafeteria!

The tech stack on this team is rather large and includes Python (Pandas, NumPy, SciPy, scikit-learn, standard libraries, etc.), Python packages that wrap machine learning (packages for NLP, object detection, etc.), Linux, AWS/C2S, Apache NiFi, Spark, PySpark, Hadoop, Kafka, Elasticsearch, Solr, Kibana, Neo4j, MariaDB, Postgres, Docker, Puppet, and many others.

Work on this program takes place in McLean, VA and in various field offices throughout Northern VA (we cannot support remote work) and requires a TS/SCI + Polygraph clearance (acceptable to this customer).
