Cloud Data Engineer

GAP SOLUTIONS
Atlanta, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Atlanta, United States of America

Tech stack

Agile Methodologies
Artificial Intelligence
Airflow
Data analysis
Azure
Big Data
Cloud Computing Security
Cloud Database
Cloud Storage
Collaborative Software
Data Governance
ETL
Data Sharing
Relational Databases
Python
Machine Learning
Microsoft SQL Server
SQL Azure
NumPy
SQL Databases
Workflow Management Systems
Data Processing
Cloud Platform System
Git
Pandas
Atlassian Tools
Software Version Control
Data Pipelines
Databricks

Job description

Position Objective: A key DMI objective is to "expand foundational infrastructure to provide scalable, flexible services for timely and appropriate access to actionable data in the public health ecosystem." Currently, public health programs operating across CDC have myriad investments in divergent and overlapping systems to collect, process, and analyze data to support public health decision making and administrative functions. These systems vary in age, complexity, and quality, which creates a burden for public health partners to provide and use data, for programs to use their own data, and for CDC to securely share data with its partners and deidentified data with the public.

EDAV helps alleviate this problem by designing, developing, and operating shared enterprise data services that help programs modernize and integrate these services with their existing and planned systems. However, EDAV needs to expand the quantity and quality of these services and help programs integrate their systems with EDAV to create new public health data products using the shared EDAV platform. Data products include data collections, storage, reports, dashboards, metadata collection, analytics (including artificial intelligence [AI]/machine learning [ML]), public use data, indicators, measures, and decision-making systems.

CDC's Center for Forecasting and Outbreak Analytics (CFA) is tasked with collaborating with internal and external partners to track public health events and disease outbreaks and to forecast their direction. To do this, CFA needs to extend EDAV's capabilities to cloud spaces where it can collaborate with CDC and non-CDC groups to share data, develop machine learning models, exchange models and algorithms, and jointly author analytics and visualization products. Without this, CFA cannot perform its mission.

Duties and Responsibilities:

  • Work with a multidisciplinary team of scientists, data engineers, developers, and data consumers in a fast-paced, Agile environment

  • Monitor and optimize data pipelines for performance, scalability, and cost-effectiveness

  • Sharpen skills in analytical exploration and data examination while supporting the assessment, design, development, and maintenance of scalable platforms for clients

Requirements

  • Bachelor's Degree required

  • 3+ years of experience with extract, transform, load (ETL) operations with a focus on Azure technologies

  • 2+ years of experience with source control and collaboration software, including Git or Atlassian tools

  • Knowledge of Azure Batch and its application in processing large data sets

  • Experience with SQL and relational databases (e.g., Azure SQL Database, SQL Server)

  • Experience with Python or R including experience with data manipulation libraries (e.g., Pandas, NumPy, Polars, Tidyverse)

  • Strong problem-solving skills and ability to work independently and in a team environment

  • Proficiency in Azure Data Factory and its components

Preferred Qualifications:

  • Experience developing pipelines using Azure Batch and Azure Data Factory

  • Familiarity with Apache Airflow or similar workflow orchestration tools

  • Experience with Azure Synapse Analytics, Azure Databricks, or Azure Blob Storage

  • Familiarity with cloud security best practices and data governance

  • Ability to quickly learn technical concepts and communicate with multiple functional groups

*This job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required by this position.

To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed above are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
