Data Engineer

Exel Inc.

Rockville, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Rockville, United States of America

Tech stack

API

Amazon Web Services (AWS)

Business Analytics Applications

Data analysis

Audit Trail

Automation of Tests

Bioinformatics

Health Informatics

Clinical Data Management

Cloud Computing

Cloud Database

Cloud Storage

Software Documentation

Code Review

Information Systems

Databases

Data Validation

Information Engineering

Data Governance

Data Infrastructure

Data Integration

ETL

Data Mapping

Data Security

Data Structures

Data Systems

Data Warehousing

Relational Databases

Database Theory

Python

PostgreSQL

Metadata

Meta-Data Management

Operational Data Store

DataOps

Scientific Computing

Software Deployment

Software Engineering

SQL Databases

Data Streaming

Systems Integration

Unstructured Data

Web Applications

Data Processing

Scripting (Bash/Python/Go/Ruby)

Data Ingestion

Fast Healthcare Interoperability Resources

Snowflake

Spark

Jupyter

Git

Data Lake

PySpark

Information Technology

Data Lineage

Health Level Seven International

Data Management

Tools for Reporting

API Design

Streamlit Framework

Data Pipelines

Databricks

Job description

Data Pipeline Development: Design, build, test, and maintain data pipelines to ingest, transform, harmonize, and integrate diverse biomedical and research data sources, including clinical, genomic, experimental, imaging, biospecimen, operational, and other scientific datasets. Develop reusable transformation logic and curated datasets that support analytics, reporting, dashboards, applications, APIs, and downstream research workflows.
Data Integration and Lifecycle Support: Support the full research data lifecycle by enabling reliable data movement from source systems and storage environments into structured, analysis-ready formats. Assist with data ingestion, curation, metadata capture, data refreshes, source-to-target mapping, schema management, and long-term maintainability of data products and workflows.
Collaboration: Work closely with data scientists, bioinformaticians, researchers, application developers, project managers, and government stakeholders to gather requirements and deliver practical data solutions. Translate scientific and operational data needs into technical specifications, data models, transformation logic, and reusable datasets that accelerate biomedical research workflows and support informed decision-making.
Quality & Governance: Implement data validation checks, reconciliation routines, testing practices, and monitoring processes to ensure data accuracy, completeness, consistency, and integrity. Follow data governance and security best practices, including documentation of transformations, lineage, assumptions, access requirements, and compliance considerations related to sensitive, regulated, de-identified, or access-controlled research data.
Dashboarding & Integration: Create or support interactive dashboards, reporting layers, APIs, and application-ready datasets that allow researchers and stakeholders to visualize, explore, and analyze data. Support integration between data pipelines, databases, cloud platforms, analytics environments, and approved application platforms to enable scalable and secure data access.
Operational Support and Modernization: Troubleshoot data pipeline failures, source system inconsistencies, data quality issues, schema changes, access issues, and performance bottlenecks. Contribute to modernization efforts by improving automation, documentation, scalability, reproducibility, and platform readiness across environments.

Requirements

The ideal candidate will have strong experience with Python, SQL, ETL/ELT development, data modeling, data quality practices, and research data lifecycle support. This role requires the ability to work with complex multi-source datasets, support analytics and application-facing data products, and contribute to scalable, well-governed data solutions that align with the Data Science Client Services branch priorities for data accessibility, interoperability, reproducibility, modernization, and secure research enablement.

Education & Background: Bachelor's degree in Computer Science, Data Science, Bioinformatics, Biomedical Informatics, Information Systems, Engineering, or a related field, or equivalent practical experience. Proven experience as a Data Engineer, Analytics Engineer, Data Integration Developer, Bioinformatics Engineer, or similar data-intensive role, preferably supporting analytics, biomedical research, healthcare, scientific computing, or research data teams.
Data Engineering Expertise: Strong proficiency in Python and SQL for data manipulation, transformation, scripting, automation, and analysis. Hands-on experience building ETL/ELT processes and data pipelines to support large, complex, multi-source datasets. Familiarity with scalable data processing approaches, including Spark/PySpark or similar frameworks, for high-volume or complex transformations is required.
Analytical Skills: Solid understanding of data modeling, relational databases, data warehouses, data lakes, metadata, and database concepts. Ability to work with complex, multi-modal datasets, including structured, semi-structured, and unstructured data, and optimize data workflows for reliability, performance, usability, and long-term maintainability.
Best Practices: Knowledge of software engineering and data engineering best practices, including version control using Git, code review, automated testing, documentation, peer review, and change management. Experience ensuring data quality and using lineage, provenance tracking, audit trails, or documentation practices to support transparency, reproducibility, and data flow traceability.
Collaboration & Communication: Excellent problem-solving skills and the ability to communicate effectively with both technical and non-technical stakeholders. Comfortable working in an interdisciplinary environment with biomedical researchers, analysts, developers, and project teams. Capable of translating domain-specific needs into technical solutions and explaining technical risks, limitations, and dependencies in clear stakeholder-focused language.
Domain Alignment: Strong interest in biomedical science, clinical research, healthcare data, and scientific discovery. Ability to quickly learn domain-specific concepts, data structures, terminology, and research workflows. Demonstrated awareness of sensitive data handling, privacy, access control, data governance, and regulatory or compliance expectations associated with biomedical and clinical research data.

Preferred Qualifications (Plus Skills)

Platform-as-a-Service and Data Platform Experience: Hands-on experience building data solutions in modern data platforms or platform-as-a-service environments such as Snowflake, Databricks, Palantir, cloud data warehouses, data lakes, or similar platforms. Experience supporting integrations across databases, cloud storage, APIs, analytics platforms, dashboards, and application environments is preferred.
Research and Application Enablement: Experience preparing curated datasets for dashboards, APIs, web applications, reporting tools, notebooks, or scientific computing environments. Familiarity with research-facing tools and platforms such as Posit Connect, R/Shiny, Streamlit, Jupyter, Galaxy, Code Ocean, or similar analytics and application delivery environments is a plus.
Cloud, Storage, and Automation Experience: Experience working with cloud or hybrid data environments, object storage such as S3, relational databases such as Postgres, automated data refreshes, scheduled jobs, API-based integrations, and secure data movement across controlled environments.
Biomedical Domain Knowledge: Previous experience in biomedical research, healthcare analytics, clinical research, public health, pharmaceutical research and development, or scientific data management. Familiarity with biomedical data standards or datasets, such as clinical trial data, clinical imaging, laboratory data, biospecimen data, transcriptomics/genomic data, HL7/FHIR, CDISC, OMOP, or related standards, and an understanding of the scientific research process will help you excel in this role.
Governance and Reproducibility: Experience supporting data governance, metadata management, data lineage, reproducible workflows, documentation standards, and secure handling of de-identified, sensitive, or access-controlled research datasets.

Disclaimer: The above description is meant to illustrate the general nature of work and level of effort being performed by individuals assigned to this position or job description. It is not intended as a complete list of all skills, responsibilities, duties, and/or assignments required. Individuals may be required to perform duties outside of their position, job description, or responsibilities as needed.

Benefits & conditions

Benefits We Offer:

100% Medical, Dental & Vision Coverage for Employees
Paid Time Off and Paid Holidays
401K match up to 5%
Educational Benefits for Career Growth
Employee Referral Bonus
Flexible Spending Accounts:

Healthcare (FSA)
Parking Reimbursement Account (PRK)
Dependent Care Assistance Program (DCAP)
Transportation Reimbursement Account (TRN)

This role has a market-competitive salary with an anticipated base compensation range listed below. Actual salaries will vary depending on a candidate's experience, qualifications, skills, and location.

About the company

Axle is a bioscience and information technology company that offers advancements in translational research, biomedical informatics, and data science applications to research centers and healthcare organizations nationally and abroad. With experts in biomedical science, software engineering, and program management, we focus on developing and applying research tools and techniques to empower decision-making and accelerate research discoveries. We work with some of the top research organizations and facilities in the country including multiple institutes at the National Institutes of Health (NIH).
