Data Engineer - Full Time Opportunity, NOT C2C

Codoxo, Inc.
Duluth, United States of America
2 days ago

Role details

Contract type
Internship / Graduate position
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Junior

Job location

Remote
Duluth, United States of America

Tech stack

Clean Code Principles
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Business Analytics Applications
Cloud Computing
Code Coverage
Software Quality
Information Systems
Data Validation
Information Engineering
Data Governance
Data Infrastructure
Data Integration
ETL
Data Security
Data Warehousing
Relational Databases
Dimensional Modeling
Distributed Computing Environment
Document-Oriented Databases
Identity and Access Management
Python
PostgreSQL
Linux System Administration
Machine Learning
Automation of Marketing
Shell Script
Software Engineering
SQL Databases
Database Optimization
Spark
Database Performance
GIT
PySpark
Information Technology
Machine Learning Operations
Software Version Control
Data Pipelines

Job description

The Data Engineer supports the design, development, and maintenance of scalable data pipelines that power analytics, reporting, and machine learning initiatives. Working under the guidance of senior engineers, this role contributes to building reliable ETL workflows, optimizing database performance, and integrating structured and unstructured data sources.

This position partners closely with data scientists, analysts, and cross-functional stakeholders to ensure timely, accurate, and secure data delivery. By strengthening foundational data infrastructure, the Junior Data Engineer helps advance analytics maturity, enable AI initiatives, and promote data-driven decision-making across the organization. The role consistently leverages AI tools to enhance productivity, code quality, and solution effectiveness.

  • Assist in designing, building, and maintaining scalable ETL/ELT data pipelines.
  • Develop and optimize batch and streaming workflows using tools such as AWS Glue, Spark, and Airflow.
  • Support data integration across multiple structured and unstructured data sources.
  • Write clean, efficient, and maintainable code in Python, PySpark and SQL.
  • Monitor, troubleshoot, and improve pipeline reliability and performance.
  • Optimize database performance, particularly in PostgreSQL and cloud-based environments.
  • Maintain and support AWS-based infrastructure (EC2, S3, Glue, etc.).
  • Implement data validation, quality checks, and monitoring processes.
  • Ensure compliance with data governance, security, and regulatory standards.
  • Collaborate with data scientists and analysts to translate data requirements into scalable engineering solutions.
  • Document data flows, architecture decisions, and technical processes.
  • Use AI-assisted development tools to improve speed, testing coverage, and code quality.

Requirements

Do you have experience in Python?
Do you have a Bachelor's degree?

  • Bachelor's degree in Computer Science, Data Engineering, Information Systems, or a related technical field (or equivalent practical experience).
  • 2+ years of experience in data engineering, software engineering, or related technical roles (internships included).
  • Proficiency in Python, PySpark and SQL.
  • Familiarity with ETL/ELT concepts and data pipeline architecture.
  • Experience working with relational databases such as PostgreSQL.
  • Basic understanding of cloud computing concepts, preferably AWS.
  • Exposure to distributed data processing frameworks such as Spark.
  • Experience working in Linux environments and basic shell scripting.
  • Strong analytical and problem-solving skills.
  • Ability to work collaboratively in a team environment under mentorship.
  • Strong written and verbal communication skills.

  • Experience working with medical claims data strongly preferred.
  • Hands-on experience with AWS services such as EC2, S3, Glue, and IAM.
  • Experience with workflow orchestration tools such as Apache Airflow.
  • Exposure to data warehousing concepts and dimensional modeling.
  • Familiarity with CI/CD pipelines and version control (e.g., Git).
  • Understanding of data security, governance, and compliance best practices.
  • Experience supporting machine learning pipelines or analytics platforms.
  • Demonstrated use of AI tools (e.g., code assistants, automation platforms) to improve development efficiency.
  • Physical Requirements: Work is performed in an office environment (either in our office or working from home) and requires the ability to work on a computer, operate standard office equipment, and work at a desk.

Benefits & conditions

  • Health insurance

  • 401(k) matching

  • Vision insurance

  • Dental insurance

  • Unlimited paid time off

  • Career development plan

Accessibility Notice: If you need reasonable accommodation for any part of the employment process due to a physical or mental disability, please send an email to careers@codoxo.com.

Benefits for You

  • Health, Dental, and Vision insurance with 100% employee premium coverage (Starts Day 1)

  • Unlimited PTO

  • Annual Professional Development stipend

  • Annual home office stipend

  • 401K Match (after 90 days)

About the company

This is a full-time role with Codoxo, NOT C2C.

Of the $3.8T we spend on healthcare in the United States annually, about a third is estimated to be lost to waste, fraud, and abuse. Codoxo is the premier provider of artificial intelligence-driven solutions and services that help healthcare companies and agencies proactively detect and reduce risks from fraud, waste, and abuse and ensure payment integrity. Codoxo helps clients manage costs across network management, clinical care, provider coding and billing, payment integrity, and special investigation units. Our software-as-a-service applications are built on our proven Forensic AI Engine, which uses patented AI-based technology to identify problems and suspicious behavior far faster and earlier than traditional techniques. We are venture-backed by some of the top investors in the country, with strong financials, and remain one of the fastest-growing healthcare AI companies in the industry.
