Senior Data Engineer

CVS Health
Columbia, United States of America
9 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$222K

Job location

Remote
Columbia, United States of America

Tech stack

API
Agile Methodologies
Airflow
Amazon Web Services (AWS)
Business Analytics Applications
Data analysis
Azure
Bash
Big Data
Google BigQuery
Unix
Cloud Computing
Cloud Engineering
Cloud Storage
Software Quality
Information Systems
Continuous Integration
Information Engineering
Data Governance
ETL
Data Security
Data Structures
Data Warehousing
DevOps
Data Flow Control
GitHub
Identity and Access Management
Java Web Services
Python
Machine Learning
Metadata
NoSQL
Query Optimization
Cloud Services
Service-Oriented Architecture
SQL Databases
Data Streaming
Systems Integration
Unix Commands
Unstructured Data
Data Logging
Data Processing
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Cloud Platform System
Cloud Monitoring
Build Server
Git
Data Lake
Kubernetes
Information Technology
Data Analytics
Kafka
Spark Streaming
Data Management
Video Streaming
API Design
Software Version Control
Data Pipelines
Serverless Computing
Jenkins
Programming Languages
Microservices

Job description

If you're eager to make a real impact in the health care industry through your own meaningful contributions, explore a role in technology with CVS Health. Our journey calls for technical innovators and data visionaries: come help us pave the way.

At CVS Health, we possess an extensive repository of healthcare data that spans over 150 million individuals, providing an unparalleled foundation for ambitious Data Engineers. In this role, you will engage with complex business challenges, harnessing modern tools and technologies to securely store, process, transform, and enrich terabyte to petabyte scale healthcare data. Your work will underpin data-driven business decisions and contribute to our mission of delivering industry-best data products / software with a customer-first mindset and team-oriented approach.

As a Senior Data Engineer, you will be instrumental in designing, developing, and maintaining optimal data pipelines to assemble large and intricate datasets, catering to the business requirements of various CVS lines of business. Collaborating closely with teams, you will craft tools to provide actionable insights and integrate them with consumer touchpoints.

In this role, you will:

  • Architect and develop robust, scalable ETL/ELT pipelines using Cloud Dataflow, Cloud Composer (Airflow), and Pub/Sub for both batch and streaming use cases. Leverage BigQuery as the central data warehouse and design integrations with other GCP services (e.g., Cloud Storage, Cloud Functions).

  • Build and optimize analytical data models in BigQuery. Implement partitioning, clustering, and materialized views for performance and cost efficiency. Ensure compliance with data governance, access controls, and IAM best practices.

  • Develop integrations with external systems (APIs, flat files, etc.) using GCP-native or hybrid approaches. Utilize tools like Dataflow or custom Python/Java services on Cloud Functions or Cloud Run to handle transformations and ingestion logic.

  • Build automated CI/CD pipelines using Cloud Build, GitHub Actions, or Jenkins for deploying data pipeline code and workflows. Set up observability using Cloud Monitoring, Cloud Logging, and Error Reporting to ensure pipeline reliability.

  • Lead architectural decisions for data platforms and mentor junior engineers on cloud-native data engineering patterns. Promote best practices for code quality, version control, cost optimization, and data security in a GCP environment. Drive initiatives around data democratization, including building reusable datasets and data catalogs via Dataplex or Data Catalog.

As leaders in healthcare, our analytics and engineering teams deliver innovative solutions to business problems by collaborating with cross-functional teams in a dynamic and agile environment. You will be part of a team that values collaboration and encourages innovative thinking at all levels. You will be intellectually challenged to solve problems associated with large-scale, complex, structured and unstructured data, which will allow you to grow your technical skills and engineering expertise.

Requirements

  • 3+ years of experience with SQL and NoSQL

  • 3+ years of experience with Python (or a comparable scripting language)

  • 3+ years of experience with data warehouses (such as data modeling and technical architectures) and infrastructure components

  • 3+ years of experience with ETL/ELT, and building high-volume data pipelines

  • 3+ years of experience with reporting/analytic tools

  • 3+ years of experience with query optimization, data structures, transformation, metadata, dependency, and workload management

  • 3+ years of experience with big data and cloud architecture

  • 3+ years of hands-on experience building modern data pipelines within a major cloud platform (GCP, AWS, Azure)

  • 3+ years of experience with deployment/scaling of apps in containerized environments (e.g., Kubernetes, AKS)

  • 3+ years of experience with real-time and streaming technology (e.g., Azure Event Hubs, Azure Functions, Kafka, Spark Streaming)

  • 1+ year(s) of experience soliciting complex requirements and managing relationships with key stakeholders

  • 1+ year(s) of experience independently managing deliverables

Preferred Qualifications

  • Experience in designing and building data engineering solutions in cloud environments (preferably GCP)

  • Experience with Git, CI/CD pipelines, and other DevOps principles/best practices

  • Experience with Bash shell scripts, UNIX utilities, and UNIX commands

  • Ability to leverage multiple tools and programming languages to analyze and manipulate data sets from disparate data sources

  • Knowledge of API development

  • Experience with complex systems and solving challenging analytical problems

  • Strong collaboration and communication skills within and across teams

  • Knowledge of data visualization and reporting

  • Experience with schema design and dimensional data modeling

  • Google Professional Data Engineer Certification

  • Knowledge of microservices and SOA

  • Formal SAFe and/or agile experience

  • Previous healthcare experience and domain knowledge

  • Experience designing, building, and maintaining data processing systems

  • Experience architecting and building data warehouses and data lakes

Education

  • Bachelor's Degree or equivalent work experience in Computer Science, Information Systems, Data Engineering, Data Analytics, Machine Learning, or related field required

  • Master's Degree preferred

Benefits & conditions

This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above.

Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve and we are committed to fostering a workplace where every colleague feels valued and that they belong.

Great benefits for great people

We take pride in offering a comprehensive and competitive mix of pay and benefits that reflects our commitment to our colleagues and their families.

This full-time position is eligible for a comprehensive benefits package designed to support the physical, emotional, and financial well-being of colleagues and their families. The benefits for this position include medical, dental, and vision coverage, paid time off, retirement savings options, wellness programs, and other resources, based on eligibility.
