TELECOMMUTE Data Infrastructure Engineer

Guidehouse Inc.
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$65K to $108K

Job location

Remote

Tech stack

Java
API
Artificial Intelligence
Amazon Web Services (AWS)
Automation of Tests
Software Quality
Code Review
Continuous Integration
Information Engineering
Data Infrastructure
ETL
Data Security
Relational Databases
Cursor (AI-Assisted Coding Tool)
DevOps
Graph Database
Identity and Access Management
Python
Key Management
Metadata
Meta-Data Management
Metadata Repositories
Operational Data Store
Operational Databases
Systems Development Life Cycle
Role-Based Access Control
Runbook
Search Technologies
Simple Data Format
SQL Databases
Data Classification
Data Ingestion
GitHub Copilot
Delivery Pipeline
Change Data Capture
Infrastructure as Code (IaC)
GIT
Data Lake
Information Technology
Deployment Automation
Data Management
Terraform
GPT
Data Pipelines
Docker
Jenkins
Databricks

Job description

We are seeking a Data Infrastructure Engineer to build and operate the data platform that powers AI/ML analytics modules. You will design and implement scalable data ingestion pipelines, robust ETL/ELT, and a modern data lake / delta lake (lakehouse) on AWS. You'll also establish a managed metadata repository and governance layers (catalog, lineage, quality, access controls) and deliver automated cloud provisioning plus CI/CD for data pipelines to enable reliable, repeatable deployments across environments.

  • Design and implement batch and streaming ingestion from APIs, relational databases, file drops, event streams, and external partners.
  • Build and optimize ETL/ELT pipelines to produce curated, analytics-ready datasets for reporting and ML consumption.
  • Implement incremental processing patterns, change data capture (CDC) approaches where appropriate, and data contract standards.
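
As an illustration of the incremental processing pattern named above, here is a minimal sketch of a high-water-mark load in plain Python. It is not Guidehouse's implementation; the `Row` type, the `updated_at` column, and the function names are hypothetical, and a production pipeline would typically rely on a CDC tool or a lakehouse merge operation instead.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Row:
    # Hypothetical source record; updated_at is assumed to be set on every change.
    id: int
    updated_at: datetime
    payload: str


def incremental_batch(rows, watermark):
    """Return rows changed after `watermark`, oldest first, plus the new watermark."""
    changed = sorted(
        (r for r in rows if r.updated_at > watermark),
        key=lambda r: r.updated_at,
    )
    new_watermark = changed[-1].updated_at if changed else watermark
    return changed, new_watermark


def apply_upserts(target, changed):
    """Upsert changed rows into a dict keyed by id; later changes overwrite earlier ones."""
    for row in changed:
        target[row.id] = row
    return target
```

Each run processes only rows newer than the stored watermark, so rerunning with the returned watermark yields an empty batch.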

Deliver a Modern Lakehouse (Data Lake / Delta Lake)

  • Build and manage a scalable lakehouse on AWS object storage (e.g., S3) using open table/file formats and delta/lakehouse concepts (e.g., ACID tables, schema evolution, time travel patterns).
  • Optimize performance and cost through partitioning, compaction, lifecycle policies, and efficient compute/storage usage.
  • Establish environment standards for dev/test/prod and consistent promotion across stages.

Metadata, Governance, Lineage & Quality (Trust Layer)

  • Implement a managed metadata repository for dataset cataloging, ownership, glossary/definitions, tagging, and discoverability.
  • Enable end-to-end lineage (source → transformations → consumption) to support auditability and impact analysis.
  • Implement governance controls including policy-based access, data classification, retention, and secure data handling.
  • Build operational data quality checks (freshness, completeness, validity, anomaly detection) and publish SLAs/SLOs.
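
To make the data quality checks above concrete, the following is a small sketch of a freshness check and a completeness check using only the Python standard library. The function names, the `min_ratio` default, and the record layout (a list of dicts) are assumptions made for illustration, not part of any specific tool named in this posting.

```python
from datetime import datetime, timedelta


def check_freshness(last_loaded_at, now, max_age):
    """Pass when the dataset was loaded within `max_age` of `now`."""
    return now - last_loaded_at <= max_age


def check_completeness(records, required_fields, min_ratio=0.99):
    """Pass when at least `min_ratio` of records have every required field populated."""
    if not records:
        return False  # an empty dataset is treated as incomplete
    complete = sum(
        all(rec.get(field) is not None for field in required_fields)
        for rec in records
    )
    return complete / len(records) >= min_ratio
```

Checks like these would typically run after each load, with the pass/fail results feeding the published SLAs/SLOs.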

AWS Automation + CI/CD for Data Pipelines

  • Implement automated cloud provisioning in AWS using Infrastructure as Code (IaC) for consistent environments and secure-by-default baselines.
  • Build and enhance CI/CD for data pipelines, including automated tests, validation gates, promotion workflows, and rollback strategies.
  • Improve observability with metrics/logs/alerts, dashboards, runbooks, and incident response readiness.

Cross-Team Collaboration & Documentation

  • Work closely with engineering, security, networking, and application teams to support mission needs and delivery timelines.
  • Maintain high-quality engineering documentation including SOPs, system diagrams, and secure configuration baselines.
  • Summarize and present findings and recommendations, both written and verbal, to technical and non-technical stakeholders.

Requirements

  • Bachelor's degree in Engineering, IT, Computer Science, or related field (or equivalent experience).
  • Zero (0) to two (2) years of experience.
  • Experience building production data pipelines and/or data platforms.
  • Knowledge of implementing data ingestion and ETL/ELT workflows, including data modeling and transformation best practices.
  • Knowledge of building a data lake / delta lake (lakehouse) on AWS (or equivalent cloud) using object storage and modern table formats/patterns.
  • Proficiency in SQL and one programming language commonly used for data engineering (Python preferred; Scala/Java acceptable).
  • Knowledge of metadata management and governance: cataloging, lineage, ownership, access controls, classification, and policy enforcement.
  • Knowledge of implementing automated AWS provisioning using IaC and operating across multiple environments.
  • Proven experience developing retrieval-augmented generation (RAG) applications.
  • Solid security fundamentals: IAM/least privilege, encryption, secrets management, secure SDLC practices.
  • Must be able to OBTAIN and MAINTAIN a Federal or DoD "PUBLIC TRUST"; candidates must obtain approved adjudication of their PUBLIC TRUST prior to onboarding with Guidehouse. Candidates with an ACTIVE PUBLIC TRUST or SUITABILITY are preferred.

What Would Be Nice To Have:

  • Hands-on experience with Databricks.
  • Experience in operating CI/CD pipelines for data workflows (testing, packaging, deployment automation, environment promotion).
  • Hands-on experience utilizing modern DevOps practices, including tools like Git, Terraform, Jenkins, AWS CodePipeline, and Docker.
  • Experience utilizing AI-assisted coding tools (e.g., GitHub Copilot, ChatGPT, Cursor, Kiro) to safely accelerate implementation while maintaining strict code quality through testing, code reviews, and security practices.
  • Knowledge graph and Graph RAG experience, including:
      • Graph modeling and ontology/taxonomy alignment
      • Entity resolution and relationship extraction
      • Hybrid retrieval approaches combining graph traversal with semantic/vector search to improve grounding and explainability
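
For readers unfamiliar with the hybrid retrieval idea mentioned above, here is a toy sketch that blends a semantic similarity score with graph proximity. It is an illustration under stated assumptions, not a reference Graph RAG design: the adjacency-dict graph, the in-memory embeddings, the `graph_weight` blend, and all function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def neighbors_within(graph, seeds, max_hops):
    """Breadth-first traversal: map each node within `max_hops` of a seed to its hop distance."""
    dist = {seed: 0 for seed in seeds}
    frontier = list(seeds)
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, ()):
                if neighbor not in dist:
                    dist[neighbor] = hop
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return dist


def hybrid_rank(query_vec, embeddings, graph, seeds, max_hops=2, graph_weight=0.5):
    """Rank nodes by a weighted blend of vector similarity and closeness to seed entities."""
    dist = neighbors_within(graph, seeds, max_hops)
    scores = {}
    for node, vec in embeddings.items():
        semantic = cosine(query_vec, vec)
        proximity = 1.0 / (1 + dist[node]) if node in dist else 0.0
        scores[node] = (1 - graph_weight) * semantic + graph_weight * proximity
    return sorted(scores, key=scores.get, reverse=True)
```

The hop distances also give a simple explanation for why a result was retrieved, which is one way graph traversal can support explainability.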

Benefits & conditions

The annual salary range for this position is $65,000.00-$108,000.00. Compensation decisions depend on a wide range of factors, including but not limited to skill sets, experience and training, security clearances, licensure and certifications, and other business and organizational needs.

What We Offer:

Guidehouse offers a comprehensive, total rewards package that includes competitive compensation and a flexible benefits package that reflects our commitment to creating a diverse and supportive workplace.

Benefits include:

  • Medical, Rx, Dental & Vision Insurance
  • Personal and Family Sick Time & Company Paid Holidays
  • Parental Leave
  • 401(k) Retirement Plan
  • Group Term Life and Travel Assistance
  • Voluntary Life and AD&D Insurance
  • Health Savings Account, Health Care & Dependent Care Flexible Spending Accounts
  • Transit and Parking Commuter Benefits
  • Short-Term & Long-Term Disability
  • Tuition Reimbursement, Personal Development, Certifications & Learning Opportunities
  • Employee Referral Program
  • Corporate Sponsored Events & Community Outreach
  • Care.com annual membership
  • Employee Assistance Program
  • Supplemental Benefits via Corestream (Critical Care, Hospital Indemnity, Accident Insurance, Legal Assistance and ID theft protection, etc.)
  • Position may be eligible for a discretionary variable incentive bonus

About Guidehouse

Guidehouse is an Equal Opportunity Employer-Protected Veterans, Individuals with Disabilities or any other basis protected by law, ordinance, or regulation.

Guidehouse will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of applicable law or ordinance including the Fair Chance Ordinance of Los Angeles and San Francisco.
