TELECOMMUTE Data Infrastructure Engineer
Job description
We are seeking a Data Infrastructure Engineer to build and operate the data platform that powers AI/ML analytics modules. You will design and implement scalable data ingestion pipelines, robust ETL/ELT, and a modern data lake / delta lake (lakehouse) on AWS. You'll also establish a managed metadata repository and governance layers (catalog, lineage, quality, access controls) and deliver automated cloud provisioning plus CI/CD for data pipelines to enable reliable, repeatable deployments across environments.
- Design and implement batch and streaming ingestion from APIs, relational databases, file drops, event streams, and external partners.
- Build and optimize ETL/ELT pipelines to produce curated, analytics-ready datasets for reporting and ML consumption.
- Implement incremental processing patterns, change data capture (CDC) approaches where appropriate, and data contract standards.
Deliver a Modern Lakehouse (Data Lake / Delta Lake)
- Build and manage a scalable lakehouse on AWS object storage (e.g., S3) using open table/file formats and delta/lakehouse concepts (e.g., ACID tables, schema evolution, time travel patterns).
- Optimize performance and cost through partitioning, compaction, lifecycle policies, and efficient compute/storage usage.
- Establish environment standards for dev/test/prod and consistent promotion across stages.
Metadata, Governance, Lineage & Quality (Trust Layer)
- Implement a managed metadata repository for dataset cataloging, ownership, glossary/definitions, tagging, and discoverability.
- Enable end-to-end lineage (source → transformations → consumption) to support auditability and impact analysis.
- Implement governance controls including policy-based access, data classification, retention, and secure data handling.
- Build operational data quality checks (freshness, completeness, validity, anomaly detection) and publish SLAs/SLOs.
AWS Automation + CI/CD for Data Pipelines
- Implement automated cloud provisioning in AWS using Infrastructure as Code (IaC) for consistent environments and secure-by-default baselines.
- Build and enhance CI/CD for data pipelines, including automated tests, validation gates, promotion workflows, and rollback strategies.
- Improve observability with metrics/logs/alerts, dashboards, runbooks, and incident response readiness.
Cross-Team Collaboration & Documentation
- Work closely with engineering, security, networking, and application teams to support mission needs and delivery timelines.
- Maintain high-quality engineering documentation including SOPs, system diagrams, and secure configuration baselines.
- Summarize and present findings and recommendations, both written and verbal, to technical and non-technical stakeholders.
Requirements
- Bachelor's degree in Engineering, IT, Computer Science, or related field (or equivalent experience).
- Zero (0) to two (2) years of experience.
- Experience building production data pipelines and/or data platforms.
- Knowledge of implementing data ingestion and ETL/ELT workflows, including data modeling and transformation best practices.
- Knowledge of building a data lake / delta lake (lakehouse) on AWS (or equivalent cloud) using object storage and modern table formats/patterns.
- Proficiency in SQL and one programming language commonly used for data engineering (Python preferred; Scala/Java acceptable).
- Knowledge of metadata management and governance: cataloging, lineage, ownership, access controls, classification, and policy enforcement.
- Knowledge of implementing automated AWS provisioning using IaC and operating across multiple environments.
- Proven experience developing RAG applications.
- Solid security fundamentals: IAM/least privilege, encryption, secrets management, secure SDLC practices.
- Must be able to OBTAIN and MAINTAIN a Federal or DoD "PUBLIC TRUST"; candidates must obtain approved adjudication of their PUBLIC TRUST prior to onboarding with Guidehouse. Candidates with an ACTIVE PUBLIC TRUST or SUITABILITY are preferred.
What Would Be Nice To Have:
- Hands-on experience with Databricks.
- Experience operating CI/CD pipelines for data workflows (testing, packaging, deployment automation, environment promotion).
- Hands-on experience utilizing modern DevOps practices, including tools like Git, Terraform, Jenkins, AWS CodePipeline, and Docker.
- Experience utilizing AI-assisted coding tools (e.g., GitHub Copilot, ChatGPT, Cursor, Kiro) to safely accelerate implementation while maintaining strict code quality through testing, code reviews, and security practices.
- Knowledge graph and Graph RAG experience, including:
  - Graph modeling and ontology/taxonomy alignment
  - Entity resolution and relationship extraction
  - Hybrid retrieval approaches combining graph traversal with semantic/vector search to improve grounding and explainability
Benefits & conditions
The annual salary range for this position is $65,000.00-$108,000.00. Compensation decisions depend on a wide range of factors, including but not limited to skill sets, experience and training, security clearances, licensure and certifications, and other business and organizational needs.
What We Offer:
Guidehouse offers a comprehensive, total rewards package that includes competitive compensation and a flexible benefits package that reflects our commitment to creating a diverse and supportive workplace.
Benefits include:
- Medical, Rx, Dental & Vision Insurance
- Personal and Family Sick Time & Company Paid Holidays
- Parental Leave
- 401(k) Retirement Plan
- Group Term Life and Travel Assistance
- Voluntary Life and AD&D Insurance
- Health Savings Account, Health Care & Dependent Care Flexible Spending Accounts
- Transit and Parking Commuter Benefits
- Short-Term & Long-Term Disability
- Tuition Reimbursement, Personal Development, Certifications & Learning Opportunities
- Employee Referral Program
- Corporate Sponsored Events & Community Outreach
- Care.com annual membership
- Employee Assistance Program
- Supplemental Benefits via Corestream (Critical Care, Hospital Indemnity, Accident Insurance, Legal Assistance and ID theft protection, etc.)
- Position may be eligible for a discretionary variable incentive bonus
About Guidehouse
Guidehouse is an Equal Opportunity Employer: Protected Veterans, Individuals with Disabilities, or any other basis protected by law, ordinance, or regulation.
Guidehouse will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of applicable law or ordinance including the Fair Chance Ordinance of Los Angeles and San Francisco.