Backend Infrastructure & Agentic AI Platforms Engineer

Blu Omega
McLean, United States of America
14 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$ 180K

Job location

Remote
McLean, United States of America

Tech stack

API
Artificial Intelligence
User Authentication
Azure
Encodings
Continuous Integration
Software Debugging
Software Design Patterns
Distributed Systems
Python
OAuth
OpenID
Azure
Search Technologies
Software Engineering
Management of Software Versions
Data Logging
Data Processing
Data Storage Technologies
Large Language Models
Backend
Containerization
AI Platforms
Infrastructure Automation Frameworks
Deployment Automation
Virtual Agents
Data Pipelines
Dynatrace
Api Management
Serverless Computing

Job description

Blu Omega is seeking a Backend Infrastructure & Agentic AI Platforms Engineer to support a federal program focused on next-generation autonomous AI capabilities. This role operates within a remote environment and is responsible for building and maintaining backend infrastructure and agentic AI systems supporting ARPA-H's mission. The position requires experience supporting backend infrastructure, AI platform engineering, and distributed systems within a mission-driven environment., * Own the end-to-end backend infrastructure for GRACE on Microsoft Azure, including Azure Functions, API Management, Container Apps, and Azure OpenAI Service

  • Manage data storage, retrieval pipelines, vector databases, and document indexing for internal knowledge search
  • Implement and maintain infrastructure as code for all environments
  • Develop and manage CI/CD pipelines, deployment automation, and release processes
  • Ensure monitoring, alerting, logging, distributed tracing, and incident response are in place for all systems
  • Manage secrets, API keys, and credential rotation
  • Track and optimize cost and token economics across LLM providers
  • Develop and maintain backend implementation of MCP, including server hosting, tool registration, and versioning
  • Design and evolve communication patterns for agent interoperability and external integrations
  • Build and operate RAG pipelines for document ingestion, embedding, and semantic search
  • Implement fallback, retry, and degradation patterns for AI service dependencies
  • Manage tool-calling infrastructure, including registration, execution, and observability
  • Build and maintain observability for agent workflows, including latency, throughput, and error rates
  • Implement evaluation pipelines for safety, regression, and grounding assessment
  • Define and enforce system-level SLOs and alerting procedures
  • Establish and improve coding standards, design reviews, and testing practices
  • Mentor team members and communicate technical decisions clearly
  • Ensure privacy, security, and compliance in all systems and data handling Required

Requirements

  • 7+ years of professional software engineering experience building and operating production systems
  • Proven experience in high-velocity environments with end-to-end product ownership
  • Strong proficiency in Python and at least one other backend language
  • Experience with distributed systems, APIs, data pipelines, and software design patterns
  • Hands-on experience with Microsoft Azure: Azure Functions, API Management, Container Apps, and Azure OpenAI Service
  • Experience with containerization, CI/CD, and infrastructure as code
  • Knowledge of authentication and identity systems (OAuth2, OIDC, Azure Entra ID)
  • Demonstrated ability to own production systems, including on-call support and incident debugging, * Experience with AI/LLM platform engineering and orchestration
  • Familiarity w:ith vector search, RAG pipelines, and semantic search infrastructure
  • Strong understanding of security, privacy, and compliance standards

Benefits & conditions

  • Salary Range: $100,000 - $180,000

Final compensation is based on technical skills, experience, and education.

Blu Omega Benefits & Perks:

  • Medical, Dental, and Vision coverage through national providers
  • 401(k) with company match (eligible after 6 months; vesting applies)
  • Company-paid Life and AD&D insurance with additional voluntary options
  • Short-term and long-term disability options
  • Employee Assistance Program (EAP) with 24/7 confidential support and mental health resources
  • Telehealth and virtual care options
  • Pet insurance, legal services, and identity theft protection options
  • Paid Time Off (PTO) and 11 paid federal holidays
  • Access to wellness programs, discounts, and lifestyle benefits

About the company

Blu Omega is a Woman-Owned Small Business (WOSB) delivering technology and cybersecurity solutions to federal agencies and enterprise clients nationwide. Headquartered in Ashburn, VA, we support mission-critical programs across civilian and defense sectors, including health, national security, and regulatory environments. We partner with government agencies and large integrators to provide expertise in cybersecurity operations, cloud and infrastructure modernization, data and analytics, and enterprise IT support. Our teams are experienced operating within federal contracting environments, supporting task orders, recompetes, and programs requiring cleared personnel and compliant delivery.

Apply for this position