Software Engineer - AI Infrastructure (SWE3) [D.26.0125]

Dover Networks LLC
Reisterstown, United States of America
8 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$285K

Job location

Reisterstown, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Cloud Engineering
Encodings
Distributed Systems
Python
Prometheus
Systems Integration
AI Infrastructure
Data Logging
High Performance Computing
System Availability
Grafana
AI Platforms
Kubernetes
BIG-IP Access Policy Manager (APM)

Job description

  • Design, implement, and optimize infrastructure for AI model inference at scale.
  • Lead the development and maintenance of production AI services and applications, including retrieval-augmented generation (RAG), autonomous agents, and emerging technologies.
  • Serve as technical lead for AI infrastructure initiatives, coordinating work across integrated teams.
  • Conduct regular one-on-ones and provide coaching, feedback, and support for assigned team members.
  • Act as the team point of contact (POC) for contract administration functions.
  • Navigate ambiguity and define solutions for complex, underspecified systems and requirements.
  • Establish new technical policies, standards, and governance frameworks where gaps exist.
  • Drive adoption of new technologies and practices across engineering teams.
  • Implement and oversee monitoring, logging, and observability solutions for AI services.
  • Ensure high availability, reliability, performance, and security of AI platform components.
  • Communicate effectively with stakeholders at multiple organizational levels.

Requirements

  • Extensive experience designing, building, and operating large-scale production systems.
  • Deep expertise in systems integration across diverse technologies and platforms.
  • Hands-on experience with cloud engineering in AWS.
  • Advanced proficiency with Kubernetes administration and deployment patterns.
  • Strong Python programming skills.
  • Experience implementing and scaling observability solutions (APM, OpenTelemetry, Grafana, Prometheus).
  • Proven ability to lead technical initiatives and influence organizational change.
  • Experience developing technical policies and governance frameworks.
  • Excellent communication, stakeholder management, and leadership skills.
  • Ability to balance hands-on engineering with leadership and coordination responsibilities.

Nice to have

  • Experience with AI inference serving technologies (vLLM, LiteLLM, etc.).
  • Previous experience with agentic frameworks (e.g., LangChain).
  • Knowledge of vector databases and embedding systems.
  • Experience with high-performance computing or distributed systems.
  • Track record of successfully driving technical and cultural change.

Experience requirement: 12 years of experience with a B.S. in a technical discipline, or 4 additional years of experience in place of the B.S.

Benefits & conditions

Salary range: $260k-$285k per year, plus an additional $65k-$71k in immediately vested company 401(k) contributions.
