Senior Software Engineer - AI Infrastructure
NEURAL SOLUTIONS LLC
Columbia, United States of America
9 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: $247K
Job location: Columbia, United States of America
Tech stack
Artificial Intelligence
Amazon Web Services (AWS)
Cloud Engineering
Encodings
Distributed Systems
Python
Prometheus
Systems Integration
AI Infrastructure
Data Logging
High Performance Computing
System Availability
Grafana
AI Platforms
Kubernetes
BIG-IP Access Policy Manager (APM)
Job description
Join us in building the next generation of AI infrastructure that will power innovation across our organization. We're seeking a senior full-stack software engineer to support our AI infrastructure team in Columbia, MD.
Responsibilities:
- Design, implement, and optimize infrastructure for AI model inference at scale.
- Lead the development and maintenance of production AI services and applications, including retrieval augmented generation (RAG), autonomous agents, and emerging technologies.
- Serve as technical lead for AI infrastructure initiatives, coordinating work across integrated teams.
- Conduct regular one-on-ones and provide coaching, feedback, and support for assigned team members.
- Navigate ambiguity and define solutions for complex, underspecified systems and requirements.
- Establish new technical policies, standards, and governance frameworks where gaps exist.
- Drive adoption of new technologies and practices across engineering teams.
- Implement and oversee monitoring, logging, and observability solutions for AI services.
- Ensure high availability, reliability, performance, and security of AI platform components.
- Communicate effectively with stakeholders at multiple organizational levels.
Requirements:
- Extensive experience designing, building, and operating large-scale production systems.
- Deep expertise in systems integration across diverse technologies and platforms.
- Hands-on experience with cloud engineering in AWS.
- Advanced proficiency with Kubernetes administration and deployment patterns.
- Strong Python programming skills.
- Experience implementing and scaling observability solutions (APM, OpenTelemetry, Grafana, Prometheus).
- Proven ability to lead technical initiatives and influence organizational change.
- Experience developing technical policies and governance frameworks.
- Excellent communication, stakeholder management, and leadership skills.
Nice to Have:
- Experience with AI inference serving technologies (vLLM, LiteLLM, etc.).
- Previous experience with agentic frameworks (LangChain).
- Knowledge of vector databases and embedding systems.
- Experience with high-performance computing or distributed systems.
- Track record of successfully driving technical and cultural change.
Experience Required: 12 years with a Bachelor's degree in a technical field, or 16 years without a degree.