LLM Developer

CAMPAIGNS & ELECTIONS MAGAZINE
Manhattan, United States of America
21 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$220K

Job location

Manhattan, United States of America

Tech stack

API
Artificial Intelligence
Python
Performance Tuning
Large Language Models
Prompt Engineering
Backend
AI Platforms
Low Latency
Machine Learning Operations
API Design
GPT

Job description

As an LLM Developer, you will own the full lifecycle of LLM-powered systems - from model selection and prompt design to inference optimization and production deployment.

You will work on systems that must be fast, accurate, and context-aware in real time. Your work directly impacts user experience during live interviews.

You are expected to think deeply about model behavior, latency, cost, and output quality - and continuously improve all four.

  • Implement streaming responses, caching, and batching strategies
  • Improve token efficiency and context window management
  • Reduce cost through model routing and optimization techniques

Fine-Tuning & Model Customization

  • Fine-tune models for domain-specific use cases
  • Build and manage training datasets and evaluation pipelines
  • Apply techniques such as LoRA, QLoRA, and RLHF
  • Run experiments to compare model performance across configurations

Prompt Engineering & Quality Systems

  • Design system prompts and instruction architectures
  • Build evaluation frameworks for hallucination, accuracy, and relevance
  • Perform root-cause analysis of model failures
  • Continuously iterate on prompt and retrieval strategies

Evaluation & Monitoring

  • Build LLM evaluation pipelines and benchmarks
  • Monitor production metrics including latency, quality, and cost
  • Detect model drift and regressions
  • Optimize token usage and API efficiency

Safety & Reliability

  • Implement guardrails against prompt injection and unsafe outputs
  • Test adversarial inputs and edge cases
  • Ensure privacy-first handling of user data
  • Improve robustness of real-time AI responses

Requirements

  • 3+ years experience building LLM or AI systems in production
  • Strong experience with RAG systems and prompt engineering
  • Proficiency in Python and LLM frameworks (LangChain, LlamaIndex, or similar)
  • Strong understanding of transformer architectures
  • Experience with vector databases and embeddings pipelines
  • Experience building APIs or backend systems for AI services
  • Experience with evaluation and monitoring of ML systems

  • Experience with real-time or low-latency LLM systems
  • Background in speech + multimodal AI systems
  • Experience with agentic workflows and tool-use LLMs
  • Familiarity with model compression and optimization
  • Experience with LLM safety, guardrails, and adversarial testing
  • Contributions to open-source AI or LLM projects
  • Startup or early-stage company experience

  • Optional: GitHub, portfolio, or technical writing samples

About the company

LockedIn AI is the #1 real-time AI interview and meeting copilot, trusted globally by over 1 million users. We build AI-powered systems that help candidates perform better in high-stakes career moments by providing real-time assistance during interviews and professional conversations.

LockedIn AI is looking for a deeply technical LLM Developer to design, build, and optimize the large language model systems powering our real-time AI copilot used by over 1 million users. This is a core engineering role where you will directly shape how our AI listens, reasons, and responds during live interviews, coding assessments, and professional meetings.

Our platform sits at the intersection of:

  • Large language models
  • Real-time inference systems
  • Retrieval systems and embeddings
  • High-scale production infrastructure
