Senior Software Engineer & LLM Code Trainer
CHATGPT LLC
San Francisco, United States of America
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Remote; San Francisco, United States of America
Tech stack
Java
JavaScript
API
Artificial Intelligence
Automation of Tests
Bash
C++
Command-Line Interface
Code Generation
Information Engineering
Software Debugging
Linux
DevOps
Python
Software Engineering
SQL Databases
TypeScript
Large Language Models
Backend
Git
Containerization
Free and Open-Source Software
Code Restructuring
Docker
Go
Programming Languages
Job description
The company is looking for a Senior Software Engineer to contribute to the development and evaluation of AI training data for a leading expert human data platform for AI agents and LLMs. This unique role sits at the intersection of software engineering and artificial intelligence, helping companies build better, safer, and more capable models.
- Create and review coding tasks based on real-world software engineering scenarios (debugging, refactoring, code generation, API usage, automated tests, performance, security, edge cases).
- Write high-quality reference solutions that are correct, clear, testable, and aligned with requirements.
- Evaluate AI-generated code and responses using structured rubrics (correctness, clarity, security, performance, maintainability, instruction-following).
- Compare multiple model responses, select the strongest answer, and justify decisions with technical reasoning.
- Identify bugs, hallucinated APIs, missing edge cases, weak explanations, and poor engineering decisions in AI outputs.
- Work with terminal-based development workflows (testing, debugging, managing dependencies, navigating repositories).
- Follow detailed guidelines consistently and participate in calibration activities to ensure high-quality evaluations.
Requirements
- Experience: 5+ years of professional software engineering experience in a backend, fullstack, or systems role.
- Programming Languages: Strong proficiency in at least one core language (Python, JavaScript/TypeScript, Go, Java, C++, or SQL).
- Tools: Hands-on experience with Terminal-Bench, Git, command line/terminal, and common development workflows.
- Evaluation Skills: Ability to evaluate code critically regarding design, security, and maintainability.
- AI Experience: Prior experience in AI data production, RLHF, data annotation, or LLM evaluation projects preferred.
- Communication: Excellent written and verbal communication skills in English.
- Work Style: Ability to work independently in a remote, asynchronous, fast-paced environment with high attention to detail.
Nice-to-Have
- Experience with Python-heavy workflows, automated testing frameworks, Docker, Linux, bash, or containerized environments.
- Experience with repo-level code reasoning, large codebases, or open-source contributions.
- Background in backend systems, data engineering, DevOps, infrastructure, or security.