Senior AI Engineer - APM Features
Datadog
Paris, France
7 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Paris, France
Tech stack
A/B testing
Artificial Intelligence
Profiling
Cursor (Graphical User Interface Elements)
Distributed Systems
Datadog
Large Language Models
Build Management
Microservices
Job description
- Design and build AI-powered troubleshooting features for APM workflows using LLMs and agentic systems.
- Help users diagnose and resolve performance issues by synthesizing large volumes of observability data, including traces, metrics, and logs.
- Prototype, experiment, and iterate on AI-driven experiences, using evidence and user feedback to guide decisions and focus on real user value.
- Define inputs, outputs, and success criteria for LLM-based systems operating in evolving and sometimes ambiguous environments.
- Build agentic workflows with strong guardrails, balancing autonomy, safety, correctness, and reliability.
- Lead features end-to-end in collaboration with peers and partners, from problem discovery through production and iteration.
- Design and maintain evaluation loops, including offline evaluations, benchmarks, and A/B tests.
- Write and own production backend services, contributing to reliable, scalable systems.
Requirements
- A senior, product-minded engineer with experience shipping AI systems to production.
- Comfortable working in evolving problem spaces and proactively identifying meaningful opportunities to build.
- Hands-on experience with LLMs or agentic systems, including prompting, tooling, evaluation, and guardrails.
- Experience using AI coding tools such as Cursor, Claude Code, or similar, with the ability to reflect on what worked, what didn't, and why.
- A strong sense for correctness, failure modes, and how to measure and improve quality in AI systems.
- Comfortable experimenting, learning from outcomes, and iterating thoughtfully.
- Solid ML and applied science fundamentals, including experiment design and statistics.
Bonus points:
These are helpful but not required - we don't expect candidates to have experience with everything listed below.
- Exposure to agent frameworks, tool-use orchestration, retrieval-augmented generation (RAG), and indexing large-scale telemetry data.
- Familiarity with SLO/SLA practices and incident response.
- Hands-on experience with distributed tracing systems (OpenTelemetry, Datadog APM), profilers, or logs and metrics pipelines.
- Distributed systems fundamentals and familiarity with observability concepts.
Benefits & conditions
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Continuous professional development, product training, and career pathing
- Intradepartmental mentor and buddy program for in-house networking
- An inclusive company culture, ability to join our Community Guilds (Datadog employee resource groups)
- Access to Inclusion Talks, our internal panel discussions
- Free, global mental health benefits for employees and dependents age 6+
- Competitive global benefits
Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.