Staff Software Engineer

Community Of
Municipality of Madrid, Spain
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Municipality of Madrid, Spain

Tech stack

Artificial Intelligence
Cloud Computing
Databases
Distributed Systems
Load Testing
MySQL
Ruby on Rails
Reliability Engineering
Datadog
Performance Testing
Low Latency
Kafka
GraphQL
Vertica

Requirements

Overview

Your role as a Staff Software Engineer on Factorial's DX and Performance team focuses on performance, reliability, observability, load testing, and AI-assisted engineering workflows across our product and infrastructure.

Responsibilities

Define and evolve SLIs and SLOs for critical product journeys.
Improve and standardize observability, dashboards, and service-health visibility across teams.
Investigate bottlenecks and regressions across application, database, asynchronous, and system layers.
Drive improvements in latency, throughput, scalability, and reliability.
Build structured load-testing workflows for critical paths.
Help teams validate system behavior under realistic traffic, concurrency, and tenant-scale conditions.
Analyze capacity, saturation, and behavior under peak load and growth scenarios.
Define practices and tooling to prevent performance regressions before production.
Work closely with product and infrastructure teams to align on performance priorities and system behavior under load.
Design AI-assisted workflows to support metric and alert interpretation, anomaly analysis, incident investigation, performance insights generation, and more.

Qualifications

Strong hands-on experience improving performance, scalability, and reliability in complex software systems.
Experience defining or operating SLIs, SLOs, and service-health frameworks.
Strong knowledge of observability practices and tools such as Datadog.
Experience investigating production bottlenecks across application, database, and distributed system layers.
Experience building or improving load-testing, benchmarking, or performance validation workflows.
Experience diagnosing tail-latency, throughput issues, and performance variability in production.
Broad experience working with cloud-based production systems.
Strong communication skills, including technical writing and cross-team alignment.
Proactive mindset and strong ownership mentality.

Preferred Experience

Significant experience building and operating production systems at scale.
Experience working in large-scale environments with meaningful traffic and operational complexity.
Experience with Ruby on Rails, MySQL, Kafka, GraphQL, ClickHouse, or equivalent technologies.
Previous experience in Performance Engineering or Reliability Engineering.
Interest in modern AI tools and practical use of agentic workflows in engineering.

Benefits

High-growth, multicultural, and friendly environment.
Private health insurance (Alan).
Wellness program with gym, pool, and outdoor classes (Wellhub).
Performance-based bonuses and equity.
Paid parental leave and flexible working arrangements.
Commitment to equal opportunities and workplace inclusion of people with disabilities.
