Product Manager, Platform Operations & Observability - Evinova
Job description
We are looking for a Product Manager who thinks like an engineer, communicates like a strategist, and knows what good platform operations look like from the inside. You will own the product backlog and roadmap for Platform Operations and Observability at Evinova, working alongside the engineering lead for the team to define what gets built, in what order, and why.
Platform Operations is responsible for the reliability, security, observability, and cost efficiency of Evinova's production platform. The team runs 24/7 operational support, owns incident response, drives the observability program, manages disaster recovery, and leads FinOps across the platform. Your job is to make sure their work is well-prioritized, clearly scoped, and connected to the outcomes that matter to the business.
You will partner closely with the Platform Operations engineering lead within a globally distributed team, so you will be coordinating across time zones regularly. This is a Barcelona-based role requiring 60% time in the office.
Accountabilities:
Own the Roadmap
- Define and maintain a clear product roadmap for observability, monitoring, alerting, incident response tooling, and FinOps capabilities.
- Run the PlatformOps backlog: refinement, story writing, prioritization, and ensuring the team always knows what is next and why.
- Track platform operational metrics including MTTD, MTTR, repeat incident rates, and cost trends, and translate those signals into prioritization decisions.
- Communicate roadmap progress and platform health to engineering leads and senior stakeholders.
Elevate Observability
- Own the product requirements for how Evinova monitors and alerts on its platform, covering work across Splunk, AWS CloudWatch, Grafana, and related tooling.
- Drive signal quality improvements: less noise, faster detection, and actionable alerts.
- Collaborate with product and engineering teams across Evinova to make observability a standard part of how new services get built, not something added later.
- Define the onboarding standards and templates that make it straightforward for teams to bring new services into the observability platform consistently.
Reduce Operational Toil
- Translate the gap between reactive and proactive operations into a concrete product plan with clear priorities.
- Own requirements for incident response tooling, on-call workflows, post-incident review processes, and escalation automation.
- Build the product case for automation where manual processes are slowing the team down or introducing risk.
- Scope and prioritize Disaster Recovery and Backup capabilities to ensure RTO/RPO commitments can be met and regularly tested.
Keep Costs in Check
- Own the requirements for cost visibility tooling that surfaces cloud spend by environment, service, and workload in ways teams can act on.
- Treat cost optimization as an ongoing product discipline, not a one-time project.
- Build clear, evidence-based business cases for operational investments.
Bridge Operations and the Business
- Translate platform health, incidents, and operational trade-offs into updates that make sense to senior leadership and stakeholders.
- Work with engineering, security, and compliance teams to keep platform operations aligned with Evinova's regulatory requirements and risk posture.
- Represent the voice of the engineering teams that depend on PlatformOps in every roadmap and prioritization conversation.
Requirements
- 5 or more years in product management, with hands-on experience owning a product in a platform, infrastructure, SRE, DevOps, or cloud operations context.
- Proven ability to define and execute against a product roadmap in an agile team, including writing clear user stories, running refinement, and making prioritization calls.
- Solid working knowledge of AWS platform operations, enough to have credible technical conversations with senior engineers and challenge assumptions.
- Familiarity with observability tooling: Splunk, Datadog, AWS CloudWatch, Grafana, or equivalent.
- Working knowledge of SRE concepts: SLOs, SLIs, error budgets, MTTD/MTTR, on-call practices, and blameless post-incident review.
- Strong written and verbal communication skills, with the ability to adapt for both engineering and executive audiences.
- Comfortable working in agile teams and fluent with Jira, Confluence, and related tooling.
Desirable Skills/Experience:
- Background in health tech, regulated software, or GxP environments.
- Experience with FinOps practices and cloud cost attribution models.
- Familiarity with incident and change management frameworks.
- Experience with on-call tooling such as PagerDuty or Splunk On-Call.
- Degree in Computer Science, Engineering, or a related technical field.
When we put unexpected teams in the same room, we unleash bold thinking with the power to inspire life-changing medicines. In-person working gives us the platform we need to connect, work at pace and challenge perceptions. That's why we work, on average, a minimum of three days per week from the office. But that doesn't mean we're not flexible. We balance the expectation of being in the office while respecting individual flexibility. Join us in our unique and ambitious world.