SRE Observability Engineer

NTT DATA
Charing Cross, United Kingdom
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Remote
Charing Cross, United Kingdom

Tech stack

Data analysis
Bash
Monitoring of Systems
Openshift
Prometheus
Software Deployment
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Lightspeed
Grafana
Kubernetes Helm Charts
Backend
Kubernetes
User Administration

Job description

The Monitoring and Observability team is responsible for:

  • Collaborating across various organizations within the company to understand and develop observability solutions for enterprise-wide deployment at scale.
  • Managing the legacy monitoring stack across the Production Management organization within the company.
  • Driving the strategic delivery of end-to-end Observability solutions in the company.
  • Providing in-depth analysis with interpretive thinking to define problems and develop innovative solutions.
  • Directly impacting the business by influencing strategic functional decisions through advice, counsel, or provided services.
  • Persuading and influencing others through strong and comprehensive communication and diplomacy skills.
  • Operating with a global footprint.
  • Performing other duties and functions as assigned.

Requirements

  • OpenShift/Kubernetes Administration: Experience deploying, managing, and troubleshooting containerized applications on OpenShift/Kubernetes, including resource management and networking.

  • Grafana & Observability Stack:
      o Proficiency in administering Geneos ITRS at scale.
      o Proficiency in administering Grafana (user management, data sources, dashboards, alerts).
      o Working knowledge of Grafana backend components: Mimir (metrics), Loki (logs), and Tempo (traces).
      o Experience with Prometheus for metric collection and PromQL for querying.

  • Helm Chart Management: Experience with Helm for deploying applications, including creating, modifying, and managing Helm charts, library charts, and dependencies.

  • Technical Documentation: Ability to create clear and concise documentation for systems and processes.

Desired Skills:

  • Application Deployment: Ability to deploy applications using Lightspeed Enterprise.
  • Google Cloud Operations: Experience with Google Cloud operations.
  • Scripting & Automation: Experience with Bash or Python scripting for automating operational tasks.

Benefits & conditions

Our people are the most critical component of our long-term success and their health and wellbeing are our priority. You will enjoy a comprehensive, locally competitive benefits package.

About the company

NTT DATA is a $30 billion business and technology services leader, serving 75% of the Fortune Global 100. We are committed to accelerating client success and positively impacting society through responsible innovation. We are one of the world's leading AI and digital infrastructure providers, with unmatched capabilities in enterprise-scale AI, cloud, security, connectivity, data centers and application services. Our consulting and industry solutions help organizations and society move confidently and sustainably into the digital future. As a Global Top Employer, we have experts in more than 50 countries. We also offer clients access to a robust ecosystem of innovation centers as well as established and start-up partners. NTT DATA is a part of NTT Group, which invests over $3 billion each year in R&D.
