Monitoring Engineer

Humankind Global Recruitment
31 days ago

Role details

Contract type
Temporary to permanent
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Tech stack

Microsoft Windows
Bash
Databases
Linux
DevOps
DNS
Elasticsearch
Monitoring of Systems
Python
Nagios
Network Protocols
Ansible
Prometheus
Simple Network Management Protocol (SNMP)
TCP/IP
Scripting (Bash/Python/Go/Ruby)
Transport Layer Security
Cloud Monitoring
System Availability
Grafana
Reliability of Systems
Puppet
Splunk

Job description

  • Operational Excellence: Step into live monitoring operations, ensuring systems are running smoothly and SLAs are met.
  • Monitoring Transformation: Collaborate on a major transition of databases and monitoring solutions, with heavy use of Prometheus, Grafana, and Nagios.
  • Scripting & Automation: Write scripts (Python, Shell, Ansible, Puppet) to automate alerts, integrate monitoring solutions, and streamline workflows.
  • Tooling & Alerts: Configure, optimise, and maintain monitoring systems. Set up dashboards, pull requests, and integrations that ensure early warning and proactive response.
  • Cross-Functional Collaboration: Work closely with infra, DevOps, and application teams to resolve issues or escalate when required.
  • Continuous Improvement: Suggest and implement new approaches, contribute to best practices, and stay ahead of trends in observability and monitoring.
  • On-Call Rotation: Join the rota after ~6-9 months once you're confident and fully trained.

What Success Looks Like

  • Meeting SLAs for incident response and resolution.
  • Helping ensure high availability and reliability of systems under monitoring.
  • Smooth delivery of monitoring transitions and migrations.
  • Contributing scripts, dashboards, and processes that improve visibility and reduce noise.
  • Earning trust within the team and wider business as the "go-to" for monitoring expertise.

  • Impact from day one: You'll take ownership of live monitoring operations and quickly move onto major projects.
  • Growth opportunities: Shape our observability strategy and get exposure to AIOps, hybrid/multi-cloud monitoring, and large-scale migrations.
  • Supportive team culture: Work with a close-knit Spanish team (5 engineers led by Connie) while being part of a larger 60-person infrastructure function.
  • Global collaboration: Partner with colleagues across the UK and Europe.

Requirements

  • 2+ years of hands-on experience as a Monitoring Engineer.
  • Strong systems administration skills across Linux/Unix and Windows.
  • Proficiency with monitoring tools (Nagios, Prometheus, Grafana, Elastic Stack, Splunk, etc.).
  • Advanced scripting/automation with Python, Ansible, Puppet, or Shell.
  • Knowledge of network protocols and security best practices (TCP/IP, DNS, TLS, SNMP).
  • Strong troubleshooting and problem-solving skills in high-pressure environments.
  • Excellent documentation and communication skills.
  • Ability to thrive in a fast-paced, collaborative, and global team.
