Cloud Engineer
Job description
We are seeking a Cloud Engineer to join our Operations team, supporting the cloud, data and backend systems that underpin a range of Yunex Traffic products. You'll work closely with Infrastructure Support, Application Support and Development teams to keep services reliable, secure and well-run - with a strong focus on SQL-backed platforms and the data that drives them.
This role suits someone who enjoys variety: cloud operations, troubleshooting, and digging into SQL/data issues to find the real root cause. We're looking for an inquisitive mindset, solid fundamentals, and the drive to learn and improve things.
Responsibilities:
Cloud & Infrastructure Support
- Troubleshoot and resolve infrastructure and application incidents, working through symptoms to root cause.
- Carry out day-to-day operational tasks such as updates, configuration changes, health checks and environment verification (including database/service checks where relevant).
- Support and maintain AWS services used across our operational platforms (e.g. EC2, VPC, ECS/EKS, RDS, MSK, CloudFormation, CloudWatch).
System Operations & Monitoring
- Maintain and improve monitoring and alerting so services stay healthy, performant and supportable.
- Work within established incident and problem management processes, keeping notes clear and producing concise operational updates.
- Investigate SQL/database issues in live environments - performance, failed jobs, data integrity and access - and work with engineers to agree safe fixes.
- Support deployment activities (including database changes where applicable) and work with developers to ensure smooth releases to test and production environments.
Technical Collaboration
- Work with Application Support, Development, and QA teams to understand how systems operate end-to-end.
- Provide clear and effective communication during incidents and investigations.
- Contribute to technical discussions and help design and implement agreed solutions.
- Suggest improvements, automate repeatable tasks, and improve observability so we spot issues sooner.
Continuous Improvement & Documentation
- Identify opportunities to improve reliability, maintainability, testability and efficiency - from cloud configuration through to SQL/database performance.
- Keep documentation up to date: diagrams, operational procedures, runbooks and SQL/data support notes that help others move faster.
- Assist with automation using scripting or infrastructure-as-code, including repeatable operational checks and data maintenance tasks where appropriate.
Security & Compliance Support
- Support patching cycles, vulnerability reviews, and security investigations.
- Assist with DR testing, backups, and operational readiness activities.
- Follow internal policies and contribute to compliance requirements as needed.
Customer & Stakeholder Engagement
- Support customer queries and internal reviews relating to infrastructure or application behaviour.
- Provide clear, customer-focused communication when assisting with escalations or investigations.
Requirements
- Hands-on experience supporting production systems, with a good grounding in SQL and database-backed applications.
- Working knowledge of core AWS services (e.g., EC2, VPC, ECS/EKS, RDS, CloudFormation, CloudWatch) and how they fit together in real environments.
- Good understanding of Linux and/or Windows Server administration (including services that support databases and backend applications).
- Database experience (SQL Server and/or MySQL): performance troubleshooting, query plans/index basics, scheduled jobs, backups/restore concepts.
- Basic networking understanding (routing, firewalls, security groups, VPN concepts).
- A methodical approach to troubleshooting - comfortable working with logs, metrics, alerts and data to find what's really going on.
- Familiarity with scripting (PowerShell, Bash, Python, or SQL scripting) to automate operational tasks.
- Clear communication skills - able to explain technical issues simply and keep people informed during incidents.
- Ability to use AI-assisted tools effectively and integrate them into daily work.
- A proactive mindset: you spot patterns, ask good questions, and enjoy learning new tools and techniques.
Desirable
- Experience with CI/CD tools (e.g., GitLab CI/CD) and supporting release pipelines.
- Exposure to infrastructure-as-code (CloudFormation, CDK, Terraform) and version-controlled change.
- Agentic AI development skills across various LLMs.
- Experience with observability tooling (dashboards, log aggregation, tracing) and using data to improve on-call and reliability.
- Experience with queuing/messaging technologies (Kafka, RabbitMQ, ActiveMQ).
- Knowledge of traffic management or intelligent transport systems.
Benefits & conditions
- Competitive package including an annual bonus.
- Continuous training and learning opportunities to support career development.
- 26 days of holiday, increasing up to 29 with length of service.
- 37.5-hour working week.
- Excellent pension, with matching contributions up to 10% of pensionable salary.
- Flexible benefits package to suit your personal needs.
- Investment in personal development and support for membership of professional institutions.