Senior Systems and Platform Engineer
Job description
We are seeking a skilled and experienced Senior Systems and Platform Engineer to join our
team. This critical role blends hands-on web administration with strategic platform
management, making you responsible for the implementation, maintenance, security, and
performance of our core web infrastructure and customer-facing platforms.
The ideal candidate will manage and maintain our servers, applications, databases, and
caches, ensuring their smooth operation and optimal performance.
You will manage system integration, operations, and upgrades, and act as the key technical
contact for both internal teams and customer stakeholders.
This position requires a blend of deep technical, hands-on work and strong collaborative
skills to ensure our systems scale effectively and perform reliably.
Platform & Server Administration
- Configure, maintain, and optimise server software for high availability and security.
- Manage and troubleshoot web applications.
- Manage, scale, and maintain performant database clusters and cache clusters.
- Configure and integrate associated hardware, such as media players, displays, and network components.
- Coordinate and implement all system software updates, patches, and configuration changes.
Monitoring, Performance & Scaling
- Monitor server and platform performance, device connectivity, and capacity, implementing scaling strategies as needed.
- Ensure the overall health, high availability, and reliability of all web and platform deployments to meet SLAs.
- Provide technical leadership on scaling the infrastructure footprint to new stores or locations.
Integration & Troubleshooting
- Diagnose and resolve complex technical issues across hardware, software, applications, and networks to maintain high uptime.
- Support and manage infrastructure across cloud, on-premises, and hybrid environments.
Security, Compliance & Collaboration
- Implement and maintain robust web security measures to protect against vulnerabilities and cyber threats.
- Ensure compliance with web standards and protocols, including HTTP, SSL/TLS, DNS, and FTP.
- Perform regular system backups and execute disaster recovery procedures.
- Serve as the key technical contact for both internal development teams and external customer stakeholders.
- Deliver technical training, documentation, and best practices to customers and internal teams.
- Assist with general IT support as needed.
Technical Support & Troubleshooting
- Serve as the main technical contact for AMP in the U.K., supporting both internal teams and key customer stakeholders.
- Diagnose and resolve technical issues across hardware, software, and networks.
- Collaborate globally and escalate complex product issues as needed.
Customer & Stakeholder Collaboration
- Build strong relationships with customer IT and operations teams.
- Deliver training, documentation, and best practices.
- Participate in operational reviews and performance reporting.
AWS Services
- EC2 (AMI creation, launch templates/configurations, auto-scaling groups, spot instances/fleets, Systems Manager, user data, instance type optimization)
- Load Balancers (ALB/NLB, target groups, listener rules, health checks, SSL termination, WAF integration)
- Security Groups, NACLs, VPC design, AWS Shield, WAF rules, and network security best practices
- Lambda (runtime environments, layers, concurrency, provisioning, EventBridge/Step Functions triggers and orchestration)
- ElastiCache (Redis and Valkey - cluster mode, replication groups, backup/restore, parameter tuning, node types, migration paths from Redis)
- RDS/Aurora (MySQL/MariaDB/PostgreSQL - multi-AZ deployments, read replicas, Performance Insights, automated backups, parameter groups, major/minor version upgrades)
- CloudWatch (metrics, alarms, logs, Events, Synthetics canaries, Contributor Insights, CloudWatch Logs Insights, dashboards, and integration with SNS/Slack for alerting)
Requirements
Required
- Education: Bachelor's degree in Computer Science, Information Technology, Engineering, or equivalent related experience.
- Experience: 4-6 years of experience in systems engineering, web administration, or a related IT operations role. Experience managing multi-site digital signage platforms (AMP, Scala, Broadsign, BrightSign) is preferred.
- Server Technology: Proficiency in configuring and managing web server software such as Apache and Nginx.
- Databases & Caching: Expertise in setting up and maintaining scalable PostgreSQL/MySQL/MariaDB databases and Redis/Valkey cache clusters.
Platforms & OS:
- Extensive hands-on Linux systems administration experience with Ubuntu (20.04/22.04/24.04) and Red Hat Enterprise Linux, including package management (apt/yum), service configuration (systemd), security hardening (SELinux/AppArmor, firewall/iptables/ufw, SSH hardening), log management, process monitoring, and performance tuning.
- Deep expertise in deploying, configuring, securing, and optimizing web server stacks and application runtimes (Nginx, Apache, PHP-FPM, Node.js, Python, Java/Tomcat, etc.) on Linux.
- Experience managing multi-site digital platforms (e.g., digital signage, CMS) and proficiency in both Linux and Windows environments.
AWS Expertise (Mandatory - Production-Level, Hands-On Experience Required)
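Extensive, in-depth knowledge and daily operational experience with the AWS services listed under Job description above: EC2, load balancers and network security, Lambda, ElastiCache, RDS/Aurora, and CloudWatch.
- Familiarity with media player hardware and displays.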
- Scripting skills (Bash mandatory; PowerShell and Python strongly preferred).
Soft Skills:
Excellent troubleshooting and problem-solving abilities with strong communication skills for both technical and non-technical audiences.
Preferred
- AWS certifications (SysOps Administrator, DevOps Engineer Professional, Solutions Architect Professional).
- Background in retail technology or Quick Service Restaurant environments.
- Knowledge of digital media formats and content delivery workflows.
- Experience with audience measurement or sensor technology.