Data Center / Cloud & Automation SME (Systems Engineer - L3 Infrastructure Support)
Job description
We are seeking a highly experienced Data Center / Cloud & Automation SME to provide L3 infrastructure engineering and operational support across enterprise data center and cloud environments. The ideal candidate will have deep hands-on expertise in Linux administration, AWS cloud engineering, Terraform, Ansible automation, infrastructure operations, incident management, and enterprise platform reliability.
Linux Infrastructure Administration
Manage enterprise Linux infrastructure including:
- RHEL
- Amazon Linux
- Ubuntu
Responsibilities:
- Server installation and provisioning
- Patching and lifecycle management
- User administration
- Performance monitoring
- Troubleshooting and issue remediation
- Shell scripting automation
- System hardening
AWS Cloud Engineering
Design, provision, and maintain AWS infrastructure including:
- EC2
- VPC
- Subnets
- Security Groups
- IAM
- S3
- Load Balancers
- Auto Scaling
- Certificate Manager
- CloudWatch
Responsibilities:
- Secure cloud architecture
- Availability engineering
- Cost optimization
- Infrastructure scalability
- Cloud troubleshooting
- AWS operational governance
- AWS provisioning automation
- Standardized infrastructure deployments
- Multi-environment cloud consistency
- IaC governance
Configuration Management & Automation
Develop automation solutions using:
- Ansible
- Playbooks
- Roles
- OS automation
- Deployment orchestration
Responsibilities:
- Patch automation
- Configuration management
- Repetitive task automation
- Operational efficiency improvement
Version Control & DevOps Practices
Manage infrastructure code using:
- GitHub
- Git workflows
- Version control best practices
Preferred CI/CD exposure:
- Jenkins
- GitHub Actions
- GitLab CI
Monitoring, Incident Response & Troubleshooting
- Monitor systems, cloud resources, and infrastructure health.
- Handle:
  - Incident management
  - Root cause analysis
  - Service restoration
  - Preventive action planning
- Troubleshoot:
  - Linux platform issues
  - AWS infrastructure problems
  - Automation failures
  - Performance bottlenecks
Security & Compliance
Enforce security best practices including:
- Access controls
- IAM governance
- Encryption
- Patch management
- Secure OS configuration
- Compliance adherence
Cross-Functional Infrastructure Support
Collaborate with:
- Application teams
- Network teams
- Security teams
- Infrastructure operations
- Platform engineering teams
Responsibilities:
- Deployment support
- Change implementation
- Platform stability improvements
Documentation & Operational Governance
Create and maintain:
- SOPs
- Runbooks
- Infrastructure documentation
- Knowledge transfer materials
- Operational standards
Additional skills and technologies:
- Windows Administration
- Citrix
- StoreFront
- Delivery Controllers
- NetScaler
- XenApp
- PAM
- Nutanix
- VMware
- Jenkins
- GitHub Actions
- GitLab CI
- Docker
- Kubernetes
- Python
- Prometheus
- Grafana
- ELK Stack
- CloudTrail
- AWX / Ansible Tower
- RDS
- Lambda
- DynamoDB
- Route53
- EKS
Requirements
- Linux Administration
- AWS Cloud
- Terraform
- Ansible
- Automation Engineering
- Bash Scripting
- Infrastructure Monitoring
- Incident Management
- Root Cause Analysis
- Git / GitHub
- Security Hardening
- Infrastructure Operations
10+ years of total infrastructure / cloud engineering experience, including:
- Enterprise Linux administration
- AWS infrastructure support
- Infrastructure automation
- L3 production support
- Cloud operations engineering