Data Center/Cloud and Automation SME
Role details
Job location
Tech stack
Job description
-
Provision, configure, and maintain AWS infrastructure, ensuring availability, security, and scalability.
-
Develop and maintain Infrastructure as Code using Terraform for consistent cloud deployments.
-
Automate system configuration, deployments, and operational tasks using Ansible.
-
Monitor systems and cloud resources; handle incidents, changes, and service requests.
-
Perform root cause analysis and implement preventive actions.
-
Enforce security best practices across OS and cloud environments.
-
Collaborate with cross-functional teams to support application deployments and platform stability.
-
Maintain documentation, SOPs, and runbooks.
-
Manage code and version control through GitHub.
-
Continuously improve automation, reliability, and operational efficiency.
Roles & Responsibilities
-
Manage and support Linux servers including installation, patching, user management, performance monitoring, and troubleshooting.
-
Design, provision, and maintain AWS infrastructure (EC2, Certificate Manager, Security Groups, VPC, IAM, S3, Load Balancers), ensuring security, availability, and cost efficiency.
-
Implement Infrastructure as Code using Terraform to build, update, and manage AWS resources in a consistent and reusable manner.
-
Automate system configuration, deployments, and patching using Ansible playbooks and roles.
-
Manage code version control through GitHub.
-
Monitor systems and cloud resources, respond to incidents, and perform root cause analysis.
-
Follow security best practices, access controls, and compliance requirements across OS and cloud platforms.
-
Collaborate with application, network, and security teams to support deployments and changes.
-
Maintain documentation, SOPs, and continuously improve automation and operational efficiency.
Requirements
Do you have experience in Windows support?
-
Linux Administration: Strong hands-on experience with RHEL / Amazon Linux / Ubuntu, including user management, patching, troubleshooting, shell scripting, and performance monitoring.
-
AWS Cloud: Practical knowledge of core AWS services such as EC2, VPC, Subnets, Certificate Manager, IAM, S3, Load Balancers, Auto Scaling, and CloudWatch, with security and cost optimization awareness.
-
Infrastructure as Code (Terraform): Experience in writing and managing Terraform modules, variables, and state files to provision and maintain AWS infrastructure.
-
Configuration Management (Ansible): Ability to create and manage Ansible playbooks and roles for OS configuration, automation, and deployment tasks.
-
Version Control: Working knowledge of Git for code versioning and collaboration.
-
Understanding of networking fundamentals.
-
Automation & Scripting: Proficiency in Bash scripting and automation of repetitive operational tasks.
-
Monitoring & Troubleshooting: Experience in system and cloud monitoring, incident handling, and root cause analysis.
-
Security & Compliance: Understanding of access control, encryption, patch management, and secure configuration practices.
-
Manage and support Linux servers, including installation, patching, monitoring, and troubleshooting.
-
Primary Skills: Linux, AWS, Terraform & Ansible, Automation
-
Secondary Skills: Citrix, Nutanix, and VMware
Generic Managerial Skills:
-
Windows Administration, Citrix - StoreFront, Delivery Controllers, NetScaler, PAM, XenApp Administration.
-
Experience with Nutanix and VMware.
-
CI/CD Tools: Exposure to Jenkins, GitHub Actions, GitLab CI, or similar pipelines.
-
Containers & Orchestration: Basic knowledge of Docker and Kubernetes.
-
Advanced AWS Services: Familiarity with RDS, Lambda, DynamoDB, Route 53, or EKS.
-
Monitoring & Logging Tools: Experience with tools like Prometheus, Grafana, ELK stack, or CloudTrail.
-
Ansible AWX / Tower: Experience with Ansible Tower or AWX for centralized automation.
-
Terraform Advanced Usage: Exposure to multi-account setups, workspaces, and complex module design.
-
Security Practices: Knowledge of vulnerability scanning, security audits, and compliance requirements.
-
ITIL / ITSM Awareness: Experience working with incident, problem, and change management processes.
-
Scripting Languages: Basic knowledge of Python for automation tasks.
-
Documentation & Knowledge Sharing: Ability to create SOPs, runbooks, and technical documentation.