DevOps Engineer / Cloud & Monitoring Specialist
Job description
We are seeking a highly skilled DevOps Engineer with strong expertise in cloud technologies, monitoring tools, and CI/CD pipeline implementation. The ideal candidate will have hands-on experience with AWS and Google Cloud Platform environments, infrastructure automation using Terraform, containerization using Docker, and orchestration with Kubernetes. This role requires a proactive individual capable of managing deployments, ensuring system reliability, and collaborating with cross-functional teams in an Agile environment.
Key Responsibilities
- Requirements & Agile Collaboration
Collaborate with product owners and business analysts to understand and refine business requirements.
Convert requirements into well-defined user stories using Gherkin format.
Manage and maintain the product backlog and update it after sprints and production releases.
Participate in sprint planning, backlog grooming, and Agile ceremonies.
Identify risks, blockers, and dependencies, and communicate them to stakeholders.
- CI/CD & Deployment Management
Design, implement, and maintain CI/CD pipelines using Jenkins, GitHub Actions, and GitLab CI.
Build multi-branch pipelines using DSL and deploy applications to various environments based on parameter inputs.
Automate infrastructure provisioning using Terraform and Packer templates.
Containerize applications using Docker and deploy them on Kubernetes clusters (EKS/GKE).
Configure and manage Jenkins agents, CI runners, and deployment workflows.
Perform release management and ensure smooth deployment of critical applications.
- Cloud & Infrastructure Management
Provision, configure, and manage cloud infrastructure across AWS and Google Cloud Platform.
Work with AWS services including EC2, EKS, ELB, VPC, RDS, IAM, S3, CloudFront, Lambda, Route53, SNS, SQS, CloudWatch, and more.
Work with Google Cloud Platform services such as Compute Engine, Kubernetes Engine, App Engine, Cloud Storage, Cloud Functions, and VPC.
Manage networking, security, and system configurations across environments.
Build and maintain Terraform scripts for staging and production environments.
- Monitoring & Observability
Monitor applications and infrastructure using tools such as:
Nagios, Prometheus, Grafana
Loki Stack & Promtail
Datadog, Dynatrace
Amazon CloudWatch
ELK/EFK Stack
Server Density
Analyze logs, troubleshoot issues, and ensure high availability of systems.
Set up alerts and dashboards for proactive monitoring.
- Build, Release & System Administration
Handle build and release processes using Maven, Jenkins, and Linux/Ubuntu systems.
Manage web/application servers such as Nginx and Apache Tomcat.
Configure messaging and caching systems such as RabbitMQ and Redis.
Perform infrastructure orchestration and migration activities.
- Code Quality & Reviews
Conduct peer and self-reviews for Infrastructure-as-Code (IaC).
Ensure adherence to coding and deployment standards.
Use SonarQube for code quality checks and ensure that no critical vulnerabilities remain unresolved.
Manage version control, branching, and merging strategies using Git tools.
- Documentation
Prepare detailed deployment and technical documentation.
Maintain runbooks, SOPs, and best practice guidelines.
Assist team members in creating standard operational documentation.
Tools & Technologies
Cloud Platforms
AWS, Google Cloud Platform
Infrastructure & Automation
Terraform, Terragrunt, Packer, Boto3, Chef, Ansible, Rundeck
Containerization & Orchestration
Docker, Kubernetes, Rancher, Helm, Kustomize, Istio
CI/CD & DevOps Tools
Jenkins, GitHub Actions, GitLab CI, ArgoCD, Spinnaker, FluxCD (GitOps)
Monitoring & Logging
Nagios, Prometheus, Grafana, Loki Stack, ELK/EFK
Datadog, Dynatrace, CloudWatch, Server Density
Databases
MySQL, DynamoDB, Aurora, MongoDB, Percona
Programming & Scripting
Python (basic), Shell/Bash
Other Technologies
Redis, RabbitMQ, SonarQube, OpsGenie, SSO, Databricks
Our benefits and rewards program has been thoughtfully designed to recognize your skills and contributions, elevate your learning/upskilling experience, and provide care and support for you and your loved ones. As an Apexon Associate, you get continuous skill-based development, opportunities for career advancement, and access to comprehensive health and well-being benefits and assistance.
Requirements
Must-Have Skills
Strong experience with monitoring tools (Nagios, Prometheus, Grafana, ELK/EFK, Datadog, Dynatrace, CloudWatch).
Hands-on experience with Terraform, Kubernetes, and CI/CD pipelines.
Solid understanding of AWS and Google Cloud Platform.
Expertise in Docker and container orchestration.
Experience in Linux/Ubuntu administration.
Good to Have
Knowledge of Python scripting.
Experience with GitOps tools such as ArgoCD and FluxCD.
Familiarity with configuration management tools (Ansible, Chef).