DevOps/Site Reliability Engineer
Baanyan Software Services, Inc.
Edison, United States of America
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: $ 45K
Job location: Edison, United States of America
Tech stack
ActiveMQ
Agile Methodologies
Artificial Intelligence
Amazon Web Services (AWS)
Azure
Bash
Unix
Cloud Computing
Cloud Engineering
Computer Networks
Continuous Integration
Cursor AI
Linux
DevOps
Disaster Recovery
DNS
GitHub
Revision Control Systems
Monitoring of Systems
Identity and Access Management
Virtual Private Networks (VPN)
Python
Enterprise Messaging Systems
Windows Server
OpenShift
PowerShell
RabbitMQ
Reliability Engineering
Ansible
Prometheus
Shell Script
Vault
Datadog
Data Logging
Transport Layer Security
Google Cloud Platform
Load Balancing
GitHub Copilot
Istio
System Availability
Grafana
Kubernetes Helm Charts
Software Troubleshooting
GitLab
Git
CloudFormation
Containerization
GitLab CI/CD
Kubernetes
Infrastructure Automation Frameworks
Kafka
Linkerd (Service Mesh)
Bitbucket
CloudWatch
Terraform
Splunk
New Relic (SaaS)
GPT
Software Version Control
Dynatrace
DevSecOps
Docker
Security Orchestration, Automation & Response
ELK
Jenkins
Vulnerability Analysis
Job description
- Design, implement, and maintain scalable, highly available, and secure cloud infrastructure.
- Automate infrastructure provisioning, deployment, monitoring, and incident response processes.
- Build and manage CI/CD pipelines for application deployments across multiple environments.
- Manage containerized workloads using Docker and Kubernetes in cloud-native environments.
- Collaborate with development, QA, and security teams to improve software delivery and operational excellence.
- Monitor system performance, troubleshoot production issues, and ensure high availability and reliability.
- Implement Infrastructure as Code (IaC) using Terraform, CloudFormation, or similar tools.
- Manage logging, monitoring, alerting, and observability platforms.
- Ensure security, compliance, and best practices across cloud and infrastructure environments.
- Participate in on-call rotations and incident management activities.
- Optimize cloud infrastructure costs and improve system performance.
Requirements
- 7+ years of hands-on experience in DevOps, Site Reliability Engineering (SRE), or Cloud Engineering.
- Strong experience with AWS, Azure, or Google Cloud Platform (GCP).
- Expertise in Kubernetes and Docker containerization technologies.
- Hands-on experience with Infrastructure as Code (Terraform, CloudFormation, Ansible).
- Strong experience building and maintaining CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI/CD, or Azure DevOps.
- Experience with Linux/Unix administration and scripting (Bash, Python).
- Strong understanding of networking concepts, load balancing, DNS, SSL/TLS, VPNs, and security best practices.
- Experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, ELK Stack, Splunk, New Relic, or Dynatrace.
- Experience with source control systems such as Git and GitHub/GitLab.
- Strong troubleshooting, incident management, and root cause analysis skills.
- Experience supporting highly available, distributed production systems.
- Familiarity with Agile and DevOps methodologies.
Preferred Skills:
- Experience with service mesh technologies such as Istio or Linkerd.
- Experience with Kubernetes Operators and Helm Charts.
- Knowledge of AWS EKS, ECS, Lambda, EC2, RDS, S3, CloudWatch, IAM, and VPC.
- Experience implementing DevSecOps practices and security automation.
- Familiarity with SRE principles, including SLIs, SLOs, SLAs, error budgets, and reliability engineering.
- Experience with Kafka, RabbitMQ, or other messaging platforms.
- Hands-on experience with disaster recovery, backup strategies, and business continuity planning.
- Certifications such as AWS Certified DevOps Engineer, AWS Solutions Architect, CKA, CKAD, or Terraform Associate.
- Experience using AI-powered development and operations tools such as GitHub Copilot, Windsurf, Cursor AI, Amazon Q, ChatGPT, and AI-driven observability platforms.
Technologies:
Cloud: AWS, Azure, GCP
Containers: Docker, Kubernetes, OpenShift
CI/CD: Jenkins, GitHub Actions, GitLab CI/CD, Azure DevOps
IaC: Terraform, CloudFormation, Ansible
Monitoring: Prometheus, Grafana, Datadog, Splunk, ELK Stack, New Relic
Scripting: Python, Bash, PowerShell
Source Control: Git, GitHub, GitLab, Bitbucket
Operating Systems: Linux, Unix, Windows Server
Messaging: Kafka, RabbitMQ, ActiveMQ
Security: IAM, Vault, DevSecOps, Security Scanning Tools