Senior Site Reliability Engineer
Job description
Duties: Assist in creating simple, modular, extensible, and functional design for the product/solution in adherence to the requirements. Evaluate trade-offs while designing across multiple components in a product based on business requirements. Convert high-level design (HLD) into detailed design, using mock screens, pseudocode, and detailed functional logic, for specific modules and components of a product/system. Understand nuances of designing for disaster recovery. Design and create a minimum viable product (MVP) to clarify requirements and design and to uncover risks. Refine the MVP design for early defects and revised customer requirements. Undertake infrastructure coding automation. Adhere to all relevant coding guidelines while writing/configuring code. Create/configure minimalistic (less complex, highly robust, and high quality) code for a component/module under guidance. Maintain records by documenting program development and revisions. Stay updated on the prevalent coding languages and frameworks in the industry outside the immediate scope of delivery. Identify repetitive and routine tasks in Continuous Integration/Continuous Delivery (CI/CD), testing, or any other process that can be automated. Implement telemetry features as required under guidance. Apply security policy requirements to component/module during code development/configuration. Detect and document defects, bugs, and errors for assigned component/module and conduct analysis to determine the sources under guidance. Troubleshoot performance and availability bottlenecks for assigned application under guidance. Work with business partners to identify and document critical applications. Interpret and follow procedures in contingency plans. Explain the contingency and disaster recovery plans for assigned environment. Execute established procedures necessary to continue operations in an emergency. Participate in the design of a minimum operating environment for a computer-based facility.
Utilize established criteria (for example, probability of failure, frequency of failure) to measure site reliability. Monitor site reliability conditions and new reliability requirements. Assist in the design and development of a reliability program plan for a specific site environment. Apply appropriate tools, services, or applications for reliability prediction and other site improvements. Research and assess various reliability models for different site environments. Suggest metrics to monitor software or system performance. Monitor current performance data to ensure compliance with defined SLOs for multiple applications/systems. Determine thresholds for monitoring metrics and trigger alerts based on thresholds. Help with specific procedures to proactively check the health of applications and infrastructure, including a variety of operating systems, hardware, and software. Make recommendations regarding situational awareness and alerting. Make recommendations regarding instrumentation gaps and alerting logic, including a variety of operating systems, hardware, and software.
Requirements
Minimum education and experience required: Master's degree or the equivalent in Computer Science, Computer Engineering, Computer Information Systems, Software Engineering, Electrical Engineering, or related area and 1 year of experience in site reliability engineering, site and system administration, infrastructure management, or related area; OR Bachelor's degree or the equivalent in Computer Science, Computer Engineering, Computer Information Systems, Software Engineering, Electrical Engineering, or related area and 3 years of experience in site reliability engineering, site and system administration, infrastructure management, or related area.
Skills required: Experience with the management and orchestration of Kubernetes clusters with Helm charts. Experience with networking solutions including VPN systems, firewall technologies, and storage systems. Experience building scalable monitoring and observability systems using CloudWatch, PRTG, Grafana, and PagerDuty. Experience with server management in AWS with orchestration tools, including Ansible, Puppet, and Terraform. Experience managing DNS and SSL certificates in AWS. Experience managing enterprise workloads in an AWS infrastructure. Experience building CI/CD pipelines using GitHub Actions, CodeBuild, CodePipeline, and CircleCI. Experience managing RDBMS including PostgreSQL and MSSQL Server and non-RDBMS including Redshift and MongoDB. Experience writing unit and integration tests. Experience with tool development, including scripting with Bash and high-level languages: Python and TypeScript. Employer will accept any amount of experience with the required skills.
Benefits & conditions
Salary Range: $112,923/year to $180,000/year. Additional compensation includes annual or quarterly performance incentives.
Benefits: At Walmart, we offer competitive pay as well as performance-based incentive awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision, and dental coverage. Financial benefits include 401(k), stock purchase, and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting. Other benefits include short-term and long-term disability, education assistance with 100% company-paid college degrees, company discounts, military service pay, adoption expense reimbursement, and more.
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms. For information about benefits and eligibility, see One.Walmart.com.
Wal-Mart is an Equal Opportunity Employer.
Walmart and its subsidiaries are committed to maintaining a drug-free workplace and have a no-tolerance policy regarding the use of illegal drugs and alcohol on the job. This policy applies to all employees and aims to create a safe and productive work environment.
Benefits you'll enjoy
Discount Card
Get 10% off
Walmart associates are eligible for a 10% discount card on most regular-priced items and fresh produce in-store and on select items at Walmart.com. Eligible after 90 days of employment.
Live Better U
100% covered
Earn a degree or in-demand skills certificates with no debt: Walmart covers 100% of tuition and books. Live Better U offers 60+ programs for Associates to pursue their dreams.
Walmart Academy
Grow your skills
Ready to grow your career? Walmart Academy offers job-specific retail training and leadership courses to help Associates reach their career goals.
Financial perks
Enjoy 401(k) matching and stock purchase plans.
Paid time off
Take a break as needed for vacation, sick leave, holidays, parental leave, and more.
Comprehensive health benefits
Medical, dental, vision, and wellness programs for you and your family.
Wellbeing programs
Access mental health resources and assistance programs for life's challenges.
Career growth opportunities
Training, leadership programs, and clear paths to advance.