Site Reliability Engineer
Tech stack
MySQL, CI/CD, MongoDB, Jenkins, OpenShift, Dynatrace, Operations, Consulting, Management, Automation, Innovation, Resilience, Kubernetes, Reliability, Spring Boot, Web Services, Apache Kafka, Systems Design, Operating Systems, Agile Methodology, Docker (Software), Performance Tuning, Business Valuation, Software Solutions, Sustainable Systems, Full Stack Development, Operational Excellence, IT Capacity Management, Artificial Intelligence, Development Environment, Business Transformation, Internet Protocols Suite, Service Level Objectives, SQL (Programming Language), Java (Programming Language), Node.js (Javascript Library), Git (Version Control System), Site Reliability Engineering, React.js (Javascript Library), Troubleshooting (Problem Solving), Transmission Control Protocol (TCP), Simple Object Access Protocol (SOAP)
Job description
As a Retail Site Reliability Engineer (SRE) within Client's Site Reliability Center, you will combine your software and systems expertise to manage applications and create innovative, automated solutions to simplify operations, eliminate toil, and increase the reliability and availability of our critical applications and business services.
Objective
- Run the production environment by monitoring availability and taking a holistic view of system health
- Build software and systems to manage platform infrastructure and applications
- Improve reliability, quality, and time-to-market of our suite of software solutions
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement
- Provide primary operational support and engineering for multiple large-scale distributed software applications
- Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git and Jenkins
- Strong working knowledge of Internet protocols such as HTTP, TCP/UDP
Responsibilities of this position
- Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplifts
- Balance feature development speed and reliability with well-defined service-level objectives
- Partner with technology teams across the enterprise to establish SRE best practices and automated solutions with a focus on operational excellence
- Identify opportunities to evangelize adoption for greater self-healing and resiliency patterns
- Troubleshoot priority incidents and participate in blameless post-mortems
- Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
- Participate in 24x7 on-call rotations and escalation workflows

Use of Artificial Intelligence (AI): We may use Artificial Intelligence (AI) to support parts of our hiring process, including sourcing, screening, and evaluating candidates. AI helps assess applications and qualifications, but final decisions are made by our hiring team. By applying, you acknowledge and agree that your application may be reviewed using AI tools.
Requirements
Python, Java, SQL, Oracle
Top Skills Details
Python, Java, SQL, Oracle
Additional Skills & Qualifications
Candidates should possess experience with and detailed knowledge of the following:
- Engineering and support of micro app/service architectures
- Working knowledge of web services technologies such as SOAP, JSON and REST
- Application platforms such as OpenShift, Docker and Kubernetes
- Development frameworks such as React, Node.js and Spring Boot
- Expertise in database technologies including SQL Server, MySQL, Oracle and MongoDB
- Understanding of distributed tracing and monitoring tools such as Dynatrace, Jaeger and Humio
- Experience with Kafka event streaming
Benefits & conditions
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:
- Medical, dental & vision
- Critical Illness, Accident, and Hospital
- 401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
- Life Insurance (Voluntary Life & AD&D for the employee and dependents)
- Short and long-term disability
- Health Savings Account (HSA)
- Transportation benefits
- Employee Assistance Program
- Time Off/Leave (PTO, Vacation or Sick Leave)