Site Reliability Engineering

Geotab Inc.
Atlanta, United States of America
12 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Remote
Atlanta, United States of America

Tech stack

Amazon Web Services (AWS)
Apache HTTP Server
Azure
Bash
Big Data
Google BigQuery
C Sharp (Programming Language)
Cloud Computing
Configuration Management
Computer Networks
Databases
Linux
DNS
Monitoring of Systems
Hypertext Transfer Protocols (HTTP)
Python
PostgreSQL
Log Analysis
Octopus Deploy
PowerShell
Reliability Engineering
Ansible
Prometheus
TCP/IP
Scripting (Bash/Python/Go/Ruby)
Application Enhancement Tool
Load Balancing
System Availability
Grafana
Kubernetes
Gsuite
Microservices

Job description

  • Act as a primary escalation point for critical production application/product issues.
  • Rapidly troubleshoot complex problems across the application stack, utilizing observability tools to identify root causes.
  • Coordinate effectively with development, infrastructure, and other technical teams during incidents to implement fixes and restore service swiftly.
  • Clearly communicate incident status, impact, and resolution steps to internal stakeholders.
  • Collaborate with team members to improve monitoring tools, dashboards, and alerting mechanisms for proactive detection of issues impacting Critical User Journeys (CUJs) within the application/product and computing architecture. Our complex environment encompasses monolithic applications, microservices, and a vast ecosystem of millions of hardware units.
  • Monitor application/product and system health proactively using a combination of tools to ensure high availability and adherence to Service Level Objectives (SLOs) / Service Level Agreements (SLAs).
  • Identify opportunities and implement automation tools/scripts to streamline routine operational tasks, reduce manual effort (toil), and improve response times.
  • Conduct system tests to validate performance, reliability, and successful remediation of issues.
  • Recommend design and process enhancements based on operational experience to improve overall application reliability and maintainability.
  • Participate in post major incident reviews (PMIRs) to analyze disruptions, document findings, track corrective actions to prevent recurrence, and identify areas of improvement for incident response processes.
  • Contribute to building a culture of learning from incidents.
  • Participate in a 24x7 on-call rotation to provide timely support for critical issues outside of business hours.

Requirements

  • 3-5 years of experience in SRE/DevOps/Tier 3.
  • Strong troubleshooting skills with a systematic problem-solving approach.
  • Extensive experience resolving critical incidents in production environments.
  • Strong proficiency in Linux and operational scripting (Bash, PowerShell, Python).
  • Experience with database/dataset querying (GoogleSQL, PostgreSQL, BigData), automated configuration management (via tools like Ansible), and GitOps tools (Argo CD).
  • Experience with data visualization platforms (e.g., Apache Superset/BigQuery Visualizations).
  • Familiarity with cloud platforms (GCP/Azure/AWS), container orchestration (Kubernetes), and monitoring/alerting systems (e.g., Prometheus stack including AlertManager/Grafana).
  • Understanding of application environments (e.g., .NET/C#) for troubleshooting purposes.
  • Understanding of fundamental networking concepts (TCP/IP, HTTP, DNS, Load Balancing) is considered an asset.
  • Familiarity with applying AI-powered tools to enhance operational efficiency in areas such as log analysis, troubleshooting assistance, incident summarization, and automation scripting.
  • Demonstrated ability to work well under pressure and manage multiple tasks and projects simultaneously.
  • Experience with incident management processes.
  • Experience working within a technical or engineering organization with knowledge of the high-technology industry is considered an asset.
  • Excellent verbal and written communication skills.
  • Strong analytical skills with the ability to solve problems and make well-judged decisions.
  • Strong team player with the ability to engage with all levels of the organization.
  • Technical competence using software programs, including but not limited to Google Suite for business (Sheets, Docs, Slides) or equivalents.
  • Entrepreneurial mindset and comfortable in a flat organization.
  • To be eligible, candidates must have continuously resided in the continental United States for at least three years immediately preceding their application. Successful applicants will be required to provide verifiable documentation of continuous lawful residency. Some exceptions may apply to US citizens.
  • Ability to pass an enhanced background check, including a drug screening test (if applicable) and a credit check.

Benefits & conditions

  • Flex working arrangements
  • Home office reimbursement program
  • Baby bonus & parental leave top up program
  • Online learning and networking opportunities
  • Electric vehicle purchase incentive program
  • Competitive medical and dental benefits
  • Retirement savings program

  • The above are offered to full-time permanent employees only

About the company

Geotab® is a global leader in IoT and connected transportation and certified "Great Place to Work." We are a company of diverse and talented individuals who work together to help businesses grow and succeed, and increase the safety and sustainability of our communities. Geotab is advancing security, connecting commercial vehicles to the internet and providing web-based analytics to help customers better manage their fleets. Geotab's open platform and Geotab Marketplace®, offering hundreds of third-party solution options, allow both small and large businesses to automate operations by integrating vehicle data with their other data assets. Processing billions of data points a day, Geotab leverages data analytics and machine learning to improve productivity, optimize fleets through the reduction of fuel consumption, enhance driver safety and achieve strong compliance with regulatory changes.

Our team is growing and we're looking for people who follow their passion, think differently and want to make an impact. Ours is a fast-paced, ever-changing environment. Geotabbers accept that challenge and are willing to take on new tasks and activities, including ones that may not always be described in the initial job description. Join us for a fulfilling career with opportunities to innovate, great benefits, and our fun and inclusive work culture. Reach your full potential with Geotab. To see what it's like to be a Geotabber, check out our blog and follow us @InsideGeotab on Instagram. Join our talent network to learn more about job opportunities and company news.

At Geotab, we have adopted a flexible hybrid working model in that we have systems, functions, programs and policies in place to support both in-person and virtual work. However, you are welcomed and encouraged to come into our beautiful, safe, clean offices as often as you like. When working from home, you are required to have a reliable internet connection with at least 50mb DL/10mb UL. Virtual work is supported with cloud-based applications, collaboration tools and asynchronous working. The health and safety of employees are a top priority. We encourage work-life balance and keep the Geotab culture going strong with online social events, chat rooms and gatherings. Join us and help reshape the future of technology!

Geotab verifies candidates' eligibility to work in the United States through E-Verify, an internet-based system operated by U.S. Citizenship and Immigration Services.

... employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training. Geotab expressly prohibits any form of workplace harassment or discrimination based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. Improper interference with the ability of Geotab's employees to perform their job duties may result in discipline up to and including discharge. If you would like more information about our EEO program or wish to file a complaint, please contact our EEO officer, Klaus Boeckers, at HRCompliance@geotab.com. For more details, view a copy of the EEOC's Know Your Rights poster.

By submitting a job application to Geotab Inc. or its affiliates and subsidiaries (collectively, "Geotab"), you acknowledge Geotab's collection, use and disclosure of your personal data in accordance with our
