DevOps Engineer

Electronic Transaction Consultants, LLC
Frisco, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
Frisco, United States of America

Tech stack

Airflow
Amazon Web Services (AWS)
Google BigQuery
Software as a Service
Cloud Computing
Continuous Integration
Data Infrastructure
Data Synchronization
Data Warehousing
Software Debugging
DevOps
Disaster Recovery
Distributed Systems
GitHub
Hadoop
Monitoring of Systems
Python
Message Broker
Enterprise Messaging Systems
Octopus Deploy
PCI Data Security Standards
Prometheus
Software Engineering
Software Systems
Data Streaming
Data Processing
Load Balancing
Cloud Platform System
System Availability
Delivery Pipeline
Snowflake
Grafana
Spark
Reliability of Systems
Hybrid Cloud
CloudFormation
Data Lake
Kubernetes
Information Technology
Data Analytics
Build Tools
GraphQL
Terraform
Stream Processing
Data Pipelines
Microservices

Job description

Join our forward-thinking team building the next generation of Intelligent Transportation Systems! We're looking for a seasoned DevOps Engineer who's equally excited about cutting-edge technology and having an outsized impact on the future of transportation. Operating as a small startup within a larger company, we move fast, learn constantly, and build solutions that will scale with our growing SaaS platform. You'll work with our modern tech stack (Go, Python, NATS, Numaflow, Kubernetes, Dagster, GraphQL, Terraform) to build and maintain infrastructure that supports real-time data processing across cloud and on-premise environments. We're seeking someone who's not just technically excellent but also genuinely curious and quirky, and who thrives in an environment where your contributions directly shape the product and the team.

Responsibilities:

Core Area 1: Infrastructure Management and Automation

  • Implement and manage production-grade Kubernetes clusters using Argo CD for GitOps deployments and Terraform for Infrastructure as Code
  • Build and maintain scalable infrastructure solutions for both cloud and on-premise environments
  • Implement robust CI/CD pipelines with a focus on containerized applications
  • Manage infrastructure performance, ensuring high availability for mission-critical systems

Core Area 2: Data Infrastructure

  • Maintain our S3-based Data Lake infrastructure integrated with Dagster for scalable data orchestration
  • Manage and optimize our NATS messaging system for real-time event streaming and communication
  • Manage the infrastructure that runs our Numaflow pipelines efficiently for real-time stream processing and reliable data flow
  • Scale data infrastructure to support our growing SaaS platform and expanding customer base

Core Area 3: Edge and Hybrid Cloud Operations

  • Deploy and manage on-premise/edge computing infrastructure
  • Implement hybrid cloud solutions that seamlessly integrate edge deployments with centralized cloud infrastructure
  • Ensure reliable connectivity and data synchronization between edge nodes and central systems
  • Optimize compute resource allocation across distributed computing environments

Core Area 4: Development Collaboration

  • Work closely with our development teams to optimize containerized application deployment of Go and Python code
  • Manage and enhance our Bazel build system and evaluate complementary build tools
  • Collaborate with our small team to rapidly prototype, test, and deploy new features

Core Area 5: Security and Compliance

  • Integrate security practices into the CI/CD pipeline to ensure that all software releases meet stringent security standards
  • Maintain compliance with industry regulations (e.g., PCI DSS, GDPR) and internal security policies to ensure that sensitive data is protected
  • Stay current with security trends and emerging threats, implementing updates and patches to mitigate vulnerabilities

Core Area 6: Monitoring, Performance, and Reliability

  • Implement comprehensive monitoring solutions for distributed systems, data pipelines, and message brokers
  • Proactively identify and resolve performance bottlenecks in data processing
  • Ensure system reliability and disaster recovery capabilities across cloud and edge environments

This list of responsibilities might not cover everything you'll end up doing.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent work experience)
  • Proven experience as a DevOps Engineer or in a similar role, with at least 5 years of experience in DevOps, IT infrastructure, or software engineering
  • At least 3 years of hands-on, production experience with Kubernetes
  • Proven experience managing Data Lakes/Data Warehouses (e.g. Hadoop, Spark, Snowflake, BigQuery, etc.)
  • Experience working in a high-availability, high-performance environment, such as transportation, data analytics, or financial systems

Technical Skills:

  • Expert-level Kubernetes knowledge including cluster management, networking, storage, and security
  • Terraform proficiency for Infrastructure as Code across multiple cloud environments
  • Bazel build system experience or familiarity with large-scale build systems used in monorepo settings (Buck, Nx, Turbo, Pants, etc.)
  • Argo CD or other GitOps experience for Kubernetes deployments
  • Strong experience in identifying, diagnosing, and triaging various system issues related to performance and reliability
  • Experience with CI/CD tools such as GitHub Actions, and with GitOps methodologies
  • Knowledge of monitoring and observability tools (Grafana, Prometheus, OTEL, etc.)

Certifications:

  • AWS Certified Solutions Architect, Certified Kubernetes Administrator (CKA), or similar certifications (preferred but not required)
  • Experience with on-premise/edge deployments and hybrid cloud architectures
  • Background in transportation or other related industries deploying and debugging software
  • Knowledge of microservices architecture and container orchestration tools.
  • Familiarity with infrastructure-as-code (e.g., Terraform, AWS CloudFormation).
  • Strong understanding of networking, load balancing, and security protocols.

Benefits & conditions

  • Referral program
  • Health insurance
  • Retirement plan
  • Dental insurance
  • Bereavement leave

We offer a Total Rewards plan designed with you and your family's health and wellness in mind that includes:

  • Paid days off (i.e. vacation, sick days, bereavement leave)
  • Health and Dental plans
  • Retirement plans
  • Employee and Family Assistance Program (EFAP)
  • Employee referral program
