Systems Engineer Platform - Messaging Platform

O'Reilly Automotive, Inc.
Springfield, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Springfield, United States of America

Tech stack

API
Agile Methodologies
Apache HTTP Server
Systems Engineering
Confluence
JIRA
Audit Trail
Authentication Protocols
Cloud Computing
Information Systems
System Configuration
Continuous Integration
Serialization
DevOps
Disaster Recovery
Distributed Systems
Elasticsearch
Data Flow Control
GitHub
Protocol Buffers
IBM WebSphere MQ
Identity and Access Management
Virtual Private Networks (VPN)
Java Database Connectivity
JSON
Kerberos (Protocol)
Lightweight Directory Access Protocol (LDAP)
Enterprise Messaging Systems
Metadata
PCI Data Security Standards
RabbitMQ
Role-Based Access Control
Cloud Services
Ansible
Prometheus
Message Oriented Middleware
Secure Messaging
Single Sign-On
Software Engineering
Data Streaming
Apache Zookeeper
Network Switches
Google Cloud Platform
Data Classification
Istio
System Availability
Grafana
Hybrid Cloud
Amazon Web Services (AWS)
Git
Event Driven Architecture
Debezium
Kubernetes
Information Technology
Collibra
Low Latency
Apache Flink
Deployment Automation
Avro
Real Time Data
Kafka
Linkerd (Service Mesh)
Spark Streaming
Asynchronous Programming
Terraform
Stream Processing
Data Pipelines
Docker
ELK
Legacy Systems
Jenkins
Confluent
Microservices

Job description

The Sr Systems Engineer Platform - Messaging Platform will play a key role in designing, implementing, and maintaining enterprise messaging systems that support both real-time and asynchronous communication between distributed applications and services. This role focuses on ensuring robust, scalable, secure, and cost-efficient messaging solutions across hybrid cloud and on-premises environments. The engineer will work closely with architects, developers, infrastructure, and DevOps teams to standardize messaging platforms and promote modern integration patterns that align with the organization's digital transformation goals.

  • Design, implement, and support scalable messaging platforms including Apache Kafka (Confluent/OSS), Google Cloud Pub/Sub, and legacy systems such as IBM MQ.
  • Build high-throughput, low-latency event streaming pipelines that support mission-critical workloads, leveraging Kafka brokers, topics, partitions, and consumer groups.
  • Define and enforce schema governance using tools like Confluent Schema Registry or Apicurio; enforce consistent serialization formats (e.g., Avro, JSON, Protobuf).
  • Standardize topic taxonomy and hierarchy across business domains, enforce naming conventions, and implement lifecycle management practices for topics and subscriptions.
  • Manage Kafka Connect connectors (source/sink), ksqlDB flows, and stream processing topologies using Kafka Streams.
  • Define and implement platform-wide policies for message retention, compaction, ACLs, multi-tenancy isolation, and access control.
  • Ensure reliable message delivery through producer retries, dead-letter queues, idempotency handling, and exactly-once semantics where applicable.
  • Automate deployment and configuration of messaging infrastructure using Terraform, Helm, Ansible, and Kubernetes operators (e.g., Strimzi for Kafka).
  • Maintain Git-based configuration-as-code repositories to drive consistency and auditability across environments.
  • Develop CI/CD pipelines to support promotion of configuration artifacts, rolling upgrades of messaging clusters, and dynamic provisioning of topics and consumer policies.
  • Implement proactive drift detection, self-healing scripts, and platform bootstrapping workflows.
  • Integrate messaging platforms with enterprise IAM (e.g., Google Cloud Platform IAM, LDAP, Kerberos, or RBAC for Confluent/Kafka).
  • Implement encryption in transit and at rest, and secure client access, using TLS, mTLS, SASL/SCRAM, and Kafka-level ACLs.
  • Implement observability dashboards and alerts using Prometheus, Grafana, Confluent Control Center, ELK Stack, or Google Cloud Operations Suite.
  • Establish performance baselines and configure SLA/SLO-based monitoring for producers, consumers, brokers, and ZooKeeper.
  • Partner with development teams to onboard producer and consumer applications through reusable connectors, API-to-event bridges, and SDKs.
  • Provide technical guidance and documentation on best practices for schema evolution, idempotent messaging, and replayable data streams.
  • Lead efforts to decommission legacy messaging platforms (e.g., MQSeries, RabbitMQ) and consolidate onto modern event streaming technologies.
  • Evaluate emerging messaging technologies (e.g., Pulsar, Redpanda) for specific workloads or cost optimization opportunities.

Requirements

Required:

  • 5+ years of hands-on experience with enterprise-grade messaging platforms, including Apache Kafka (Confluent/OSS), Google Cloud Pub/Sub, or equivalent.
  • Deep expertise in Kafka architecture including broker management, partitioning, replication, log retention, and consumer group coordination.
  • Proficiency in schema evolution and contract validation using Confluent Schema Registry, Apicurio, or similar tools.
  • Demonstrated experience with Kafka Connect, Kafka Streams, and stream processing applications for real-time data movement.
  • Strong knowledge of topic hierarchy management, naming conventions, access control (ACLs), and governance policies.
  • Experience designing and deploying messaging platforms for mission-critical, high-throughput, low-latency event-driven systems.
  • Experience deploying and managing messaging workloads on Google Cloud Platform using Google Cloud Pub/Sub, Eventarc, or custom solutions.
  • Familiar with hybrid architectures including secure messaging between on-prem and cloud workloads via VPN/Interconnect.
  • Hands-on implementation of Google Cloud Platform IAM, VPC Service Controls, and secure endpoint management for messaging components.
  • Understanding of regional and multi-region messaging designs for high availability and disaster recovery.
  • Skilled in automating infrastructure provisioning and configuration using Terraform, Helm, and Kubernetes (e.g., Strimzi operators).
  • CI/CD experience with GitHub Actions, Jenkins, or Google Cloud Build for promoting messaging configurations and application artifacts.
  • Strong understanding of data protection practices including TLS/mTLS encryption, SASL/SCRAM, and token-based authentication.
  • Integration with enterprise IAM providers (e.g., Google Cloud Platform IAM, LDAP, SSO, RBAC) and audit logging tools.
  • Familiarity with industry regulations such as PCI-DSS, SOC2, HIPAA, and practices such as data classification and lineage.
  • Experience building reusable onboarding assets, SDKs, and best practice guides for application teams.
  • Development of shared connectors, bridges (e.g., API-to-Kafka), and schema-first design pipelines.
  • Exposure to alternative messaging systems like Apache Pulsar, Redpanda, RabbitMQ, and NATS.

Desired:

  • Experience deploying and managing Apache Kafka on Kubernetes (e.g., using Strimzi or Confluent Operator), including custom broker configurations and scaling strategies.
  • Familiarity with containerization using Docker and orchestration using GKE or other Kubernetes-based platforms.
  • Exposure to multi-tenant Kafka architectures with namespace isolation, quota enforcement, and security boundaries.
  • Strong understanding of event-driven architecture (EDA), microservices communication patterns, and asynchronous messaging design.
  • Hands-on experience with real-time data processing frameworks like Apache Flink, Kafka Streams, or Spark Structured Streaming.
  • Experience configuring and tuning Kafka Connect connectors (e.g., Debezium, JDBC, Elasticsearch) for data pipeline integration.
  • Proficiency in schema evolution practices using Avro, Protobuf, and tools like Confluent Schema Registry or Apicurio.
  • Familiarity with service mesh technologies (e.g., Istio, Linkerd) and secure east-west traffic routing within cloud-native messaging environments.
  • Knowledge of enterprise messaging migration strategies from legacy platforms such as IBM MQ or RabbitMQ to cloud-native solutions.
  • Experience using Google Cloud Pub/Sub and Dataflow for scalable event ingestion, transformation, and streaming analytics.
  • Knowledge of access control and authentication mechanisms such as Kafka ACLs, SASL/SCRAM, Kerberos, or Google Cloud Platform IAM.
  • Awareness of message delivery semantics (at-least-once, exactly-once) and how to implement idempotency in distributed systems.
  • Hands-on exposure to observability stacks such as Prometheus, Grafana, Confluent Control Center, and Google Cloud Platform Monitoring for tracking message flow, latency, and system health.
  • Experience building and maintaining automated CI/CD pipelines for Kafka topic provisioning, ACL management, and connector deployment.
  • Familiarity with metadata and governance platforms such as Dataplex, Google Data Catalog, or Collibra in the context of messaging assets.
  • Industry experience in domains such as retail, supply chain, or financial services with large-scale streaming use cases.
  • Certifications related to Apache Kafka (e.g., Confluent Certified Developer/Administrator) or Google Cloud (e.g., Professional Cloud Developer).
  • Understanding of compliance standards (e.g., PCI-DSS, ISO 27001) as they relate to secure messaging.
  • Comfortable working in agile delivery models with tools like JIRA, Confluence, and participating in sprints and architecture reviews.
  • Ability to lead design discussions, provide technical mentoring, and promote platform adoption across enterprise teams.
  • Bachelor's Degree in Computer Science, Software Engineering, Information Systems, or equivalent practical experience.

O'Reilly Auto Parts has a proven track record of growth and stability. O'Reilly is full of successful career stories and believes in a strong promote-from-within philosophy, encouraging you to grow your career along with the organization.

Benefits & conditions

Total Compensation Package:

  • Competitive Wages & Paid Time Off
  • Stock Purchase Plan & 401k with Employer Contributions Starting Day One
  • Medical, Dental, & Vision Insurance with Optional Flexible Spending Account (FSA)
  • Team Member Health/Wellbeing Programs
  • Tuition Educational Assistance Programs
  • Opportunities for Career Growth
