Senior Database Administrator (DBA) - MemSQL / SingleStore

Qode LLC
Jackson Township, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Jackson Township, United States of America

Tech stack

Amazon Web Services (AWS)
Azure
Bash
Cloud Computing
Databases
Data Distribution Service
Database Security
Shard (Database Architecture)
Linux
Disaster Recovery
Distributed Data Store
Python
Node.js
Performance Tuning
Prometheus
Datadog
Scripting (Bash/Python/Go/Ruby)
System Availability
Grafana
Kubernetes
Splunk
MemSQL

Job description

End-to-end ownership of large MemSQL/SingleStore clusters (design, build, upgrade, operate, decommission).

Architect and maintain High Availability (HA) and Disaster Recovery (DR) setups including:

  • Redundancy levels
  • Availability groups
  • Cross-region replication

Plan and execute:

  • Cluster expansion
  • Downsizing
  • Online partition rebalancing
  • Leaf node management with minimal/no downtime

Proactively monitor cluster health, throughput, latency, and capacity; define and maintain SLAs.

Perform advanced performance tuning:

  • Schema design
  • Shard key design
  • Index strategy
  • NUMA and memory tuning
  • Workload management

Implement backup/restore strategies and regularly test DR & failover.

Lead incident response and perform deep root cause analysis.

Enforce database security best practices:

  • Authentication & authorization
  • Encryption
  • Auditing
  • Network controls

Drive automation using scripting (Python/Bash) and Infrastructure as Code.

Maintain documentation, operational runbooks, and standards.

Evaluate new MemSQL/SingleStore features and lead version upgrades and migrations.

Requirements

Do you have experience in system tuning? We are seeking a Senior MemSQL / SingleStore Cluster Administrator to own and manage mission-critical, large-scale distributed database platforms. This role requires a pure Database Administrator (DBA) with deep expertise in handling petabyte-scale data, complex distributed clusters, and real-time, latency-sensitive workloads.

Core Technical Expectations

Experience handling petabytes of data ingested every 15 minutes in large-scale environments.

Strong expertise managing large MemSQL / SingleStore clusters (multi-node, multi-TB to multi-PB).

Deep understanding of data distribution across aggregators and leaf nodes.

Expertise in:

  • Partitioning and shard key strategy
  • Data skew mitigation
  • Hot partition resolution
  • Worker node and leaf node optimization

Strong table-level knowledge including:

  • Index strategy
  • Thread management
  • Connection pooling
  • Memory limits
  • Query plan optimization

Strong understanding of different MemSQL/SingleStore versions and corresponding architectural/feature changes.

10+ years of total database engineering/administration experience.

4-5+ years of deep, production-grade experience administering MemSQL/SingleStore clusters at scale.

Strong hands-on experience with:

  • Aggregators & leaf nodes
  • Licensing and memory limits
  • Cluster expansion & partition rebalancing
  • Replication & failover/failback

Proven ability to diagnose:

  • Locking issues
  • Data skew
  • Hot partitions
  • Bad execution plans

Strong Linux system tuning knowledge:

  • CPU/NUMA affinity
  • Disk & I/O optimization
  • Networking
  • ulimits & OS-level tuning

Experience with monitoring & alerting tools:

  • Prometheus / Grafana
  • Datadog
  • Splunk
  • ELK

Strong SQL expertise and scripting (Python/Bash).

Experience in Cloud/Container environments (AWS/Azure/GCP, Kubernetes) is highly preferred.

Excellent communication skills with ability to lead production calls and explain technical trade-offs clearly.
