Apache Druid + Trino + DevOps + Python Demand
ProCorp Systems Inc.
Sunnyvale, United States of America
4 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Job location: Sunnyvale, United States of America
Tech stack
Amazon Web Services (AWS)
Apache HTTP Server
User Authentication
Azure
Cloud Computing
DevOps
Memory Management
GitHub
Hive
Java Database Connectivity
Python
Octopus Deploy
Performance Tuning
Query Optimization
Prometheus
Data Processing
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Grafana
Kubernetes Helm Charts
CloudFormation
Containerization
Data Lake
GitLab CI
Kubernetes
Druid
Deployment Automation
Production Code
Kafka
REST
Terraform
Dynatrace
Docker
Jenkins
Job description
- Apache Druid
  - Cluster setup, configuration, and production operations
  - Real-time and batch ingestion (Kafka, streaming tasks, indexing services)
  - Segment management, compaction, retention, and query optimization
  - Troubleshooting performance and availability issues
- Trino
  - Cluster deployment and tuning for large-scale distributed queries
  - Connector configuration (Hive, Iceberg, Delta Lake, JDBC, etc.)
  - Query optimization, memory management, and workload isolation
  - Security configuration (authentication, authorization, access control)
Requirements
- Strong proficiency in Python for automation and backend services
- Writing clean, maintainable, production-grade code
- Building tooling for deployment, monitoring, and operational workflows
- Experience with REST APIs, scripting, and data processing libraries
DevOps & Platform Engineering
- Containerization & Orchestration
  - Docker image creation and optimization
  - Kubernetes deployment, scaling, and troubleshooting
  - Helm charts and Kubernetes operators (preferred)
- Infrastructure & CI/CD
  - Infrastructure as Code using Terraform, CloudFormation, or similar
  - CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, ArgoCD, etc.)
  - Blue-green and rolling deployment strategies
- Cloud Platforms
  - Hands-on experience with AWS, Google Cloud Platform, or Azure
  - Networking, storage, and compute optimization for data workloads
  - Cost monitoring and optimization
Observability & Operations
- Monitoring and alerting using Prometheus, Grafana, ELK, OpenTelemetry, or similar
- Log aggregation, metrics, and distributed tracing
- Incident management, root cause analysis, and postmortems
- Capacity planning and performance benchmarking