Accumulo Database Engineer SME
Job description
P3S Corporation is seeking a highly skilled Senior Accumulo Database Engineer / Subject Matter Expert (SME) to support mission-critical Department of Defense and Intelligence Community systems within a TS/SCI environment. This position serves as the technical authority for Apache Accumulo architecture, administration, optimization, security, and integration across large-scale enterprise data platforms.
The ideal candidate possesses deep expertise in Apache Accumulo, Hadoop ecosystems, HDFS, distributed computing architectures, data governance, and secure data management. Candidates may come from a Database Administration, Software Engineering, Data Engineering, or Big Data Engineering background but must demonstrate expert-level Accumulo knowledge and hands-on operational experience.
This role supports high-volume ingest, secure data storage, advanced analytics, and mission systems requiring high availability, performance optimization, and compliance with Intelligence Community and Department of Defense security requirements.
Apache Accumulo Administration & Engineering
- Serve as the primary technical expert for Apache Accumulo architecture and operations.
- Design, deploy, configure, maintain, and optimize enterprise Accumulo clusters.
- Administer and troubleshoot:
  - Tablet Servers
  - Master Services
  - ZooKeeper
  - HDFS
  - Write-Ahead Logs (WAL)
  - Metadata Tables
- Manage tablet assignment, balancing, compactions, and recovery operations.
- Develop operational procedures supporting cluster health and resiliency.
Data Architecture & Modeling
- Design scalable Accumulo schemas and row-key strategies.
- Optimize:
  - Scan performance
  - Load distribution
  - Hotspot avoidance
  - Data locality
- Implement:
  - Locality Groups
  - Bloom Filters
  - Time-series partitioning strategies
  - High-ingest architectures
Security & Data Governance
- Implement and manage Accumulo Cell-Level Security.
- Design and administer:
  - Visibility Labels
  - Authorizations
  - Classification schemas
  - Access control policies
- Support compliance with:
  - RMF
  - NIST 800-53
  - DoD Cybersecurity Policies
  - Intelligence Community security requirements
- Maintain auditability, lineage tracking, and data retention controls.
Performance Optimization & Troubleshooting
- Monitor and tune Accumulo cluster performance.
- Analyze:
  - Tablet distribution
  - Cache utilization
  - Garbage Collection activity
  - Compaction queues
  - WAL performance
  - HDFS health
- Resolve performance bottlenecks impacting ingestion, querying, and storage.
- Develop monitoring dashboards and performance baselines.
Bulk Ingest & Data Pipelines
- Design and support high-volume data ingestion architectures.
- Implement:
  - Bulk Imports
  - RFiles
  - MapReduce Jobs
  - Hadoop Workflows
  - Spark Pipelines
- Develop validation strategies ensuring data integrity and consistency.
- Optimize ingest throughput while minimizing impact on query performance.
Iterator Development
- Design, implement, and maintain custom Accumulo iterators.
- Develop:
  - Aggregation Iterators
  - Filtering Iterators
  - Age-Off Iterators
  - Combiners
- Evaluate server-side versus client-side processing approaches.
- Optimize iterator stack performance.
Database Administration
- Manage enterprise databases including:
  - Apache Accumulo
  - PostgreSQL
  - MongoDB
- Implement backup and recovery strategies.
- Support disaster recovery planning and execution.
- Maintain high availability and system reliability.
System Integration
- Collaborate with software development teams utilizing:
  - Java
  - Spring Framework
  - REST APIs
  - Big Data Platforms
- Support enterprise application integration and data services.
Operations & Lifecycle Management
- Perform upgrades and migrations for:
  - Accumulo
  - Hadoop
  - ZooKeeper
  - HDFS
- Develop rolling upgrade strategies that minimize operational downtime.
- Create technical documentation, SOPs, architecture diagrams, and operational runbooks.
Requirements
Bachelor's Degree in one of the following:
- Computer Science
- Information Systems
- Software Engineering
- Data Engineering
- Computer Engineering
- Related Technical Discipline
(Additional experience may be substituted for education where permitted.)
Experience
- 8+ years of enterprise database administration or engineering experience.
- 5+ years supporting Hadoop ecosystem technologies.
- 5+ years supporting distributed data platforms.
- 3+ years of hands-on Apache Accumulo administration and engineering experience.
- Experience supporting classified DoD or Intelligence Community environments.
Required Technical Skills
Expert knowledge of:
- Apache Accumulo
- Hadoop Ecosystem
- HDFS
- ZooKeeper
- RFiles
- Tablet Architecture
- Major and Minor Compactions
- Write-Ahead Logs (WAL)
- Data Modeling
- Java
- Linux Administration
- PostgreSQL
- MongoDB
- Security Hardening
- Backup & Recovery
- Performance Tuning
Desired Qualifications
Preferred certifications include:
- CISSP
- CISM
- CASP+
- Certified Hadoop Administrator
- Cloudera Certification
- MongoDB DBA Certification
- PostgreSQL Certification
- AWS Data Analytics Specialty
- Certified Data Management Professional (CDMP)
Preferred experience:
- Intelligence Community data platforms
- GEOINT, SIGINT, or Cyber Operations environments
- Spark
- NiFi
- Elasticsearch
- Kubernetes
- Cloud-hosted Big Data architectures