Data Engineer / Apache Cassandra Administrator
PDF Solutions, Inc.
Dallas, United States of America
10 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: $129K
Job location: Dallas, United States of America
Tech stack
Java
Amazon Web Services (AWS)
Business Analytics Applications
Azure
Bash
Big Data
Command-Line Interface
Cloud Computing
Databases
Continuous Integration
DevOps
Disaster Recovery
Perl
Monitoring of Systems
Java Management Extensions
Python
NoSQL
Ansible
Prometheus
Shell Script
Backup and Restore
Data Processing
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Cloud Platform System
Electrical and Computer Engineering
Apache Cassandra
System Availability
Delivery Pipeline
Grafana
Containerization
Kubernetes
Infrastructure Automation Frameworks
Information Technology
Cassandra
Puppet
Terraform
Programming Languages
Job description
As a Data Engineer at PDF, you will be part of a global team dedicated to leveraging innovative approaches and public cloud infrastructure to refine and enhance the design and architecture of our Big Data Analytics software. In this role, you will:
- Architect robust databases: Develop and maintain scalable NoSQL Apache Cassandra databases to support our advanced analytical solutions.
- Maintain scalable databases: Play a critical role in ensuring the high availability, performance, and scalability of our Cassandra clusters while supporting development teams and troubleshooting complex database issues.
- Contribute to scalable solutions: Design, implement, and maintain the scalable data processing components of PDF's Analytics software.
- Bring DevOps expertise: Automate infrastructure using modern tools like Terraform and Ansible, ensuring efficiency and reliability.
This position is ideal for engineers passionate about working with cutting-edge technology to drive impactful results.
Responsibilities
- Participate in the continuous efforts to improve the design and architecture of our application
- Install, configure, upgrade, and maintain Apache Cassandra clusters (on-premises and/or cloud-based)
- Monitor database health and performance using tools such as nodetool, JMX, Prometheus, Grafana, and Medusa
- Perform regular backup and restore operations using native and third-party tools
- Manage compaction strategies (Size-Tiered, Leveled, Unified) and optimize read/write paths
- Proactively identify performance bottlenecks and apply tuning strategies
- Handle schema changes, keyspace/table design, and data modeling best practices
- Troubleshoot and resolve database incidents and support high availability and disaster recovery strategies
- Automate routine operations using shell scripts, Python, or Ansible
- Collaborate with developers, SREs, and DevOps teams to support CI/CD integration and deployment pipelines
- Proactively ensure the highest levels of systems and infrastructure availability
- Work closely with the Application and Database teams to resolve issues and improve customer experience
- Build and sustain high-performance databases on both on-premises and cloud infrastructure
Requirements
- Proven experience as a Cassandra Administrator or in a similar database administration role
- 3+ years of hands-on experience administering Apache Cassandra in production environments
- Experience with tools like nodetool, Medusa, Reaper, and other Cassandra management utilities
- Hands-on experience with monitoring and observability tools such as Prometheus, Grafana, or similar solutions
- Experience implementing robust security practices, including user authentication, authorization, and encryption for data in transit and at rest
- Experience with automation and managing infrastructure as code: Terraform, Ansible, Chef, or Puppet
- Strong communication skills and the ability to work across engineering teams
- Strong troubleshooting skills
- Demonstrated ability to generate and maintain technical documentation
- Prior exposure to programming languages such as Python, Perl, or Java
- Solid knowledge of shell scripting and command-line management
- Familiarity with cloud platforms (AWS, Azure, Google Cloud Platform) and deploying Cassandra in cloud environments
- Strong understanding of Cassandra architecture (gossip, hinted handoff, replication, partitioning)
- Proficiency in scripting (e.g., Bash, Python, or Perl)
Big Plus Experiences:
- Experience with data processing tools
- Experience with Kubernetes and running Cassandra in containerized environments
- Apache Cassandra certification; Bachelor of Science or higher in Computer Science or Electrical and Computer Engineering preferred
About the company
PDF Solutions is redefining the way the semiconductor industry approaches data, analytics, and experience design. As part of our journey, we're building our next-gen platform, a modern, human-centered analytics platform. We believe design systems aren't just about consistency; they're about scalability, collaboration, and performance.