Senior Software Engineer, Data Platform

PROFLUENT
Emeryville, United States of America
21 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: $220K

Job location

Emeryville, United States of America

Tech stack

Testing (Software)
Audit Trail
Google BigQuery
Bioinformatics
Code Review
Computational Biology
Continuous Integration
Data Validation
Information Engineering
Data Governance
Data Infrastructure
Data Security
Data Warehousing
Experimental Data
Python
Laboratory Information Management Systems
PostgreSQL
Machine Learning
Meta-Data Management
Metadata Repositories
Operational Databases
Software Deployment
Software Engineering
Management of Software Versions
Workflow Management Systems
Software Organization
Google Cloud Platform
Git
Containerization
Information Technology
Build Tools

Job description

  • Design, build, and maintain scalable data infrastructure for protein engineering campaigns, including ingestion, transformation, validation, storage, and retrieval of large scientific datasets
  • Develop secure data pipelines for internal and partner-generated data, with strong attention to access control, data siloing, provenance, auditability, and compliance with data use restrictions
  • Own core components of Profluent's data warehouse and data platform, using Python, GCP, PostgreSQL, BigQuery, and related cloud-native technologies
  • Build systems that transform raw experimental, computational, and partner data into structured, reliable, analysis-ready and model-ready datasets
  • Establish best practices for data modeling, metadata management, data quality checks, schema evolution, versioning, and documentation
  • Collaborate with ML engineers, computational biologists, data scientists, and program stakeholders to understand data requirements and translate them into scalable technical systems
  • Improve engineering quality through thoughtful system design, code review, testing, CI/CD, observability, and maintainable development workflows
  • Contribute to architectural decisions for how Profluent stores, secures, organizes, and uses data across programs and partnerships

Requirements

  • 5+ years of software engineering, data engineering, or data platform experience
  • Strong proficiency in Python and modern software development practices, including git, testing, code review, CI/CD, and production deployment
  • Experience designing and operating production data pipelines, data warehouses, and data models at scale
  • Hands-on experience with cloud platforms, preferably GCP, and technologies such as BigQuery, PostgreSQL, object storage, workflow orchestration, and containerized services
  • Strong understanding of data security, access control, data partitioning or siloing, audit logging, and managing sensitive or restricted datasets
  • Experience working with complex, heterogeneous datasets and building systems that make them reliable, discoverable, and usable
  • Ability to work independently, make sound technical decisions, and drive projects from ambiguous requirements to production systems
  • BS, MS, or PhD in Computer Science, Engineering, Data Science, Bioinformatics, or a related technical field, or equivalent practical experience

Preferred qualifications (not required)

  • Experience with scientific, biological, clinical, genomic, laboratory, or high-throughput experimental data
  • Experience managing external partner, customer, or restricted-access datasets
  • Familiarity with data governance, lineage, metadata systems, schema registries, or data catalogs
  • Experience with research data systems, LIMS, ELNs, Benchling, or adjacent scientific platforms
  • Background working with ML, data science, computational biology, or cross-disciplinary technical teams
  • Interest in learning biology, gene editing, protein design, or machine learning concepts

Applicants must have ongoing work authorization in the United States that does not require employer sponsorship. Sponsorship will not be provided now or at any time in the future for this position.

Benefits & conditions

  • Health insurance
  • 401(k) matching
  • Paid time off
  • Vision insurance
  • Dental insurance
  • Competitive compensation package with equity participation
  • 401(k) with a strong employer match
  • Comprehensive benefits including health/dental/vision insurance
  • Generous PTO policy and commitment to work-life balance
  • Professional development opportunities in a cutting-edge field at the intersection of AI and biology

About the company

Profluent is an AI-first protein design company. Founded in 2022, we develop deep generative models to design and validate novel, functional proteins to revolutionize biomedicine. Based in Emeryville, CA, we are backed by leading investors including Altimeter Capital, Bezos Expeditions, Spark Capital, Insight Partners, Air Street Capital, AIX Ventures, and Convergent Ventures, and have raised over $150M to date.
