Scientific Software Developer, Data Foundry
Job description
We are seeking Scientific Software Developers at multiple levels to build the data infrastructure, scientific tools, and lab automation integrations that power AI-native drug discovery. You will work directly with front-line discovery scientists and data scientists to translate their needs into fit-for-purpose prototypes, data pipelines, APIs, and workflow tools, then hand off mature solutions to Tech@Lilly for enterprise scaling and maintenance if and when needed.
This role is anchored in Architecture4Insight with close collaboration across Methods4Insight and Automation & Scale4Insight. You will build the scientific software that other teams, including the Frontier AI group's autonomous agents, consume. Some developers will specialize in lab automation software: building the code that interfaces with physical instruments, robotic platforms, and scheduling systems to enable Scale4Insight's closed-loop experimentation.
Responsibilities
Scientific Data Pipelines & APIs
- Design, build, and maintain data processing pipelines for complex scientific datasets (chemical, biological, high-throughput experiment, and automation-generated data), ensuring FAIR compliance and machine-actionability.
- Develop RESTful APIs and microservices providing unified programmatic access to LIMS, ELNs, instruments, data warehouses (Postgres, Redshift, Snowflake), and analytical databases.
- Support continuous improvement of LIMS and adjacent systems to meet evolving scientific workflows, security, and scalability standards.
Scientific Prototyping & Tech@Lilly Handoff
- Work directly with bench scientists to understand pain points and rapidly prototype custom applications, dashboards, and workflow tools.
- Validate prototypes through iterative scientist feedback, ensuring solutions are fit-for-purpose before transition.
- Partner with Tech@Lilly Product Engineering to hand off mature prototypes for enterprise scaling, defining transition criteria, documentation standards, and SLAs.
Automation Software & Lab Integration
- Build integrations connecting lab automation equipment, scheduling systems, and instrument data streams to Data Foundry's infrastructure with proper metadata and execution traceability.
- Develop software for robotic workflow control, instrument driver interfaces, and real-time data capture from automated platforms.
- Create modular, reusable automation workflow components that scientists can configure without writing code.
- Support Scale4Insight's Agentic Lab by building software that enables seamless interfacing between automation platforms and AI-driven experimental planning.
Cloud Infrastructure & DevSecOps
- Build and operate cloud-native components (AWS, Azure, or GCP) supporting containerized workflows (Kubernetes/Docker), infrastructure-as-code, CI/CD, and workflow orchestration (Prefect, Airflow, Nextflow).
- Apply DevSecOps standards including security scanning, code review, and automated testing.
- Participate in agile development with iterative improvement and cross-functional collaboration.
Requirements
- B.S. or M.S. in Computer Science, Bioinformatics, Cheminformatics, Computational Biology, Chemistry, Biology, Biomedical Engineering, or related STEM field.
- Bachelor's degree with 3+ years, or Master's degree with 1+ years, of scientific software development experience, with an understanding of experimental data types and scientific workflows.
- Proficiency in Python and at least one additional language (Java, C#, Go, or TypeScript); SQL skills appropriate to level.
- Qualified applicants must be authorized to work in the United States on a full-time basis. Lilly will not provide support for or sponsor work authorization or visas for this role, including but not limited to F-1 CPT, F-1 OPT, F-1 STEM OPT, J-1, H-1B, TN, O-1, E-3, H-1B1, or L-1.
Preferred Qualifications
- Experience (or demonstrated aptitude at junior levels) building RESTful APIs, data pipelines, and/or microservices for scientific or technical applications.
- Familiarity with cloud platforms (AWS, Azure, or GCP), containerization (Docker/Kubernetes), and Git.
- Strong communication skills and interest in collaborating with scientists and multi-functional teams.
- Pharmaceutical or biotech research industry experience, particularly in discovery workflows for biology, chemistry, or automation.
- LIMS/ELN experience (e.g., Benchling) and laboratory instrument integration.
- Experience integrating lab automation systems with digital platforms, including instrument control, robotic workflow orchestration, or scheduling systems (OPC-UA, serial/USB protocols, automation scheduling platforms).
- Data warehousing experience (Postgres, Redshift, BigQuery, Snowflake) and familiarity with scientific data standards/ontologies.
- Hands-on experience with cheminformatics tools (RDKit, Schrödinger, MOE) or bioinformatics platforms (Biopython, Bioconductor, sequence analysis pipelines).
- Experience with scientific computing libraries (SciPy, NumPy) for numerical methods, ODE solvers, optimization, or PK/PD modeling workflows.
- Experience with workflow orchestration (Prefect, Airflow, Nextflow, WDL) and CI/CD practices.
- Strong learning agility: willingness to step outside your comfort zone and adopt new technologies to get the job done.
- Experience with C, C++, or other compiled languages for porting performance-critical scientific workflows; ability to profile and identify computational bottlenecks.
Benefits & conditions
Actual compensation will depend on a candidate's education, experience, skills, and geographic location. The anticipated wage for this position is