Application Developer

California Institute of Technology
Pasadena, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Remote
Pasadena, United States of America

Tech stack

JavaScript
Artificial Intelligence
Amazon Web Services (AWS)
Data analysis
Code Coverage
Databases
Continuous Integration
D3.js
Data Systems
Data Visualization
Relational Databases
Database Queries
Python
PostgreSQL
NumPy
OAuth
OpenID
Query Optimization
Scientific Computing
SciPy
Secure Coding
Web Application Security
Software Deployment
SQL Databases
Visual Analytics
Web Applications
WebSocket
Data Logging
Retrieval-Augmented Generation
Flask
Large Language Models
Prompt Engineering
Database Performance
Generative AI
Indexer
Backend
FastAPI
Pandas
Matplotlib
Containerization
Kubernetes
Infrastructure Automation Frameworks
Information Technology
Deployment Automation
Plotly
Celery
Front End Software Development
Virtual Agents
REST
Stream Processing
Docker

Job description

NExScI at Caltech/IPAC has an opening for an AI Application Developer to lead the development, production deployment, and scaling of TAPchat - an AI-powered conversational interface that helps astronomers query and explore data from NASA's astronomical archives. Beyond TAPchat, this role will identify and develop new applications for AI and large language models across IPAC's data archives and scientific workflows. Come be a part of the team that is helping astronomers and data scientists all over the world access and explore astronomy data!

TAPchat uses large language models to translate natural-language questions into database queries, data analysis, and visualizations. The application is a working prototype built with Python/FastAPI and JavaScript that supports multiple LLM providers (Anthropic Claude, OpenAI, Google Gemini, and local models via Ollama), authenticates users via OAuth2, and runs in Docker with PostgreSQL. You will take ownership of this codebase, harden it for production use, and scale it to serve a growing community of researchers.

This role will work within a vibrant team of scientists and developers at the NASA Exoplanet Science Institute (NExScI). As a part of IPAC, NExScI (nexsci.caltech.edu) provides archive services, community support, science operations, and analysis tools related to the discovery and characterization of planets beyond our solar system (exoplanets) using data from observatories in space and on the ground. IPAC also hosts several other NASA and Caltech data archives, and this position will have the opportunity to bring AI-driven tools and approaches to those projects as well.

Essential Job Duties

As the AI Application Developer, your primary focus will be leading TAPchat to production, while also exploring AI/LLM applications across IPAC. Key responsibilities:

  • Take ownership of an existing Python/FastAPI + JavaScript codebase and bring it from prototype to production-quality service.
  • Design and improve the AI agent loop - prompt engineering, tool design, retrieval-augmented generation (RAG) over archive documentation, multi-model support, and evaluation of LLM responses for scientific accuracy.
  • Design and implement secure code execution sandboxing to safely run user-initiated data analysis (e.g., subprocess isolation, containers, or task queues).
  • Improve test coverage, CI/CD pipelines, and deployment automation.
  • Optimize database performance (indexing, connection pooling, query optimization) and implement monitoring, logging, and alerting.
  • Scale the application to support a growing user base, including evaluating cloud deployment options and horizontal scaling strategies.
  • Implement rate limiting, security hardening, and operational tooling for a public-facing service.
  • Collaborate with scientists across IPAC to add new archive integrations, improve the AI agent's tool suite, and enhance data visualization capabilities (e.g., migrating from static plots to interactive visualizations with Bokeh or Plotly).
  • Write and maintain deployment documentation, runbooks, and architecture decision records.
  • Identify opportunities to apply AI and LLM technologies to other IPAC data archives and scientific workflows - prototyping new tools, evaluating feasibility, and championing adoption.
  • Stay current with the rapidly evolving AI/LLM landscape and advise the team on which new capabilities (e.g., multimodal models, retrieval-augmented generation, fine-tuning) are worth adopting.

Requirements

  • Bachelor's or equivalent degree in Computer Science, Data Science, Astronomy/Astrophysics, or related field.
  • A minimum of 3 years of relevant professional experience.
  • Proficiency in Python, including modern web frameworks (FastAPI, Flask, or similar).
  • Experience integrating with LLM APIs (Anthropic Claude, OpenAI, Google Gemini, or similar) and an understanding of prompt engineering, retrieval-augmented generation (RAG), and agentic AI patterns.
  • Experience developing and deploying web applications, with willingness to work across the stack (backend, database, frontend, infrastructure).
  • Strong communication and interpersonal skills.

Preferred Qualifications

These additional qualifications may give you a strong start, though we still encourage you to apply even if you don't have all of them:

  • Experience with relational databases (PostgreSQL or similar), including schema design and SQL.
  • Experience with Docker, containerized deployments, and CI/CD pipelines.
  • Familiarity with REST API design and real-time data streaming (SSE or WebSockets).
  • Comfort working with JavaScript for frontend development.
  • Master's or PhD in Computer Science, Data Science, Astronomy/Astrophysics, or related field.
  • Experience building agentic AI applications with tool use, structured outputs, and evaluation frameworks.
  • Familiarity with the Python scientific computing ecosystem (pandas, NumPy, SciPy, Astropy, Matplotlib).
  • Experience with process sandboxing, security isolation, or task queue systems (Celery, ARQ, Dramatiq).
  • Experience with cloud platforms (AWS, GCP) and infrastructure-as-code tools.
  • Knowledge of OAuth2/OIDC authentication flows and web application security.
  • Experience with Kubernetes or similar container orchestration at scale.
  • Background in astronomy, astrophysics, or scientific data systems.
  • Experience with interactive data visualization libraries (Bokeh, Plotly, D3.js).

About the company

Caltech is a world-renowned science and engineering institute that marshals some of the world's brightest minds and most innovative tools to address fundamental scientific questions. We thrive on finding and cultivating talented people who are passionate about what they do. Join us and be a part of the diverse Caltech community.

People choose to work at IPAC for many reasons, and the casual, employee-centric culture often leads to fulfilling, long-term careers and positive relationships. Caltech's benefits program offers a quality, competitive benefits package that is affordable for you and the Institute. We also offer a 403(b) defined contribution plan to eligible staff as well as a Voluntary Retirement Savings (TDA) Plan. IPAC staff have access to the Institute's facilities, including the athletic center, libraries, on-site daycare, professional development and enrichment classes, and Athenaeum club membership.
