Head of Data Science / AI
Job description
We automatically ingest incoming documents, emails, and attachments. We are a fast-paced, rapidly growing startup.
We are hiring a Head of Data Science/AI to drive applied machine learning and generative AI initiatives across our platform while leading and growing our Data Science team. This is a high-impact role for someone who can think strategically, manage and mentor a team, and still contribute hands-on: building and evaluating production-grade models and partnering closely with Engineering and Product to deliver measurable outcomes.
You will focus on Document AI, including unitization, classification, and extraction, and you will own our model monitoring and evaluation frameworks, ensuring we deliver world-class accuracy, scalability, and reliability in production.
What you will be doing:
- Lead applied research and experimentation on document understanding models, unitization, classification, information extraction, and hybrid (LLM + traditional ML) architectures.
- Collaborate with the India-based Data Science team on model design, evaluation, and deployment strategies.
- Partner with Product, Customer Success, and Implementation teams in the U.S. to translate business needs into data-driven solutions and measurable KPIs.
- Develop and improve model monitoring and evaluation pipelines for accuracy, drift detection, and cost-performance tradeoffs.
- Explore and benchmark commercial and open-source models (OpenAI, Anthropic Claude, Mistral, Hugging Face, etc.) for Document AI use cases.
- Design data sampling and feedback strategies (e.g., golden datasets, active learning, fine-tuning datasets) to continuously improve model performance.
- Develop frameworks for scalable experimentation, A/B testing, and prompt optimization.

- Be the first Data Science/AI leadership hire in the U.S., shaping the foundation for future growth.
- Work on cutting-edge Generative AI and Document AI problems at scale.
- Collaborate directly with leadership across continents.
- Contribute to a product that is used daily by enterprises across the U.S.
- High ownership, autonomy, and opportunity to make measurable impact.
Requirements
- 5+ years of experience in applied Data Science / ML, including at least 2 years in NLP or Document AI.
- Strong command of Python, PyTorch / TensorFlow, Hugging Face, OpenAI / Anthropic APIs, and vector databases (Qdrant, pgvector, Pinecone).
- Deep experience in document classification, entity extraction, embeddings, and text similarity.
- Strong experience in LLM prompt engineering, fine-tuning, and evaluation frameworks.
- Experience with MLOps / model evaluation pipelines (MLflow, Weights & Biases, LangFuse, or equivalent).
- Strong grasp of SQL and data modeling (PostgreSQL, RDS, etc.).
- Prior work on multimodal (image and text) document pipelines or OCR-based data extraction.
- Experience with active learning, RLHF, or auto-validation frameworks.
- Familiarity with AWS (EKS, SageMaker, Bedrock) and containerized model deployments.
- Previous experience working in a remote-first or cross-border team environment.
- Background in regulated industries.