Senior Data Scientist
Job description
Our dedicated Data Science team is at the forefront of revolutionizing pharma intelligence and how patients gain access to life-saving therapies. Armed with cutting-edge technology and a passion for innovation, we leverage the vast landscape of data to extract actionable insights that drive informed decision-making.
Our unique collaborative approach fosters a dynamic synergy between data science and product development. Our deep expertise in machine learning, artificial intelligence, large language models, and generative AI, combined with our domain knowledge, enables us to deliver comprehensive, production-grade AI solutions that empower our clients to stay ahead in a rapidly evolving industry.
In this role as a Senior Data Scientist, you will:
- Design and deploy production-ready AI systems that leverage LLMs and advanced ML techniques to solve complex business problems across pharma intelligence
- Build and maintain multi-agent systems and agentic orchestration workflows using frameworks like LangChain, LangGraph, or AutoGen to execute autonomous tasks
- Develop and optimize Retrieval-Augmented Generation (RAG) pipelines, ensuring high-fidelity context retrieval and vector database management
- Implement and extend MCP (Model Context Protocol) servers to allow LLMs to interact safely and efficiently with local and remote data sources
- Architect robust, scalable APIs and microservices to serve AI features to end-users with low latency (FastAPI or similar)
- Collaborate with product partners and other scientists to identify new opportunities to apply AI / ML to our content and products
- Conduct research and identify AI / ML algorithms and methods to solve specific business problems, and deliver these algorithms as microservices in collaboration with content and product engineering teams
- Implement rigorous testing and evaluation frameworks for LLM outputs to ensure prompt stability, prevent regressions, and manage hallucination risks
- Contribute towards the common data science platform
- Stay up to date with advances in the field and deliver periodic presentations to internal teams on these developments
- All other duties, as assigned
Requirements
- 5+ years of experience developing AI / ML applications and data-driven solutions, preferably in regulated industries (pharma, legal, financial services, or energy)
- Graduate degree in Computer Science, Engineering, Statistics or a related quantitative discipline, or equivalent work experience
- Substantial depth and breadth in NLP, Deep Learning, Generative AI, LLMs, and other state-of-the-art AI / ML techniques
- Deep experience with LLM orchestration frameworks such as LangChain, LlamaIndex, or similar libraries
- Expert-level knowledge of LLM APIs (OpenAI, Anthropic Claude) and open-source models (Llama, Mistral)
- Deep understanding of CS fundamentals, computational complexity and algorithm design
- Experience with building large-scale distributed systems in an agile environment and the ability to build quick prototypes
- Excellent knowledge of Python and core data science and AI libraries including Pandas, NumPy, PyTorch, and similar
- Experience building or utilizing Model Context Protocol (MCP) servers to bridge models with data tools
- Strong background in scalable backend environments (Docker, Kubernetes, AWS/GCP)
- Experience moving AI from prototype to production-grade services with monitoring, logging, and rate-limiting
- Ability to independently conduct research and develop appropriate algorithmic solutions to complex business problems
- Experience mentoring junior team members
- Excellent problem-solving and communication skills
- Knowledge of the healthcare / pharma domain and experience with applying AI to healthcare data
- Experience with AWS, especially ECS, Bedrock, API Gateway, SageMaker, serverless compute and storage such as S3 and Snowflake
- Proficiency with vector databases such as Pinecone, Qdrant, or similar for high-performance retrieval
- Experience with RAG patterns, prompt engineering, model fine-tuning, and knowledge graphs
- Experience with unstructured document processing (legal document analysis, contract management, data retrieval)
- Experience with Big Data tools like Apache Spark, Hadoop, or Databricks
Benefits & conditions
- Medical and Prescription Drug Benefits
- Health Savings Accounts (HSA) or Flexible Spending Accounts (FSA)
- Dental & Vision Benefits
- Basic Life and AD&D Benefits
- 401k Retirement Plan with Company Match
- Company Paid Short & Long-Term Disability
- Paid Parental Leave
- Open Vacation Policy & Company Holidays
Please Note - All candidates must be authorized to work in the United States. We do not provide visa sponsorship or transfers. We are not currently accepting candidates who are on an OPT visa.
The expected base salary for this position ranges from $170,000 to $180,000. It is not typical for offers to be made at or near the top of the range. Salary offers are based on a wide range of factors including relevant skills, training, experience, education, and, where applicable, licensure or certifications obtained. Market and organizational factors are also considered. In addition to base salary and a competitive benefits package, successful candidates are eligible to receive a discretionary bonus.