Data Scientist (GenAI, LLM & Machine Learning)

Lorven Technologies Inc

Raleigh, United States of America

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Raleigh, United States of America

Tech stack

Artificial Intelligence

Amazon Web Services (AWS)

Automated Storage and Retrieval Systems

Azure

Cloud Computing

Relational Databases

Distributed Systems

Elasticsearch

Python

PostgreSQL

Machine Learning

MongoDB

Natural Language Processing

Named Entity Recognition

NoSQL

Performance Tuning

TensorFlow

Search Technologies

Systems Integration

Google Cloud Platform

PyTorch

Large Language Models

Prompt Engineering

Spark

Deep Learning

Model Validation

Generative AI

Keras

Containerization

Kubernetes

Information Technology

Hugging Face

Cosmos DB

Machine Learning Operations

API Design

spaCy

Document Classification

Natural Language Understanding

Data Pipelines

Serverless Computing

Docker

Job description

Design, develop, and deploy enterprise-grade AI/ML and Generative AI solutions leveraging Large Language Models (LLMs), NLP techniques, and advanced machine learning methodologies.
Build and optimize Retrieval-Augmented Generation (RAG) pipelines, prompt engineering frameworks, vector embedding solutions, and knowledge retrieval systems.
Develop AI applications tailored for legal document intelligence, document processing, search, summarization, and classification use cases.
Design and implement data pipelines for ingestion, preprocessing, annotation, enrichment, and management of structured and unstructured datasets.
Collaborate closely with legal domain experts, business stakeholders, and engineering teams to understand requirements and translate them into scalable AI solutions.
Conduct model experimentation, benchmarking, evaluation, and performance optimization to improve accuracy, reliability, and business outcomes.
Develop and maintain machine learning models using PyTorch, TensorFlow, Keras, Hugging Face Transformers, and other modern AI frameworks.
Implement NLP solutions involving entity extraction, semantic search, document classification, embeddings, and language understanding tasks.
Build and optimize integrations with vector databases, search platforms, relational databases, and cloud-native services.
Work with AWS, Azure, or Google Cloud Platform services to deploy, monitor, and scale AI/ML workloads in production environments.

Requirements

Bachelor's or Master's degree in Computer Science, Data Science, Artificial Intelligence, Machine Learning, Statistics, or a related field.
12-14+ years of overall experience, including 6-8+ years in Data Science, Machine Learning, and AI solution development.
6+ years of hands-on experience designing, developing, and deploying machine learning models and advanced analytics solutions in enterprise environments.
Strong experience with Large Language Models (LLMs), Generative AI, Prompt Engineering, Retrieval-Augmented Generation (RAG), and model evaluation frameworks.
Advanced proficiency in Python with experience developing scalable AI/ML applications and data processing pipelines.
Hands-on experience with deep learning frameworks including PyTorch, TensorFlow, Keras, and Hugging Face Transformers.
Strong expertise in Natural Language Processing (NLP) techniques and tools such as spaCy, BERT, Word2Vec, Transformers, Flair, and text classification models.
Experience building and maintaining training, validation, benchmarking, and evaluation datasets for AI/ML initiatives.
Knowledge of vector databases and search technologies including ChromaDB, Elasticsearch, OpenSearch, or similar platforms.
Experience working with relational and NoSQL databases such as PostgreSQL, MongoDB, Cosmos DB, or equivalent.
Experience with cloud platforms including AWS, Azure, or Google Cloud Platform for model development, deployment, and scaling.
Understanding of data modeling principles, embeddings, clustering, dimensionality reduction, sequence classification, and predictive analytics.
Exposure to distributed computing technologies such as Spark, Ray, or Scala is highly preferred.
Experience with API development, containerization (Docker/Kubernetes), and MLOps/AIOps practices is highly preferred.
Strong analytical, problem-solving, communication, and stakeholder collaboration skills.
