Senior Data Scientist II
Job description
- Work on new product development.
- Propose and build data-driven solutions for high-value customer problems by discovering, extracting, and modeling knowledge from large-scale natural language datasets.
- Prototype new ideas and collaborate with other data scientists as well as product designers, data engineers, front-end developers, and a team of expert legal data annotators.
- Evaluate and help maintain our data assets and training/evaluation data sets.
- Develop and implement NLP-based information extraction solutions.
- Propose and identify trade-offs of various algorithmic solutions.
- Interface with other technical personnel or team members to finalize requirements.
- Work closely with other development team members to understand moderately complex product requirements and translate them into software designs.
- Successfully implement development processes, coding best practices, and code reviews for production environments.
- Perform other duties as needed.
Requirements
- Master's degree (or foreign equivalent) in Data Science, Data Analytics, Statistics, Computer Science, or a related field required.
- 4 years of experience in the job offered or related occupations required.
- Also required: 2 years of experience in the following:
  - Working directly with large language models and transformer-based architectures, including BERT, RoBERTa, and T5, to develop an in-depth understanding of the existing AI project architecture and to enhance product features by tuning the language models as needed.
  - Applying LLMs, including ChatGPT, GPT 3.5, Claude, and Mistral, to understand the nuances of the large language models on the market and to build or enhance AI product features using these LLMs on a day-to-day basis, as they are widely used in developing our AI product.
  - Working with big data technologies and tools, including Hadoop, Spark, or AWS, to handle large volumes of data and perform data science activities such as data analysis, feature extraction, model development, and reporting.
  - Working with machine learning algorithms, including deep learning, gradient boosting, and random forests, to develop, fine-tune, and deploy regression or classification machine learning models using these algorithms.
  - Using advanced programming skills in Python, R, or other relevant languages for data analysis to perform day-to-day activities such as exploratory data analysis, model development, testing, and deployment.
- Employee reports to LexisNexis USA office in Norwalk, CT, but may telecommute from any location within the U.S.
- Experience can be concurrent.
SALARY RANGE FOR REQ# R114245