Data Scientist

ICT MONDIAL INC
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Tech stack

HTML
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Data analysis
Cloud Computing
Cloudera Impala
Nvidia CUDA
Databases
Information Engineering
Data Systems
IBM DB2
Database Queries
DevOps
Markup Languages
Hadoop
Hive
Information Sciences
Python
LaTeX
PostgreSQL
Machine Learning
Mathematica
Microsoft SQL Server
MySQL
Natural Language Processing
NLTK
NumPy
Oracle Applications
TensorFlow
Azure
Search Technologies
SQLite
SQL Databases
Transaction Data
Data Processing
PyTorch
Large Language Models
Spark
Generative AI
Git
Pandas
Matplotlib
scikit-learn
Kubernetes
Information Technology
Machine Learning Operations
spaCy
Software Version Control
Data Pipelines

Job description

  • Apply hands-on experience in Python, NLP frameworks, SQL, Pandas, NLTK, and spaCy to solve real-world data challenges

  • Analyze trends and transactional data using strong SQL skills

  • Develop, test, and deploy new techniques for NLP understanding

  • Build scalable ML and Generative AI solutions, including Large Language Models (LLMs)

  • Train and optimize NLP/LLM models and build Python-based data pipelines

  • Build cloud-native solutions on AWS

  • Determine the nature of analytic problems, evaluate options, and recommend resolutions

  • Advise on methods and data needed to evaluate complex data problems

  • Collaborate with data collectors and analysts to close gaps on complex monitoring problems

  • Deliver accurate, timely, and sophisticated data analysis

Requirements

We are seeking a Senior Data Scientist with deep, hands-on expertise in Natural Language Processing (NLP) and Generative AI/LLMs to support a federal data science initiative. The ideal candidate is a true self-starter who can operate independently, translate complex analytic problems into automated data solutions, and communicate findings clearly to both technical teams and executive leadership.

  • Bachelor's degree in Statistics, Applied Mathematics, Computer Science, or Information Science, with industry experience in Python, NLP frameworks, SQL, Pandas, NLTK, spaCy, data science, and AI/ML/LLM engineering

  • 10+ years overall IT industry experience

  • Education/experience combinations accepted: Master's + 10 years; Bachelor's + 12 years; or 18 years in lieu of a degree

Required Skills

  • Solid experience with NLP, Python, NLP frameworks, SQL, Pandas, NLTK, and spaCy

  • Experience with Generative AI and LLMs

  • Demonstrated self-starter, able to operate independently

  • Fluency in Python, version control/Git, standard Python packages (Pandas, NumPy, Matplotlib), and ML frameworks

  • Knowledge of TensorFlow, PyTorch, Pandas, scikit-learn, NLTK, AWS EC2 (Azure ML a plus)

  • Experience with scalable data engineering frameworks (e.g., Apache Spark) and orchestration frameworks (e.g., Airflow), and/or semantic search

  • Expert-level data analysis and advanced statistical/ML methods to build, train, test, and evaluate supervised and unsupervised models

  • Experience with ML model deployment and operations (DevOps, MLOps, LLMOps)

  • Experience with NLP/Generative AI libraries (e.g., spaCy, LangChain), text annotation tools, and semantic frameworks

  • Ability to clean and process large volumes of real-world data

  • Experience retrieving/manipulating data from varied sources (DB2, Oracle, SQL Server, Hadoop, flat files)

  • Experience with SQL database management systems (PostgreSQL, MySQL, SQLite, etc.)

  • Excellent analytical and problem-solving skills; ability to identify risks and propose solutions

  • Excellent written and verbal communication skills across audiences, including executive leadership

Desired Skills

  • Prior experience on federal or state government IT projects

  • Industry experience strongly preferred

  • Experience with, or willingness to learn, the Hadoop ecosystem (Spark, Impala, Hive)

  • Experience in an analytical research environment

  • Experience in parallel/GPU processing (CUDA)

  • Experience with Mathematica

  • Experience with markup languages (LaTeX, HTML)

  • Experience with NLP for anomaly detection
