Data Scientist

Anonymous Employer
McLean, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

McLean, United States of America

Tech stack

Amazon Web Services (AWS)
Apache HTTP Server
Bash
Big Data
Cloud Database
Computer Programming
Information Engineering
ETL
Data Systems
Data Visualization
Relational Databases
Database Queries
Distributed Data Store
Elasticsearch
Python
PostgreSQL
Linux System Administration
Machine Learning
MySQL
Natural Language Processing
Neo4j
SQL Databases
Tableau
Technical Data Management Systems
Jupyter Notebook
Data Processing
Data Ingestion
Spark
Indexer
GIT
Amazon Web Services (AWS)
Containerization
Kubernetes
Information Technology
Apache Nifi
Machine Learning Operations
Kibana
Software Version Control
Data Pipelines
Docker

Job description

Design, build, and maintain scalable data pipelines and infrastructure supporting analytics, reporting, and machine learning use cases Develop and optimize ETL and ELT workflows for structured and unstructured data sources Build and maintain data integration layers across cloud, web, and on-premise systems Ensure data quality, consistency, security, and reliability across data pipelines and storage systems Develop data processing solutions using Python, SQL, and Bash scripting in Linux environments Construct and optimize complex queries across multiple data sources (e.g., PostgreSQL, MySQL, Neo4j, RDS) Develop and manage ingestion pipelines using tools such as Apache NiFi Process and transform large-scale datasets from diverse structured and unstructured sources Develop reusable, tested, and reproducible data workflows and Python-based modules Use Elasticsearch and Kibana for search, indexing, and data visualization use cases Document technical solutions, data pipelines, and methodologies for both technical and non-technical stakeholders Communicate findings through written reports, dashboards, and oral briefings to stakeholders Collaborate across multiple teams to support data-driven decision-making and analytics initiatives Support knowledge sharing by explaining complex data concepts to junior team members

Requirements

The ideal candidate has strong programming skills, experience working across distributed data systems, and the ability to translate complex technical data problems into reusable, production-ready solutions for stakeholders., Strong experience in data engineering and data pipeline development, including ETL/ELT design and implementation Proficiency in Python programming for data processing and automation Strong experience with SQL and relational database systems Experience working in Linux environments with advanced Bash scripting Experience building and managing data pipelines using Apache NiFi or similar tools Experience processing both structured and unstructured data sources Experience working with Elasticsearch and Kibana Experience using Git-based version control systems Experience using Jupyter Notebooks for analysis and prototyping Experience delivering technical results through documentation and stakeholder briefings Strong communication skills and experience working with multiple stakeholders Experience creating reusable, tested, and maintainable data solutions Academic or professional background in math, statistics, physics, computer science, data science, or related fields, Experience with cloud platforms such as AWS and cloud-based data architectures Experience with big data processing frameworks such as Apache Spark or Trino Experience applying machine learning algorithms and NLP techniques Experience with containerization technologies such as Docker or Kubernetes Experience with data visualization tools such as Tableau, Kibana, or Apache Superset Experience working with or designing machine learning workflows and models Experience creating training materials or technical curriculum in data or scientific domains Familiarity with data science MLOps or production ML workflows

Benefits & conditions

What we offer: Flexible time off Full medical coverage 401(k) with company match Referral bonuses Performance bonuses Life insurance and disability coverage Tuition and training reimbursement

Apply for this position