Data Scientist / Engineer
Job description
Agile Defense is seeking a Data Scientist / Engineer to support the design, development, and operational deployment of scalable, AI-enabled data solutions within the Department of Defense's CDAO ADA IR program. This role is part of a multidisciplinary team integrating advanced analytics, machine learning, and engineering practices into mission-critical environments at CYBERCOM.
You will help shape and deploy data pipelines, pre-processing workflows, feature engineering strategies, and machine learning services within secure, containerized environments. The ideal candidate brings a hybrid of statistical modeling fluency and hands-on software engineering expertise. You will collaborate closely with product managers, full-stack developers, platform engineers, and mission stakeholders to transform raw data into meaningful insights and decision-support tools. This role requires strong technical communication skills, a collaborative mindset, and experience working in agile environments that value reproducibility, testing, and continuous delivery. Familiarity with cloud-based data platforms such as Databricks, Palantir, or AWS-native data services is highly preferred.
Key Objectives
Objective 1: Design and Maintain Scalable Data Science Services
· Plan, develop, and maintain reusable services for data ingestion, transformation, and feature engineering that support AI/ML workflows.
· Implement core data science capabilities, such as entity resolution, classification, clustering, or prediction, within containerized environments that adhere to CI/CD, version control, and testing standards.
· Collaborate with DevSecOps engineers to integrate services into secure production environments using tools like Databricks, Docker, and Terraform.
· Ensure services meet performance, reliability, and security requirements consistent with DoD enterprise and cloud-native architecture.
Objective 2: Build and Operationalize AI/ML Solutions
· Develop and deploy standalone or embedded ML models for tasks such as decision support, automation, anomaly detection, and pattern recognition.
· Select and implement appropriate modeling techniques using Python, Spark, or cloud-native ML frameworks (e.g., SageMaker, MLflow).
· Maintain reproducibility and interpretability of model outputs to meet mission transparency and audit requirements.
· Package model inference services with well-documented APIs for integration into end-user applications and operational dashboards.
Objective 3: Perform Exploratory Data Analysis and Communicate Insights
· Conduct exploratory data analysis (EDA) to identify trends, gaps, and opportunities within structured and unstructured datasets.
· Develop data visualizations and interpretive summaries that support stakeholder understanding and product team decision-making.
· Translate analytical findings into actionable recommendations using a mix of visual, narrative, and quantitative communication strategies.
· Contribute to the team's shared library of analysis templates, reusable queries, and analytic workflows to accelerate future delivery.
Objective 4: Collaborate Across Teams to Deliver Mission Impact
· Engage with product managers and mission users to define data and model requirements aligned with operational goals.
· Work closely with engineers to ensure data science components align with technical constraints and deployment patterns.
· Participate in agile sprint planning, retrospectives, and demos, sharing progress and adjusting priorities based on feedback.
· Maintain strong documentation practices that enable handoff, reproducibility, and technical accountability.
Requirements
· 4+ years of experience in applied data science, machine learning engineering, or data pipeline development.
· Proficient in Python, SQL, and distributed data frameworks (e.g., Spark, Databricks, PySpark).
· Experience developing ML models from training to deployment using industry-standard tools and libraries (e.g., scikit-learn, TensorFlow, XGBoost, MLflow).
· Familiarity with MLOps, API development, and secure cloud-based environments (e.g., AWS, Azure, Palantir Foundry).
· Strong understanding of data validation, model testing, and performance evaluation techniques.
· Experience with data visualization and storytelling using tools such as Tableau, Plotly, or Matplotlib.
· Excellent technical communication skills, with the ability to explain complex concepts to non-technical audiences.