Senior Applied Data Scientist - Scalable ML Systems
Job description
As a Senior Applied Data Scientist, you'll operate at the intersection of data engineering, data science, and machine learning. You'll design and implement large-scale data architectures, develop robust data pipelines, and build high-quality ML models that integrate simulation and measurement data from diverse domains.
Your work will directly influence Keysight's advanced R&D initiatives, from algorithm development to AI-assisted engineering tools.
- Partner with internal engineering and data teams to identify key data sources, define feature requirements, and align data standards across organizations.
- Design, implement, and maintain data lakes, databases, and ETL/ELT pipelines (Snowflake, Databricks, SQL, Python).
- Integrate, clean, and align simulation, measurement, and operational data for scalable AI/ML model development.
- Conduct exploratory data analysis, dimensionality reduction (e.g., PCA), clustering, and regression to extract insights.
- Develop and validate ML models using tree-based methods (XGBoost, LightGBM, Random Forests) and Bayesian Optimization for tuning.
- Apply signal processing and data augmentation techniques to improve data quality and coverage.
- Document data lineage, feature definitions, and modeling rationale for reproducibility and transparency.
- Communicate insights and recommendations to stakeholders, influencing data-driven decisions across R&D and product teams.
Requirements
- Master's or PhD in Data Science, Computer Science, Electrical Engineering, Statistics, or a related field.
- 5+ years' experience as a Data Scientist / Applied Data Scientist, ideally in engineering or simulation-driven environments.
- Proven ability to build and maintain scalable data infrastructures (data lakes, schemas, pipelines).
- Strong programming skills in Python (pandas, numpy, scikit-learn), SQL, and optionally C++.
- Proficiency with Snowflake, Databricks, or similar big-data environments.
- Hands-on expertise in tree-based ML techniques and statistical modeling.
- Familiarity with Bayesian Optimization and feature engineering for time-series or signal data.
- Ability to move fluidly between data exploration, engineering, and modeling tasks.
- Experience in data architecture design, schema governance, or cross-team data standards.
- Familiarity with Keysight simulation or measurement tools (e.g., ADS, RFPro, EMPro, Signal Studio, RaySim).
- Knowledge of MLOps principles for productionizing models and maintaining pipelines.
- Experience with metadata management and feature store design.
- Prior exposure to environments combining simulation and real-world measurement data.