Data Scientist
Job description
The Data Scientist delivers predictive insights and builds decision-support tools from operational and performance data, combining hands-on data science (statistical modeling and machine learning) with ownership of the industrial analytics stack. The Data Scientist serves as a PI System Administrator, Seeq Developer, and builder of AI-enabled tools that help engineers and operators act faster and with confidence.
The Data Scientist will work with industrial data systems, especially the PI System (OSIsoft PI / AVEVA PI), to ensure reliable data availability and governed context, and will develop Seeq analyses and Python-based pipelines for time-series modeling, anomaly detection, forecasting, and performance monitoring. With curiosity, rigor, and comfort with large time-series datasets, the Data Scientist takes solutions end to end, from PI tag and context configuration, to Seeq development, to Python/AI tool deployment, while collaborating with cross-functional technical teams.
Essential Functions
- Data Science, Modeling, and Insights
  - Build statistical and machine-learning models to detect anomalies, forecast performance, and identify optimization opportunities
  - Design and evaluate experiments and model validation approaches; translate results into clear recommendations for engineering and operations
  - Develop dashboards, reports, and model performance metrics to communicate insights and drive data-informed decisions
- PI System Administration & Time-Series Data Engineering
  - Administer and support the PI System (OSIsoft PI / AVEVA PI), including tag strategy, data quality monitoring, and user support
  - Build and maintain PI AF structure (assets, templates, attributes) and documentation to provide governed context for analytics and reporting
  - Support PI interfaces/data flows and collaborate with OT/IT and engineers to validate sensors/tags, troubleshoot gaps, and improve reliability and performance
  - Create curated datasets, features, and labels from PI data (with clear definitions and lineage) to support Seeq analyses and ML modeling
- Seeq Development & AI-Enabled Tools
  - Develop and maintain Seeq Workbooks/Analyses for performance monitoring, anomaly detection, and root-cause investigations
  - Create reusable Seeq templates, calculation standards, and best practices; enable users through documentation and training
  - Build AI-enabled tools (e.g., copilots, guided diagnostics, automated summaries) that leverage governed PI/Seeq context to accelerate engineering workflows
  - Evaluate, monitor, and improve AI tool quality (accuracy, drift, user feedback), and implement practical guardrails for safe, reliable use
- Python, Analytics Engineering & Deployment
  - Develop and maintain Python-based pipelines for data extraction, preprocessing, modeling, and automation
  - Prototype and productionize analytical applications that support performance monitoring, anomaly detection, and forecasting
  - Automate recurring model runs, evaluations, and reporting workflows with attention to reproducibility and reliability
  - Improve existing analytics codebases; contribute to model monitoring, documentation, and maintainable data science practices
- Project & Engineering Partnership
  - Collaborate with engineers and subject matter experts to frame operational problems into measurable data science objectives
  - Provide analytical support for initiatives including data validation, statistical analysis, modeling, and performance reporting
  - Help standardize modeling approaches, feature definitions, and evaluation metrics across projects
- Data Quality, Governance & Monitoring
  - Ensure accuracy and reliability of datasets used for analysis and modeling (validation checks, outlier handling, sensor sanity checks)
  - Perform data cleaning, validation, and documentation, including assumptions, feature definitions, and dataset lineage
  - Maintain organized analytical workflows and pipelines to support repeatable modeling and ongoing monitoring
Other Responsibilities
- Other duties and projects as assigned by management
Requirements
- Bachelor's degree in Data Science, Computer Science, Engineering, Statistics, Applied Math, or related field
- 2-5 years of experience in data science, applied analytics, or technical modeling roles
- Strong Python skills for data science (e.g., pandas, numpy, scikit-learn; visualization libraries)
- Strong skills in SQL and Excel for analysis, validation, and stakeholder-ready outputs
- Experience with data visualization and reporting tools (Power BI, Tableau, or similar)
- Strong statistical reasoning, analytical problem-solving skills, and attention to data quality
- Ability to communicate technical findings clearly to both technical and non-technical stakeholders
- Demonstrated experience using applied statistics, machine learning, and time-series modeling
- Ability to use Python for data science and AI tooling (data wrangling, modeling, visualization; building assistants/automation)
- Ability to understand and apply PI System administration fundamentals and data engineering for high-frequency time-series data (tags, quality checks, contextualization)
- Experience with Seeq development (shared analyses, calculations, templates) and stakeholder-ready data storytelling
- Ability to partner cross-functionally and drive tool and team enablement (requirements, training, documentation, adoption)
- Ability to lead and support deployment-minded practices (reproducibility, versioning, testing, monitoring) for analytics, models, and AI tools
- Must have strong verbal and written communication skills
- Must be able to read, write and speak English at a level which will permit the employee to accurately understand and communicate information to safely and efficiently perform the job duties
- Ability to prioritize and plan work activities so time is used efficiently and effectively
- Must demonstrate accuracy and thoroughness to ensure quality performance
- Ability to identify and resolve problems in a timely manner
- Experience working with OSIsoft PI / AVEVA PI System (PI Data Archive, PI AF) and industrial time-series data
- Experience developing in Seeq (Workbench/Organizer), including building shared analyses and calculations
- Experience with operational/engineering datasets (e.g., power generation, rotating equipment, process systems)
- Familiarity with time-series methods (e.g., resampling, lag features, seasonality, change-point detection)
- Experience developing reusable analytics packages, APIs, or scheduled jobs for model execution
- Knowledge of predictive modeling (forecasting, classification/regression), anomaly detection, and model evaluation
Physical Requirements
- The ability to work in an office environment, work at a computer and monitor, and perform repetitive motions for long periods of time
- The ability to periodically lift up to 15 lbs