Data Scientist

GroundWork Renewables
Albuquerque, United States of America
13 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$80K

Job location

Albuquerque, United States of America

Tech stack

API
Artificial Intelligence
Amazon Web Services (AWS)
Audit Trail
Azure
Information Systems
Databases
Continuous Integration
Data Auditing
Data Dictionary
Information Engineering
Data Governance
Data Infrastructure
Data Integrity
ETL
Data Security
Data Stores
Data Systems
Relational Databases
Cursor (AI-assisted code editor)
Software Debugging
Web Development
Python
Laboratory Information Management Systems
PostgreSQL
MySQL
NoSQL
Cloud Services
Service Development Studio
Software Engineering
SQL Databases
SQLAlchemy
Technical Data Management Systems
Web Applications
Scripting (Bash/Python/Go/Ruby)
GitHub Copilot
React
System Availability
Backend
GIT
FastAPI
Pandas
Build Management
Containerization
Information Technology
Data Lineage
Maintaining Code
Plotly
Data Management
Streamlit Framework
Software Version Control
Data Pipelines
Docker

Job description

GroundWork seeks a Data Scientist to design, build, and maintain the data infrastructure, access software, and web applications that make lab and field measurement data reliably available to our parent company and internal stakeholders. This role sits at the intersection of database engineering, full-stack software development, data quality assurance, and AI-assisted tooling, ensuring that high-integrity datasets are collected, managed, and surfaced through modern, scalable systems. The ideal candidate combines strong technical skills in database management and software development with familiarity with solar energy measurement, accredited laboratory environments, and regulatory data standards.

As a Data Scientist, you will work closely with GroundWork's engineering, laboratory, and operations teams to prioritize, design, and deliver robust data systems that meet the needs of both internal users and our parent company. You will leverage AI-assisted development tools to accelerate the delivery of web applications and data pipelines, while maintaining the rigor and traceability required in a laboratory and regulatory context. This role requires a technically versatile individual who can work across disciplines to drive data reliability, accessibility, and operational excellence.

Key Responsibilities

  • Technical Subject Matter Expertise: Design, implement, and maintain relational and time-series databases for lab instrument data, environmental measurements, and operational records. Develop and manage ETL/ELT pipelines to ingest, transform, and store data from IoT sensors, measurement hardware, and remote sensing platforms. Build and deploy data access APIs and web applications using modern tools (e.g., Streamlit, FastAPI, React, or similar frameworks) to enable parent company analysts and stakeholders to query, visualize, and export lab data. Apply AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude) to accelerate software delivery while maintaining code quality and auditability appropriate for a laboratory environment.

  • Data Quality Assurance & Control: Develop and enforce QA/QC protocols to validate incoming data from lab instruments and field sensors in accordance with applicable regulatory and accreditation standards (e.g., ISO 17025 or similar). Implement automated checks, flagging routines, statistical validation, and audit trails to detect anomalies, missing data, and calibration drift. Maintain defensible data records that satisfy chain-of-custody and traceability requirements. Ensure data integrity from acquisition through delivery to downstream consumers.

  • Database Architecture & Optimization: Architect and optimize database schemas for performance, scalability, and ease of access. Evaluate and recommend appropriate database technologies (SQL, NoSQL, time-series) based on data volume, query patterns, and reporting requirements of the lab and parent company.

  • Stakeholder Collaboration: Partner with lab scientists, operations, and parent company data teams to understand data access requirements and translate them into technical solutions. Serve as the primary point of contact for data availability and reporting needs.

  • Web Application Development: Design and build web-based data access tools, dashboards, and reporting interfaces using modern full-stack frameworks (e.g., React, FastAPI, Streamlit, Plotly Dash). Leverage AI-assisted development environments (e.g., GitHub Copilot, Cursor, Claude Code, or similar) to accelerate development cycles while ensuring maintainability, security, and compliance with lab data governance requirements. Enable non-technical users at the parent company to explore, filter, and export lab datasets through intuitive interfaces without requiring direct database access.

  • Data Governance & Documentation: Maintain comprehensive data dictionaries, schema documentation, and data lineage records consistent with laboratory quality management systems. Contribute to laboratory SOPs and data management plans. Stay current with emerging data engineering technologies, AI tooling, and laboratory informatics practices to continuously improve the lab's data infrastructure.

Requirements

  • Experience: Minimum of 2 years of experience in database engineering, data software development, or a related technical discipline, preferably in a laboratory, scientific, or renewable energy context. Experience in photovoltaic (PV) testing, solar energy measurement, or a physical laboratory environment is highly preferred.

  • Education: Bachelor's degree in computer science, software engineering, information systems, data science, or a related field; advanced degree or relevant certifications preferred.

  • Skills:

  • Proficiency in SQL and experience with relational databases (PostgreSQL, MySQL, or similar); familiarity with time-series or NoSQL databases a plus.

  • Proficiency in Python (pandas, SQLAlchemy, FastAPI, or similar) for data engineering, scripting, and backend service development.

  • Experience building web applications or data dashboards using tools such as Streamlit, Dash, FastAPI, React, or modern AI-assisted development environments (e.g., GitHub Copilot, Cursor, Claude Code); ability to deliver functional, user-facing tools rapidly using AI pair-programming workflows.

  • Experience implementing QA/QC workflows for scientific or sensor data, including anomaly detection, validation rules, statistical flagging, and audit logging; familiarity with laboratory quality management standards (e.g., ISO 17025, GLP, or similar regulatory frameworks) is a strong plus.

  • Excellent communication skills; ability to translate complex technical data concepts for non-technical stakeholders including lab scientists and business analysts.

  • Familiarity with version control (Git), CI/CD practices, and cloud data platforms (AWS, Azure, or GCP); experience with containerization (Docker) is a plus.

  • Demonstrated experience using AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude Code, or similar) to write, debug, and refactor code; comfort evaluating AI-generated outputs for correctness, security, and suitability in a regulated laboratory data environment.

  • Understanding of laboratory informatics concepts and data management in accredited or regulated settings; experience with LIMS (Laboratory Information Management Systems) or similar platforms is a plus.

Approach to Work

  • Aligns with our values: Trustworthy, Caring, Knowledgeable, Trailblazing, Nimble and Meticulous.

  • Works collaboratively and directly with remote multi-functional teams and clients.

  • Presents a positive, 'can-do' attitude while working in a multi-project work environment.

  • Self-motivated, punctual, organized, and able to perform work with limited supervision.

  • Able to solve practical problems and deal with a variety of concrete variables in situations where only limited standardization exists.

  • Able to communicate verbally and in writing in a clear, concise, and professional manner.

Benefits & conditions

  • Professional development assistance
  • Parental leave
  • 401(k)
  • Health insurance
  • Paid time off
  • Vision insurance
  • Health savings account
  • Dental insurance
  • Flexible spending account
  • Life insurance

About the company

GroundWork Renewables is the solar industry's trusted full-stack performance partner. A Certified B Corporation and ISO-accredited testing provider, we deliver precise MET data and PV module insights, helping developers, EPCs, and asset owners reduce risk, improve forecasting, and maximize value throughout the project lifecycle. Our services have enabled 1,000+ solar measurement campaigns, helping project developers secure billions in financing by reducing uncertainty with trusted resource data.
