RWD Data Analyst
Job description
Norstella's Real-World Data (RWD) team is seeking a mid-level Data Analyst with demonstrated expertise in real-world healthcare data. This role is designed for an analytically rigorous professional who applies critical thinking to each data engagement - evaluating source suitability, informing methodological decisions, and delivering outputs that reflect both technical precision and clinical reasoning.
The successful candidate will possess a thorough understanding of RWD source types and their respective limitations, and will be expected to make substantive, well-supported recommendations on data selection, inclusion and exclusion criteria, and analytical approach. This individual will serve as a trusted technical resource for internal teams and client-facing stakeholders, translating complex requirements into sound data strategies while proactively identifying risks and driving issues to resolution.
While the role requires strong technical proficiency, candidates are expected to bring a consultative orientation to their work - engaging with client needs, contributing to product decisions, and consistently seeking to improve outcomes rather than simply executing against defined specifications.
Key Responsibilities
RWD Expertise & Data Advisory
- Serve as a subject matter expert across real-world data source types, including claims, EHR/EMR, laboratory, and NLP-derived data, with a thorough understanding of the strengths, limitations, and appropriate applications of each
- Evaluate client and project requirements to formulate clear, well-reasoned recommendations regarding data source selection, variable construction, and inclusion and exclusion criteria
- Advise on vendor data structures and delivery formats, translating source-specific characteristics into actionable guidance for project and delivery teams
- Contribute domain expertise to internal product development discussions, identifying opportunities to enhance data offerings in response to evolving industry and client needs
Critical Problem Solving & Analytical Decision Making
- Apply structured analytical reasoning to each data engagement, critically evaluating requirements, challenging assumptions where appropriate, and recommending the most defensible analytical path forward
- Investigate complex and unstructured data sources - including raw EMR data - to assess fitness for purpose and identify viable analytical signal
- Anticipate downstream data quality and delivery risks, proactively surfacing issues and proposing solutions prior to escalation
- Leverage AI-powered tools (e.g., ChatGPT, Claude) to enhance development efficiency, support artifact generation, and address non-standard analytical challenges
Data Extract Development & Implementation
- Design and develop client-specific data extracts that reflect both the stated specification and a sound, defensible understanding of the underlying data
- Translate real-world data requirements into production-quality SQL logic, with attention to accuracy, reproducibility, and long-term maintainability
- Develop and maintain database queries and procedures used to extract and structure RWD across multiple data sources
Client & Stakeholder Engagement
- Partner with client-facing teams to develop a thorough understanding of client intent, engaging constructively when data limitations require an alternative approach
- Communicate data decisions, methodological trade-offs, and analytical assumptions clearly to both technical and non-technical audiences
- Manage multiple concurrent project workstreams, maintaining quality and responsiveness across competing priorities
Collaboration & Knowledge Sharing
- Collaborate with analytics, engineering, and delivery teams to ensure shared understanding of data logic, source assumptions, and extract methodology
- Contribute to shared documentation, standard operating procedures, and team knowledge resources
- Participate in team discussions related to data quality, process improvement, and product direction
- All other duties as assigned
Requirements
- Bachelor's degree in Data Science, Computer Science, Public Health, Biostatistics, Epidemiology, or a related field
- 2-5 years of hands-on experience with real-world healthcare data, with demonstrated ability to assess source suitability and formulate data-driven recommendations
- Comprehensive knowledge of RWD source types - including claims, EHR/EMR, laboratory, and unstructured data - and the ability to articulate appropriate use cases for each
- Prior experience investigating unstructured or semi-structured EMR data, including identification of analytically viable variables within complex source formats
- Strong proficiency in SQL, including complex multi-table query development; ability to produce and review production-quality extract logic
- Demonstrated capacity to formulate and defend analytical decisions - including data selection, exclusion criteria, and edge case handling - grounded in both technical and clinical reasoning
- Proficiency with AI-powered development tools (e.g., ChatGPT, Claude), with demonstrated ability to apply these tools to accelerate delivery and address complex analytical challenges
- Familiarity with the life sciences industry and the drug development lifecycle
- Experience with version control systems in a collaborative development environment
- Strong analytical and critical thinking skills, with a demonstrated track record of proactive issue identification and resolution
- Ability to manage multiple concurrent projects while sustaining high standards of quality, accuracy, and stakeholder responsiveness
Preferred Skills
- Experience working with vendor-delivered data products, including familiarity with delivery formats, data dictionaries, and source-specific quality considerations
- Exposure to RWD product development, including participation in decisions regarding data asset design or feature prioritization
- Experience translating clinical or scientific documentation into executable database logic
- Proficiency with data analysis and pipeline tools such as Python, R, Airflow, or equivalent
- Comfort operating within governed, production-oriented data environments
- Experience supporting client-facing or downstream data delivery workflows