Data Scientist III, Cancer Genomics Research Laboratory (CGR)

Frederick National Laboratory
Rockville, United States of America
23 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
$ 163K

Job location

Rockville, United States of America

Tech stack

JavaScript
Amazon Web Services (AWS)
Data analysis
JIRA
Bash
Big Data
Catalyst
Cloud Computing
Cloud Storage
Databases
Data Security
Data Stores
Web Development
File Systems
R
Interoperability
Python
PostgreSQL
Microsoft SQL Server
MySQL
Oracle Applications
Simple Data Format
SQL Databases
TypeScript
Google Cloud Platform
Data Storage Technologies
GIT
Containerization
Data Management
Software Version Control
Data Generation

Job description

The Cancer Genomics Research Laboratory (CGR) investigates the contribution of germline and somatic genetic variation to cancer susceptibility and outcomes in support of the NCI's Division of Cancer Epidemiology and Genetics (DCEG), the world's most comprehensive cancer epidemiology research group. CGR is located at the NCI-Shady Grove campus in Rockville, MD and is operated by Leidos Biomedical Research, Inc. We care deeply about discovering the genetic and environmental determinants of cancer, and new approaches to cancer prevention, through our contributions to the molecular, genetic, and epidemiologic research of the 70+ investigators in DCEG. CGR staff form a large multidisciplinary team of laboratory staff, informaticists, data analysts, and project managers, working in concert with epidemiologists, biostatisticians, and research scientists within the DCEG intramural research program.

CGR focuses on generating high-quality data to support a wide range of sequencing and GWAS studies at the population level. These efforts produce large and complex bioinformatics and data analysis outputs, with large volumes of multi-year data that require careful curation and presentation to ensure long-term preservation and usability. The Data Scientist III will be responsible for the continuation, refinement, and creation of the data curation procedures and methods needed to properly preserve the high-quality data and analyses generated within CGR/DCEG and to bring data in line with the FAIR principles (Findable, Accessible, Interoperable, and Reusable). This will include the development and deployment of those procedures for new data generation, as well as the application of those procedures to key datasets.

The successful candidate must demonstrate the technical and personal skills to interact successfully with CGR/DCEG staff with varied skillsets in order to understand the background of legacy data and how it is used, and must have the scientific and technical know-how to access and evaluate all data. We are seeking an enthusiastic and driven professional with a passion for understanding state-of-the-art genomic technologies used in cancer research and applying that knowledge to high-quality data curation practices. The candidate is required to collaborate with others to build on existing principles and procedures, and establish new ones, that protect the data via long-term curation while facilitating its usefulness and accessibility for research by CGR and its collaborators. If you want to understand and preserve high-quality cancer susceptibility data while making it more FAIR, then come and help that data provide additional impact and drive further understanding of cancer genetics.

  • Collaborate with CGR project managers, bioinformaticians, laboratory staff, DCEG investigators, and data science experts to continue, refine, and develop data management procedures and controls for the management and curation of CGR-generated datasets as per the FAIR Principles.
  • Develop and refine best practices and lists of metadata needed for the archival and long-term curation of CGR/DCEG data.
  • Evaluate all datasets in CGR to effectively apply established curation procedures and allow for standardization, harmonization, and general usability of these data.
  • Evolve and continue development of a repository of CGR sample metadata to facilitate accessibility and reuse of data.
  • Collaborate with established data storage groups within NCI, FNLCR, and NIH to facilitate the archival and curation of datasets while ensuring proper storage and accessibility of the data is maintained.
  • Collaborate closely with DCEG PIs and CGR staff to facilitate and develop a culture in support of the FAIR principles to provide greater data accessibility and usability.
  • Coordinate as needed with staff at CGR and DCEG to support posting of publication-associated data to repositories in line with NIH Data Management and Sharing policies.

Requirements

To be considered for this position, you must minimally meet the knowledge, skills, and abilities listed below:

  • Possession of Bachelor's degree from an accredited college/university according to the Council for Higher Education Accreditation (CHEA) or four (4) years relevant experience in lieu of degree. Foreign degrees must be evaluated for U.S. equivalency.
  • In addition to the education requirement, a minimum of five (5) years of progressively responsible experience.
  • Team-oriented with excellent written and verbal communication skills, organizational skills, and strong attention to detail; ability to interface with many collaborators across multiple roles including project managers, PIs, and bioinformaticians.
  • Demonstrated ability to work with and understand genetic and epidemiological datasets and how they are used.
  • Proficiency in programming languages such as Python, R, Bash, and SQL.
  • Demonstrated experience with a variety of systems (e.g. HPC, GPU, Cloud), platforms, and environments (e.g. native, Conda, containerization) to help facilitate data assessment and understanding of user data access.
  • Demonstrated experience with version control and code management systems such as Git.
  • Familiarity with data repositories and/or database systems such as MySQL, PostgreSQL, MS SQL Server, Oracle.
  • Ability to learn new topics quickly and apply the concepts to drive progress.
  • Ability to independently organize meetings and track project progress over time.
  • Strong interpersonal skills for working in large teams with different backgrounds.
  • Strong written and presentation skills to summarize complex topics effectively and clearly.
  • Flexibility in understanding the needs of the projects and adapting accordingly.
  • Familiarity with FAIR principles and related best practices.
  • Ability to obtain and maintain a security clearance.

Preferred qualifications

Candidates with these desired skills will be given preferential consideration:

  • Master's degree or PhD.
  • Experience managing Omics datasets.
  • Experience managing and establishing FAIR-based data curation procedures.
  • Experience harmonizing metadata of large datasets from multiple sources.
  • Experience navigating file systems to find and identify data and file metrics.
  • Familiarity with web application development languages such as TypeScript, JavaScript, etc.
  • Familiarity with cloud computing environments such as AWS and GCP, and cloud storage services such as AWS S3 and GCP GCS.
  • Familiarity with genomic data commons, such as NCI GDC, AnVIL, NHLBI BioData Catalyst data platform.
  • Familiarity with Omics data analysis procedures to establish an understanding of how the data is used and what is needed to perform analyses.
  • Familiarity with project management tools for documentation and communication (Jira, Teams, Slack, etc.) to document status and activities.
  • Familiarity with laboratory management systems.
  • Familiarity with differing storage platforms and performance tiers.
  • Familiarity with common bioinformatics tools and workflows for processing GWAS and sequencing-based genomic data, and an understanding of the QC metrics, file formats, and best practices for data management and archiving.

Benefits & conditions

Pay and benefits are fundamental to any career decision. That's why we craft compensation packages that reflect the importance of the work we do for our customers. Employment benefits include competitive compensation, Health and Wellness programs, Income Protection, Paid Leave and Retirement.

113,500.00 - 162,533.00 USD

The posted pay range for this job is a general guideline and not a guarantee of compensation or salary. Additional factors considered in extending an offer include, but are not limited to, responsibilities of the job, education, experience, knowledge, skills, and abilities as well as internal equity, and alignment with market data.

The salary range posted is a full-time equivalent salary and will vary depending on scheduled hours for part-time positions.

About the company

The Frederick National Laboratory is operated by Leidos Biomedical Research, Inc. The lab addresses some of the most urgent and intractable problems in the biomedical sciences in cancer and AIDS, drug development and first-in-human clinical trials, applications of nanotechnology in medicine, and rapid response to emerging threats of infectious diseases. Accountability, Compassion, Collaboration, Dedication, Integrity, and Versatility: it's the FNL way.

A rewarding career with global impact

Whether you're an expert in your field or just starting out, we have a career opportunity for you. We're always looking for people to join us in fulfilling the mission of the Frederick National Laboratory: discovery, innovation, and success in the biomedical sciences. Our team of 2,400+ scientists, technicians, administrators, and support staff work at the forefront of basic, translational, and preclinical science, with a focus on cancer, AIDS, and other infectious diseases. We collaborate with colleagues across the National Cancer Institute, the National Institute of Allergy and Infectious Diseases, and others throughout the National Institutes of Health. We also engage with extramural investigators in academia, government, and industry.

Your path to joining our team begins with the desire to work for the only national laboratory dedicated to biomedical research. Our employees share a common desire to help make a difference in cancer research and public health concerns. As you search for a career that fits your education, skills, and abilities, explore the core values that guide us and emphasize work-life balance. Discover why joining the Frederick National Laboratory team could be the most important career step you take.
