Data Scientist II

Scribd Inc.
Dallas, United States of America
6 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$123K

Job location

Remote
Dallas, United States of America

Tech stack

Artificial Intelligence
Artificial Neural Networks
Distributed Computing Environment
Information Retrieval
Interaction Design
Python
Machine Learning
Language Modeling
Natural Language Processing
Named Entity Recognition
NumPy
SQL Databases
PyTorch
Large Language Models
Spark
Deep Learning
Generative AI
PySpark
Scikit Learn
Information Technology
Machine Learning Operations
Databricks

Job description

We believe the best work happens when individual flexibility is balanced with meaningful community connection. Scribd Flex empowers employees to choose the workstyle and location that support their best performance, while committing to intentional in-person moments that strengthen collaboration and culture. Occasional in-person attendance is required for all Scribd, Inc. employees, regardless of location.

So what are we looking for in new team members? At Scribd, Inc., we hire for "GRIT." Traditionally defined as the intersection of passion and perseverance toward long-term goals, GRIT reflects the mindset we expect from every employee. For us, it also serves as a practical framework for how we work: setting and achieving Goals, delivering Results within your role, contributing Innovative ideas and solutions, and strengthening the broader Team through collaboration and attitude.

This posting reflects an approved, open position within the organization.

About the team

The Applied Research team is a group of data scientists and content specialists who are experts in leveraging machine learning, natural language processing and generative AI models to develop solutions which deliver value to our users and business.

We act as a key driver for innovation, whether it's in product surface experimentation, metadata generation or model development. Along with Product and Engineering partners, we design solutions and collaborate in cross-functional squads to maximize business impact.

Our areas of impact include content enrichment, representation learning, recommendations, search, translation and many others, applied to diverse media across text, image, and audio. We operate at a scale of hundreds of millions of documents, millions of users and billions of user interactions.

  • Focus on a variety of content classification use cases, leveraging everything from traditional NLP to sophisticated LLMs and generative models
  • Investigate methods of solving our most challenging problems at Scribd, at scale
  • Collaborate with other Data Scientists, Machine Learning Engineers and ML Data Engineers on cross-functional projects
  • Leverage any algorithm at your disposal: from classical Scikit-learn and NumPy models to custom Neural Networks in PyTorch to third party LLM APIs
  • Process massive amounts of data with Python, SQL and Spark
  • Align with stakeholders on the approaches and results of projects through written and verbal communication, and write detailed, accurate and concise project documentation

Requirements

We are seeking a Data Scientist II with experience developing and deploying machine learning models. You will help design and implement high-impact AI and ML systems. We work in cross-functional teams, collaborating with Machine Learning Engineers, Data Engineers and Product. We are seeking a curious and collaborative individual with an eye for simplicity, end-to-end visibility and impact, who is excited about building models on massive amounts of data, using language models and deploying models.

  • 3+ years of post-qualification experience developing machine learning models, working with systems at scale and deploying to production environments.
  • Proficiency in Python.
  • Hands-on experience building ML pipelines and working with distributed data processing frameworks like Apache Spark, Databricks, or similar.
  • Intermediate level in at least three of these fields: classification algorithms, natural language processing, search, information retrieval, named entity recognition, deep learning, generative models.
  • Intermediate level or greater experience with SQL or PySpark.
  • Bachelor's or Master's degree in a relevant quantitative discipline, including but not limited to Statistics, Computer Science, Data Science, Artificial Intelligence or another field with a strong quantitative focus.

Benefits & conditions

Paid parental leave, Parental leave, Health insurance, Paid time off, Vision insurance, Dental insurance, Disability insurance

At Scribd, your base pay is one part of your total compensation package and is determined within a range. Our pay ranges are based on the local cost of labor benchmarks for each specific role, level, and geographic location. San Francisco is our highest geographic market in the United States. In the state of California, the reasonably expected salary range is between $118,000 [minimum salary in our lowest geographic market within California] and $184,000 [maximum salary in our highest geographic market within California].

In the United States, outside of California, the reasonably expected salary range is between $97,000 [minimum salary in our lowest US geographic market outside of California] and $175,000 [maximum salary in our highest US geographic market outside of California].

In Canada, the reasonably expected salary range is between $123,000 CAD [minimum salary in our lowest geographic market] and $164,000 CAD [maximum salary in our highest geographic market].

We carefully consider a wide range of factors when determining compensation, including but not limited to experience; job-related skill sets; relevant education or training; and other business and organizational needs. The salary range listed is for the level at which this job has been scoped. In the event that you are considered for a different level, a higher or lower pay range would apply. This position is also eligible for competitive equity ownership and a comprehensive and generous benefits package.

  • Scribd Flex (flexible work model)
  • Comprehensive health, dental, and vision coverage
  • Mental health support and disability coverage
  • Generous paid time off, including vacation, sick time, holidays, winter break, volunteer time, and sabbaticals
  • Paid parental leave and family support benefits
  • Retirement matching and employee equity
  • Learning and development programs and professional growth opportunities
  • Wellness and home office stipends
  • Complimentary access to the Scribd, Inc. suite of products
  • Enterprise access to leading AI tools

About the company

Scribd, Inc. is on a mission to advance human understanding. Our four products - Scribd, Slideshare, Everand, and Fable - help billions of people across the globe move beyond access and into insight, application, and expertise.
