Data Scientist
Job description
The Associated Press' Metadata and Data Science Team seeks a Data Scientist based in New York, NY.
Why this role matters:
The Data Scientist will design and implement data science and applied machine learning solutions supporting new product development, on-platform search and discovery, content enrichment, and metadata generation. As a member of cross-functional project teams, the Data Scientist will perform data analysis, evaluate commercial and open-source models, and deliver solutions with real-world impact.
The team works closely with various departments and functions across the organization to design, implement and manage end-to-end content metadata, to maintain the integrity of schema standards, and to build solutions with data, analytics and machine learning methods.
What you will do:
-
Evaluate, fine-tune, and maintain statistical and machine learning models used in run-time production environments, measuring and communicating performance improvements to stakeholders
-
Partner with cross-functional teams to design and optimize AI/ML solutions that deliver new product capabilities and internal workflow improvements, using news articles, photos, videos, election results, and other news data
-
Research, evaluate, and recommend models and methodologies across the AI/ML landscape, presenting recommended solutions to technical and non-technical stakeholders
-
Identify and address gaps in model quality and performance metrics, synthesizing findings into clear, actionable recommendations
-
Contribute to the design and enhancement of data and ML pipelines, including multimodal embedding generation and knowledge extraction, with a focus on accuracy, efficiency, and scalability
-
Design user-centered solutions and search algorithms focused on quality and performance
-
Stay current with emerging technologies and advances in NLP, machine learning, and data science, proactively surfacing opportunities for improvement
-
Support the full model development lifecycle, from problem definition and prototyping through integration, deployment, monitoring, and iteration
-
Communicate analysis and present findings clearly, adapting to a range of technical and business audiences
Requirements
-
3+ years of relevant data science experience, with strong proficiency in Python (including NumPy and Pandas) and in working with large-scale semi-structured JSON data
-
Bachelor's degree in Data Science or Computer Science
-
Experience applying core machine learning methods, including classification, clustering, regression, and ranking
-
Hands-on experience with NLP techniques such as entity recognition, disambiguation, semantic similarity, and embedding-based retrieval
-
Experience with transformer models for structured extraction, classification, summarization, and generation
-
Experience with hybrid search algorithms, retrieval pipelines, intent detection, query expansion, and relevance tuning in Elasticsearch or OpenSearch
-
Experience working with both language and multimodal models
-
Experience and comfort working with real-world data, including text and visuals, at scale
-
Familiarity with ML engineering and MLOps practices, with a track record of delivering runtime solutions
-
Familiarity with cohort analysis, session segmentation, A/B testing, and confidence calibration
-
Analytical and curious, with strong problem-solving skills and a practical focus on high-impact, cost-aware solutions
-
Able to effectively manage multiple project deliverables simultaneously
-
Comfortable being accountable for deliverables across the full product development lifecycle, from problem definition through launch and iteration
-
An effective communicator who can tailor analysis and presentations to both technical and non-technical audiences
-
Collaborative and empathetic, with a genuine focus on user impact and a desire to grow data literacy across the organization
-
Advanced-level professional competency in written and spoken English
-
Authorization to work in the United States for any employer
What will set you apart:
-
Experience in news media or working with news as data strongly preferred
-
Master's degree in Data Science or a related field
-
Familiarity with graph data models and designing entity-relationship schemas
-
Eagerness to learn the technical nuances of large-scale media operations and identify opportunities within evolving systems
Benefits & conditions
New York, NY 10281. Hybrid work. $116,000 - $160,000 a year.
The anticipated salary range for this position is $116,000 - $160,000, based on a candidate's skills, qualifications, and location. The Associated Press offers comprehensive benefits, which include:
-
Competitive medical, dental and vision coverage
-
Retirement benefits
-
Company paid life insurance
-
Paid vacation and sick days
-
Paid parental leave for any new parent
-
Mental well-being resources