AI Researcher (AI-Oriented Knowledge Systems)
GenScript Corporation
Piscataway, United States of America
23 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Intermediate
Compensation: $145K
Job location: Piscataway, United States of America
Tech stack
Artificial Intelligence
Big Data
Encodings
Computer Programming
Databases
Data Cleansing
Data Deduplication
Data Governance
Data Integration
Text Processing
Graph Database
Information Management
JSON
Python
Knowledge-Based Systems
Neo4j
NLTK
Web Ontology Language
Resource Description Framework (RDF)
Search Technologies
Software Deployment
Solution Deployment Descriptor
Web Pages
XML
Data Processing
Retrieval-Augmented Generation
Delivery Pipeline
Large Language Models
Spark
Knowledge Representation
Question Answering
Information Technology
Apache Flink
HuggingFace
Dask
Search Engines
spaCy
Document Classification
Software Version Control
Service Stack
Job description
Core Research Directions: The researcher will be responsible for one or two of the following areas:
Knowledge Extraction & Structuring
- Research techniques for extracting structured knowledge from multi-source heterogeneous data (documents, web pages, databases, conversation logs)
- Design automated pipelines for entity recognition, relation extraction, and event detection
- Develop knowledge quality assessment and cleaning mechanisms to filter noise and conflicting information
- Explore LLM-assisted knowledge extraction methods, balancing automation efficiency with manual validation costs
- Research incremental knowledge extraction strategies to support continuous knowledge base updates and expansion
Knowledge Organization & Representation
- Design knowledge graph schemas and ontologies to build structured frameworks for domain knowledge
- Research knowledge embedding techniques to fuse symbolic knowledge with vector-space representations
- Develop multi-level knowledge representation systems supporting coarse-to-fine granularity knowledge navigation
- Explore knowledge fusion and alignment techniques for entity disambiguation and conflict resolution across multi-source knowledge
- Research knowledge version management and provenance mechanisms to ensure knowledge traceability
Knowledge Retrieval & Augmentation
- Optimize RAG (Retrieval-Augmented Generation) systems to improve retrieval accuracy and answer quality
- Research hybrid retrieval strategies combining vector search, keyword search, graph traversal, and other approaches
- Develop retrieval re-ranking algorithms to enhance Top-K result relevance
- Design retrieval-generation collaborative optimization mechanisms to reduce hallucinations and erroneous citations
- Explore retrieval feedback learning to continuously optimize retrieval strategies based on user behavior
Knowledge Reasoning & Question Answering
- Research knowledge graph-based reasoning techniques supporting multi-hop reasoning, logical reasoning, and causal reasoning
- Develop complex QA systems supporting multi-condition and multi-step question answering
- Explore fusion methods combining LLMs with symbolic reasoning, leveraging advantages of both neural and symbolic approaches
- Design interpretability frameworks for reasoning processes, supporting answer provenance and reasoning chain visualization
- Research knowledge gap detection and active learning mechanisms to identify coverage blind spots in the knowledge base
Knowledge Update & Maintenance
- Design knowledge timeliness management mechanisms supporting knowledge expiration detection and automatic updates
- Research knowledge conflict detection and resolution strategies for handling contradictory information during fusion
- Develop knowledge base health monitoring systems tracking coverage, accuracy, freshness, and other metrics
- Explore human-feedback-driven knowledge iteration mechanisms
- Research knowledge compression and summarization techniques to optimize storage efficiency and retrieval performance
Requirements
- Master's degree or above in Computer Science, Artificial Intelligence, Information Management, or related fields
- 3+ years of AI-related research or development experience with hands-on experience in knowledge graphs, RAG, or QA systems
- Publications in top-tier conferences (ACL, EMNLP, SIGIR, WWW, NeurIPS, etc.) preferred
Technical Skills
Programming & Engineering
- Proficient in Python with expertise in data processing and large-scale text processing techniques
- Familiar with mainstream NLP frameworks (spaCy, NLTK, HuggingFace Transformers, etc.)
- Experience with graph databases (Neo4j, NebulaGraph, JanusGraph, etc.)
- Familiar with vector databases (Milvus, Chroma, Weaviate, FAISS, etc.)
AI Expertise
- Deep understanding of core NLP technologies: entity recognition, relation extraction, text classification, semantic similarity
- Familiar with the full lifecycle of knowledge graph construction and application
- Proficient in the RAG technology stack with hands-on experience in retrieval optimization, re-ranking, and answer generation
- Prior experience in vertical domain knowledge system construction and knowledge-driven LLM application deployment (e.g., healthcare, legal, finance, technology) preferred
- Familiar with multi-modal knowledge processing (text + image + table + structured data)
Data Processing
- Familiar with large-scale data processing technologies (Spark, Flink, Dask, etc.)
- Experience in data governance such as data cleaning, deduplication, and standardization preferred
- Familiar with common data formats and protocols (JSON, XML, RDF, OWL, etc.)
Research Capabilities
- Ability to conduct independent technical research, owning the full process from problem definition to solution deployment
- Strong literature review and summarization skills with ability to quickly absorb cutting-edge research findings
- Experimental design and evaluation capabilities, able to design proper comparative and ablation studies
Soft Skills
- Strong interest in the intersection of knowledge engineering and AI, and a habit of keeping up with the latest developments in the field
- Excellent communication and collaboration skills, able to work efficiently with engineering and product teams
- Systems thinking ability to approach knowledge system design from an overall architecture perspective
About the company
GenScript Biotech Corporation (Stock Code: 1548.HK) is a global biotechnology group. Founded in 2002, GenScript has an established global presence across North America, Europe, Greater China, and Asia Pacific. GenScript's businesses encompass four major categories based on its leading gene synthesis technology: operation as a life science CRO, enzyme and synthetic biology products, biologics development and manufacturing, and cell therapy. GenScript is committed to its vision of being the most reliable biotech company in the world, making humans and nature healthier through biotechnology.
GenScript Biotech is a global technology and service provider for life science R&D and manufacturing. Built upon its solid DNA synthesis technology, GenScript Biotech comprises four major business units: a life-science services and products business unit, a biologics contract development and manufacturing organization (CDMO) business unit, an industrial synthetic products business unit, and an integrated global cell therapy company.
GenScript Biotech was founded in New Jersey, USA in 2002 and listed on the Hong Kong Stock Exchange in 2015. The company's business operations span over 100 countries and regions worldwide, with legal entities located in the U.S., China, Japan, Singapore, the Netherlands, Ireland, the United Kingdom, Korea, Belgium, and Spain. GenScript Biotech provides premium, convenient, and reliable services and products for over 200,000 customers.
As of June 30, 2024, GenScript Biotech had more than 6,900 employees globally, and 103,600 peer-reviewed journal articles worldwide had cited GenScript Biotech's services and products. In addition, GenScript Biotech owns a number of intellectual property rights, including over 300 patents, over 900 patent applications, and a substantial body of proprietary know-how.
Driven by its corporate mission to "make people and nature healthier through biotechnology", GenScript Biotech strives to become the most trustworthy biotech company in the world.