Data Engineer (Agent Systems)

Faraday Future
El Segundo, United States of America
3 days ago

Role details

Contract type
Internship / Graduate position
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
Chinese, English
Experience level
Junior
Compensation
$95K

Job location

El Segundo, United States of America

Tech stack

Query Performance
API
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Data analysis
Automated Storage and Retrieval Systems
Google BigQuery
Cloud Computing
Encodings
Databases
Continuous Integration
Information Engineering
Relational Databases
Linux
DevOps
Python
Knowledge-Based Systems
PostgreSQL
Metadata
MySQL
Open Source Technology
Role-Based Access Control
Redis
Software Engineering
Data Streaming
Toolchain
Usage Analysis
Parquet
Software Repository
Data Processing
Snowflake
Spark
Indexer
Backend
Git
Kubernetes
Information Technology
Data Lineage
Apache Flink
Kafka
Free and Open-Source Software
Vertica
Virtual Agents
Data Pipelines
Docker

Job description

As a Data Engineer (Agent Systems) in the Developer Portal & Infrastructure Department, you will help build the data foundation for FF's EAI Developer Platform, Open Developer Platform, and agent-system capabilities across its robotics and vehicle businesses. You will work with AI agent, backend, platform, and security teams to support data pipelines, RAG knowledge infrastructure, developer portal analytics, platform telemetry, and data quality systems that power Developer Portal, SDK/API, Data Engine, DevOps & Toolchain, and Agent Skill Store experiences.

  • Build and maintain batch and streaming data pipelines for developer platform data, including documentation, SDK/API metadata, developer activity, platform telemetry, system logs, feedback signals, and business events.
  • Support RAG and agent knowledge systems by preparing, cleaning, chunking, embedding, indexing, and refreshing data from technical documents, API references, code repositories, and developer support content.
  • Design and evolve schemas, data models, and storage patterns across relational databases, analytical stores, object storage, and vector retrieval systems.
  • Create and maintain datasets for developer portal analytics, platform health monitoring, API usage analysis, Data Engine reporting, and open-source ecosystem insights.
  • Validate data correctness, freshness, latency, and retrieval quality for production agent workflows and developer-facing platform features.
  • Improve data quality and reliability through monitoring dashboards, validation checks, freshness alerts, lineage tracking, incident reviews, and operational runbooks.
  • Follow security, privacy, and compliance practices for access control, retention, auditability, sensitive data handling, and PII protection.

Requirements

  • Bachelor's degree in Computer Science, Engineering, Data Science, Mathematics, Statistics, or a related technical field.
  • 0-2 years of experience in data engineering, software engineering, platform engineering, or related areas; internships, research, open-source contributions, or strong academic projects count.
  • Solid programming skills in Python and strong working knowledge of SQL.
  • Basic experience with at least one data processing or orchestration tool, such as Kafka, Spark, Flink, Airflow, Dagster, or dbt.
  • Understanding of relational databases, schema design, joins, indexing, and query performance trade-offs.
  • Familiarity with Linux, Git, Docker, and modern software development workflows.
  • Interest in AI agents, RAG, developer platforms, open-source ecosystems, robotics, vehicles, or embodied intelligence.

Preferred Qualifications:

  • Professional working proficiency in Mandarin Chinese and English.
  • Exposure to analytical databases, warehouses, or storage systems such as ClickHouse, BigQuery, Snowflake, PostgreSQL, MySQL, Redis, S3, GCS, or Parquet.
  • Familiarity with vector retrieval and RAG infrastructure, including embedding pipelines, metadata filtering, hybrid search, reindexing, pgvector, FAISS, Milvus, Weaviate, or Pinecone.
  • Exposure to developer portals, SDK/API platforms, technical documentation platforms, DevOps toolchains, open-source communities, or Agent Skill Store-style ecosystems.
  • Familiarity with Kubernetes, cloud infrastructure, CI/CD, observability, data lineage, cost monitoring, or production incident response.
  • Basic understanding of privacy and compliance concepts such as GDPR, CCPA, data minimization, tokenization, pseudonymization, and role-based access control.

Benefits & conditions

  • 401(k)
  • Health insurance
  • Vision insurance
  • Dental insurance
  • Salary: $85,000 - $95,000 DOE, plus benefits and incentive plans

Perks + Benefits

  • Healthcare + dental + vision benefits (free for you, discounted for family)
  • 401(k) options
  • Casual dress code + relaxed work environment
  • Culturally diverse, progressive atmosphere

About the company

Faraday Future (FF) is a California-based embodied artificial intelligence ecosystem company, leveraging the latest technologies and the world's best talent to realize exciting new possibilities in mobility and robotics. We're producing user-centric, technology-first vehicles and robots to establish new paradigms in human-AI interaction. We're not just seeking to change how our cars and robots work; we're seeking to change the way we drive and interact with machines. At FF, we're creating something new, something connected, and something with a true global impact.
