Data Engineer (Agent Systems)

Faraday Future

El Segundo, United States of America

1 month ago

Role details

Contract type

Internship / Graduate position

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

Chinese, English

Experience level

Junior

Compensation

$95K

Job location

El Segundo, United States of America

Tech stack

Query Performance

API

Artificial Intelligence

Airflow

Amazon Web Services (AWS)

Data analysis

Automated Storage and Retrieval Systems

Google BigQuery

Cloud Computing

Encodings

Databases

Continuous Integration

Information Engineering

Relational Databases

Linux

DevOps

Python

Knowledge-Based Systems

PostgreSQL

Metadata

MySQL

Open Source Technology

Role-Based Access Control

Redis

Software Engineering

Data Streaming

Toolchain

Usage Analysis

Parquet

Software Repository

Data Processing

Snowflake

Spark

Indexer

Backend

GIT

Kubernetes

Information Technology

Data Lineage

Apache Flink

Kafka

Free and Open-Source Software

Vertica

Virtual Agents

Data Pipelines

Docker

Job description

As a Data Engineer (Agent Systems) in the Developer Portal & Infrastructure Department, you will help build the data foundation for FF's EAI Developer Platform, Open Developer Platform, and agent-system capabilities across its robotics and vehicle businesses. You will work with AI agent, backend, platform, and security teams to support data pipelines, RAG knowledge infrastructure, developer portal analytics, platform telemetry, and data quality systems that power Developer Portal, SDK/API, Data Engine, DevOps & Toolchain, and Agent Skill Store experiences.

Build and maintain batch and streaming data pipelines for developer platform data, including documentation, SDK/API metadata, developer activity, platform telemetry, system logs, feedback signals, and business events.
Support RAG and agent knowledge systems by preparing, cleaning, chunking, embedding, indexing, and refreshing data from technical documents, API references, code repositories, and developer support content.
Design and evolve schemas, data models, and storage patterns across relational databases, analytical stores, object storage, and vector retrieval systems.
Create and maintain datasets for developer portal analytics, platform health monitoring, API usage analysis, Data Engine reporting, and open-source ecosystem insights.
Validate data correctness, freshness, latency, and retrieval quality for production agent workflows and developer-facing platform features.
Improve data quality and reliability through monitoring dashboards, validation checks, freshness alerts, lineage tracking, incident reviews, and operational runbooks.
Follow security, privacy, and compliance practices for access control, retention, auditability, sensitive data handling, and PII protection.

Requirements

Do you have experience in Schema design?

Bachelor's degree in Computer Science, Engineering, Data Science, Mathematics, Statistics, or a related technical field.
0-2 years of experience in data engineering, software engineering, platform engineering, or related areas; internships, research, open-source contributions, or strong academic projects count.
Solid programming skills in Python and strong working knowledge of SQL.
Basic experience with at least one data processing or orchestration tool, such as Kafka, Spark, Flink, Airflow, Dagster, or dbt.
Understanding of relational databases, schema design, joins, indexing, and query performance trade-offs.
Familiarity with Linux, Git, Docker, and modern software development workflows.
Interest in AI agents, RAG, developer platforms, open-source ecosystems, robotics, vehicles, or embodied intelligence.

Preferred Qualifications:

Professional working proficiency in Mandarin Chinese and English.
Exposure to analytical databases, warehouses, or storage systems such as ClickHouse, BigQuery, Snowflake, PostgreSQL, MySQL, Redis, S3, GCS, or Parquet.
Familiarity with vector retrieval and RAG infrastructure, including embedding pipelines, metadata filtering, hybrid search, reindexing, pgvector, FAISS, Milvus, Weaviate, or Pinecone.
Exposure to developer portals, SDK/API platforms, technical documentation platforms, DevOps toolchains, open-source communities, or Agent Skill Store-style ecosystems.
Familiarity with Kubernetes, cloud infrastructure, CI/CD, observability, data lineage, cost monitoring, or production incident response.
Basic understanding of privacy and compliance concepts such as GDPR, CCPA, data minimization, tokenization, pseudonymization, and role-based access control.

Benefits & conditions

401(k)
Health insurance
Vision insurance
Dental insurance
Salary: $85,000 - $95,000 DOE, plus benefits and incentive plans

Perks + Benefits

Healthcare + dental + vision benefits (Free for you/discounted for family)
401(k) options
Casual dress code + relaxed work environment
Culturally diverse, progressive atmosphere

About the company

Faraday Future (FF) is a California-based embodied artificial intelligence ecosystem company, leveraging the latest technologies and the world's best talent to realize exciting new possibilities in mobility and robotics. We're producing user-centric, technology-first vehicles and robots to establish new paradigms in human-AI interaction. We're not just seeking to change how our cars and robots work; we're seeking to change the way we drive and interact with machines. At FF, we're creating something new, something connected, and something with a true global impact.
