Data Engineer
Job description
Position Overview: We are seeking an experienced Senior Data Engineer to design, build, and maintain enterprise-scale data platforms using modern cloud-native technologies. The ideal candidate will have deep expertise in Palantir Foundry, Databricks, and/or Snowflake, with a strong background in building scalable ETL/ELT pipelines and data warehousing solutions.

Responsibilities:
- Design and implement end-to-end data pipelines using Palantir Foundry, Databricks, and Snowflake
- Develop scalable ETL/ELT workflows using Python, PySpark, and SQL for processing large-scale datasets
- Build and maintain data lake and data warehouse architectures on cloud platforms (Azure (Data Factory, Synapse, ADLS), Databricks, Snowflake)
- Implement data governance, quality checks, and metadata management frameworks
- Create and optimize Databricks notebooks, Delta Lake tables, and Unity Catalog implementations
- Design Palantir Foundry Ontologies and transformation pipelines
- Collaborate with data scientists, analysts, and business stakeholders to deliver data products
- Optimize query performance and storage strategies in Snowflake (clustering, partitioning, materialized views)
- Implement real-time streaming pipelines using Spark Streaming or similar technologies
- Develop CI/CD pipelines for data workflows using Git, Jenkins, or similar tools
Requirements
Primary Platforms: Palantir Foundry, Databricks, Snowflake, IBM WatsonX.data
Programming: Python, PySpark, SQL, Scala
Data Processing: ETL/ELT pipelines, real-time streaming, batch processing
Cloud: Azure (Data Factory, Synapse, ADLS), Databricks, Snowflake
Tools: Airflow, dbt, Apache Spark, Delta Lake, Unity Catalog

- 3+ years of hands-on data engineering experience
- Expert-level proficiency in Python and PySpark for data processing
- Strong SQL skills with experience in complex query optimization
- Hands-on experience with at least two of: Palantir Foundry, Databricks, Snowflake
- Experience with cloud platforms (Azure) and their data services
- Knowledge of data modeling (dimensional modeling, star schema, snowflake schema)
- Experience with orchestration tools (Airflow, dbt, Luigi, or similar)
- Strong understanding of data governance, security, and compliance requirements

Preferred Qualifications:
- Experience with IBM WatsonX.data or similar AI data platforms
- Knowledge of Delta Lake, Iceberg, or similar lakehouse formats
- Experience with GenAI, RAG architectures, or LangChain
- Familiarity with MCP (Model Context Protocol) for AI agent integrations