Data Engineer & Analytics Developer
New York, Inc.
yesterday
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Remote
Tech stack
Artificial Intelligence
Airflow
Big Data
Google BigQuery
Cloud Computing
Continuous Integration
Data Architecture
Information Engineering
Data Infrastructure
Data Transformation
Data Vault Modeling
Data Warehousing
Dimensional Modeling
Python
Modular Design
Performance Tuning
Query Optimization
SQL Databases
Tableau
Jupyter Notebook
Google Cloud Platform
Cloud Platform System
SQL Optimization
GitLab
Git
Data Layers
Star Schema
Machine Learning Operations
Terraform
Software Version Control
Data Pipelines
API Management
Requirements
- Hands-on experience with SQL, Python, Google Cloud Platform, BigQuery, Cloud Composer, GCS, Cloud Functions, and Jupyter notebooks: not just listed as keywords, but demonstrated through pipeline builds, dataset design, or architecture decisions in prior roles.
- Evidence of scalable data architecture thinking: look for candidates who talk about reusable models, layered architectures (medallion, star schema), and consolidation rather than listing dozens of one-off projects.
- Tableau dashboard development paired with data modeling: candidates who have built both the data layer and the visualization on top of it, not just one or the other.

We are seeking a Data Engineer with strong analytics capabilities who can own the full data lifecycle, from scalable pipeline development to polished Tableau dashboards. The ideal candidate thinks architecturally, designs datasets for reuse and longevity, and resists the urge to create one-off tables for every new request. They bring a builder's mindset grounded in efficiency, modularity, and long-term sustainability of the data platform. They must be highly proficient in both Python and SQL as their primary working languages, and in Tableau for reporting.
Core Competencies
Data Engineering and Pipeline Development
- Deep, hands-on experience with Google BigQuery including dataset design, partitioning/clustering strategies, materialized views, and cost-optimization techniques.
- Proficiency in Cloud Composer (Apache Airflow) for orchestrating complex, production-grade data pipelines with proper scheduling, retry logic, and dependency management.
- Experience building and maintaining Vertex AI Pipelines for ML workflows and data transformation at scale.
- Advanced SQL skills: able to write complex, performant, and maintainable queries across large datasets, including window functions, CTEs, recursive queries, and query optimization.
- Strong Python proficiency: comfortable building data transformation scripts, pipeline logic, custom Airflow operators, API integrations, and automation tooling.
Data Architecture and Scalable Design
- Proven ability to design layered data architectures using patterns such as Medallion (bronze/silver/gold), Dimensional Modeling (star schema), Data Vault, and targeted denormalization, and knows when to apply each based on the use case.
- Track record of building modular, multi-purpose datasets rather than project-specific tables; thinks in terms of canonical models and shared dimensions.
- Understands when to create new tables versus when to extend, view, or restructure existing assets to avoid unnecessary duplication and table sprawl.
- Applies best practices around naming conventions, schema organization, documentation, and lifecycle management so that the architecture remains navigable as it scales.
Tableau Dashboard Development
- Hands-on experience building production-quality Tableau dashboards, from data source configuration and extract optimization to interactive visual design.
- Ability to translate business questions into clear, intuitive visualizations that non-technical stakeholders can use on a self-service basis.
- Familiarity with Tableau performance tuning, published data sources, and server/cloud publishing workflows.
- Understands the relationship between upstream data modeling decisions and downstream dashboard performance; designs the data layer with the visualization in mind.
Technical Stack
- Cloud Platform: Google Cloud Platform (GCP)
- Data Warehouse: BigQuery (advanced)
- Orchestration: Cloud Composer / Apache Airflow
- ML Pipelines: Vertex AI Pipelines
- Visualization: Tableau (Desktop, Server/Cloud)
- Languages: Python (advanced), SQL (advanced)
- Infrastructure: Terraform (preferred), GCS, Cloud Functions
- Version Control: Git / GitLab

- 5+ years in a data engineering role, with meaningful Google Cloud Platform/BigQuery experience.
- Advanced proficiency in Python and SQL as daily working languages.
- Demonstrated experience designing and maintaining shared, reusable data models in an enterprise or multi-team environment.
- Familiarity with data architecture patterns including Medallion, star schema, and Data Vault.
- Portfolio or examples of Tableau dashboards built on well-structured data layers.
- Familiarity with CI/CD practices for data pipelines and infrastructure-as-code concepts.
- Strong communicator who can work with cross-functional teams to gather requirements and translate them into scalable data solutions.
About the company
* Architecture-first thinking: Before writing a single line of code, they ask: "Does this already exist? Can I extend what's here? Will this serve more than just today's ask?"
* Efficiency over volume: Measures success not by how many tables or pipelines they create, but by how few they need to support a growing number of use cases.
* End-to-end ownership: Comfortable moving from raw ingestion all the way through to a polished Tableau dashboard, understanding how each layer impacts the next.
* Pragmatic scalability: Designs for the future without over-engineering for the present; builds foundations that can absorb new projects without architectural rework.