Technical Architect-Datawarehousing

Tata Consultancy Services Limited
Marlborough, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$150K

Job location

Marlborough, United States of America

Tech stack

Amazon Web Services (AWS)
Data analysis
Business Logic
Azure
Cloud Computing
Code Review
Computer Programming
Continuous Integration
Information Engineering
Data Governance
ETL
Data Masking
Data Warehousing
Software Debugging
Software Design Patterns
DevOps
Dimensional Modeling
GitHub
Hive
Python
Performance Tuning
Software Construction
Software Engineering
Data Streaming
Data Storage Technologies
SQL Optimization
Spark
Infrastructure as Code (IaC)
Git
Data Lake
PySpark
Information Technology
Data Lineage
Terraform
Software Version Control
Data Pipelines
Databricks

Job description

Pipeline Development: Build data pipelines using PySpark, Spark SQL, and Delta Live Tables to ingest data from sources such as Point-of-Sale (POS), e-commerce platforms, loyalty systems, and marketing clouds.
Data Modeling and Transformation: Implement complex data transformations and business logic within the Medallion architecture (Bronze, Silver, and Gold layers). Build and optimize the final Gold customer-dimension tables that will serve as the single source of truth.
Data Quality: Implement data-quality frameworks and cleansing routines to ensure the accuracy and trustworthiness of the Customer 360 data.
Performance Optimization: Proactively monitor, debug, and tune Databricks jobs and Spark clusters for performance and cost efficiency. Implement best practices for partitioning, caching, and data layout in Delta Lake.
Infrastructure as Code (IaC) and CI/CD: Work with DevOps teams to manage Databricks environments, clusters, and job deployments using tools such as Terraform and AWS DevOps/GitHub Actions. Champion and implement CI/CD best practices for data pipelines.
Data Governance and Security: Implement data governance features within Databricks Unity Catalog, including data lineage tracking, access controls, and data masking, to ensure compliance and security.
Collaboration: Partner closely with Functional Consultants, Data Scientists, and Analytics Engineers to understand their data requirements and deliver well-structured, consumption-ready datasets.
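As an illustration of the Gold-layer consolidation described above, here is a minimal plain-Python sketch of last-write-wins deduplication and cleansing for a customer dimension. The record fields and source-system names are hypothetical; a production Databricks version would express the same logic with PySpark and a Delta MERGE rather than in-memory dictionaries.

```python
from datetime import date

# Hypothetical Silver-layer customer records from POS, e-commerce, and loyalty feeds.
silver = [
    {"customer_id": "C1", "email": " Ana@Example.com ", "source": "pos",     "updated": date(2024, 1, 5)},
    {"customer_id": "C1", "email": "ana@example.com",   "source": "loyalty", "updated": date(2024, 3, 2)},
    {"customer_id": "C2", "email": "bob@example.com",   "source": "ecom",    "updated": date(2024, 2, 9)},
]

def build_gold_customer_dim(records):
    """Cleanse and deduplicate Silver records into one Gold row per customer.

    The most recently updated record wins, mirroring a last-write-wins
    MERGE into a Delta table keyed on customer_id."""
    gold = {}
    for rec in records:
        # Simple cleansing rule: normalize email casing and whitespace.
        cleansed = {**rec, "email": rec["email"].strip().lower()}
        current = gold.get(rec["customer_id"])
        if current is None or cleansed["updated"] > current["updated"]:
            gold[rec["customer_id"]] = cleansed
    return list(gold.values())

dim = build_gold_customer_dim(silver)
```

The same pattern scales out in Spark as a window over `customer_id` ordered by `updated`, keeping the top row per key.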

Requirements

Experience: 5+ years of hands-on data engineering experience, with at least 3 years focused on the Databricks/Spark ecosystem.
Databricks Expertise: Deep, hands-on expertise with the Databricks Lakehouse Platform, including Delta Lake, Structured Streaming, Delta Live Tables, and cluster configuration/optimization.
Programming Mastery: Expert-level proficiency in Python and PySpark. Advanced SQL skills are essential.
Data Warehousing Concepts: Strong understanding of data modeling principles, including dimensional modeling (Kimball), data warehousing concepts, and ETL/ELT design patterns.
Cloud Proficiency: Proven experience with a major cloud provider (Azure, AWS, or GCP), particularly with data storage services such as S3.
Software Engineering Mindset: Experience with software engineering best practices, including version control (Git), code reviews, testing, and CI/CD.
Education: Bachelor of Computer Science.
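To illustrate the dimensional-modeling (Kimball) background the role calls for, here is a minimal plain-Python sketch of a Type 2 slowly changing dimension update: the current row is closed out and a new current row is appended, preserving history. The customer/segment fields are hypothetical; in Databricks this would typically be implemented as a Delta MERGE.

```python
from datetime import date

def scd2_apply(dim_rows, change, effective_date):
    """Apply one attribute change to a Type 2 slowly changing dimension:
    expire the matching current row and append a new current row."""
    out = []
    for row in dim_rows:
        if row["customer_id"] == change["customer_id"] and row["is_current"]:
            # Close the old version of the row as of the effective date.
            out.append({**row, "end_date": effective_date, "is_current": False})
        else:
            out.append(row)
    # Append the new current version with an open-ended validity window.
    out.append({**change, "start_date": effective_date,
                "end_date": None, "is_current": True})
    return out

dim = [{"customer_id": "C1", "segment": "bronze",
        "start_date": date(2023, 1, 1), "end_date": None, "is_current": True}]
dim = scd2_apply(dim, {"customer_id": "C1", "segment": "gold"}, date(2024, 6, 1))
```

Point-in-time queries then filter on the `start_date`/`end_date` window rather than overwriting history.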
