Technical Architect-Datawarehousing

Tata Consultancy Services Limited
Marlborough, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$150K

Job location

Marlborough, United States of America

Tech stack

Amazon Web Services (AWS)
Data analysis
Business Logic
Azure
Cloud Computing
Code Review
Computer Programming
Continuous Integration
Information Engineering
Data Governance
ETL
Data Masking
Data Warehousing
Software Debugging
Software Design Patterns
DevOps
Dimensional Modeling
GitHub
Hive
Python
Performance Tuning
Software Construction
Software Engineering
Data Streaming
Data Storage Technologies
SQL Optimization
Spark
Infrastructure as Code (IaC)
Git
Data Lake
PySpark
Information Technology
Data Lineage
Terraform
Software Version Control
Data Pipelines
Databricks

Job description

Pipeline Development: Build data pipelines using PySpark, Spark SQL, and Delta Live Tables to ingest data from sources such as Point-of-Sale (POS), e-commerce platforms, loyalty systems, and marketing clouds.
Data Modeling and Transformation: Implement complex data transformations and business logic within the Medallion architecture (Bronze, Silver, and Gold layers). Build and optimize the final Gold customer-dimension tables that will serve as the single source of truth.
Data Quality: Implement data-quality frameworks and cleansing routines to ensure the accuracy and trustworthiness of the Customer 360 data.
Performance Optimization: Proactively monitor, debug, and tune Databricks jobs and Spark clusters for performance and cost efficiency. Implement best practices for partitioning, caching, and data layout in Delta Lake.
Infrastructure as Code (IaC) and CI/CD: Work with DevOps teams to manage Databricks environments, clusters, and job deployments using tools such as Terraform and AWS DevOps/GitHub Actions. Champion and implement CI/CD best practices for data pipelines.
Data Governance and Security: Implement data governance features within Databricks Unity Catalog, including data lineage tracking, access controls, and data masking, to ensure compliance and security.
Collaboration: Partner closely with Functional Consultants, Data Scientists, and Analytics Engineers to understand their data requirements and deliver well-structured, consumption-ready datasets.
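As an illustration of the Gold-layer consolidation described above, here is a minimal plain-Python sketch of last-write-wins deduplication and cleansing for a customer dimension. The record fields and source-system names are hypothetical; a production Databricks version would express the same logic with PySpark and a Delta MERGE rather than in-memory dictionaries.

```python
from datetime import date

# Hypothetical Silver-layer customer records from POS, e-commerce, and loyalty feeds.
silver = [
    {"customer_id": "C1", "email": " Ana@Example.com ", "source": "pos",     "updated": date(2024, 1, 5)},
    {"customer_id": "C1", "email": "ana@example.com",   "source": "loyalty", "updated": date(2024, 3, 2)},
    {"customer_id": "C2", "email": "bob@example.com",   "source": "ecom",    "updated": date(2024, 2, 9)},
]

def build_gold_customer_dim(records):
    """Cleanse and deduplicate Silver records into one Gold row per customer.

    The most recently updated record wins, mirroring a last-write-wins
    MERGE into a Delta table keyed on customer_id."""
    gold = {}
    for rec in records:
        # Simple cleansing rule: normalize email casing and whitespace.
        cleansed = {**rec, "email": rec["email"].strip().lower()}
        current = gold.get(rec["customer_id"])
        if current is None or cleansed["updated"] > current["updated"]:
            gold[rec["customer_id"]] = cleansed
    return list(gold.values())

dim = build_gold_customer_dim(silver)
```

The same pattern scales out in Spark as a window over `customer_id` ordered by `updated`, keeping the top row per key.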

Requirements

Experience: 5+ years of hands-on data engineering experience, with at least 3 years focused on the Databricks/Spark ecosystem.
Databricks Expertise: Deep, hands-on expertise with the Databricks Lakehouse Platform, including Delta Lake, Structured Streaming, Delta Live Tables, and cluster configuration/optimization.
Programming Mastery: Expert-level proficiency in Python and PySpark. Advanced SQL skills are essential.
Data Warehousing Concepts: Strong understanding of data modeling principles, including dimensional modeling (Kimball), data warehousing concepts, and ETL/ELT design patterns.
Cloud Proficiency: Proven experience with a major cloud provider (Azure, AWS, or GCP), particularly with data storage services such as S3.
Software Engineering Mindset: Experience with software engineering best practices, including version control (Git), code reviews, testing, and CI/CD.
Education: Bachelor of Computer Science.
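To illustrate the dimensional-modeling (Kimball) background the role calls for, here is a minimal plain-Python sketch of a Type 2 slowly changing dimension update: the current row is closed out and a new current row is appended, preserving history. The customer/segment fields are hypothetical; in Databricks this would typically be implemented as a Delta MERGE.

```python
from datetime import date

def scd2_apply(dim_rows, change, effective_date):
    """Apply one attribute change to a Type 2 slowly changing dimension:
    expire the matching current row and append a new current row."""
    out = []
    for row in dim_rows:
        if row["customer_id"] == change["customer_id"] and row["is_current"]:
            # Close the old version of the row as of the effective date.
            out.append({**row, "end_date": effective_date, "is_current": False})
        else:
            out.append(row)
    # Append the new current version with an open-ended validity window.
    out.append({**change, "start_date": effective_date,
                "end_date": None, "is_current": True})
    return out

dim = [{"customer_id": "C1", "segment": "bronze",
        "start_date": date(2023, 1, 1), "end_date": None, "is_current": True}]
dim = scd2_apply(dim, {"customer_id": "C1", "segment": "gold"}, date(2024, 6, 1))
```

Point-in-time queries then filter on the `start_date`/`end_date` window rather than overwriting history.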
