TELECOMMUTE Databricks Manager

SumasEdge Corporation
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Remote

Tech stack

Artificial Intelligence
Azure
Data Architecture
ETL
Data Mapping
Database Schema
Identity and Access Management
Information Lifecycle Management
Role-Based Access Control
Software Tools
Cloud Services
SQL Databases
Google Cloud Platform
Data Ingestion
Data Lake
PySpark
Low Latency
Machine Learning Operations
Data Pipelines
Databricks

Job description

  • Standardize data using industry frameworks to ensure alignment of IT-related data (infrastructure information, infrastructure capacity, security data, application runtime data, IT monitoring information, and additional metadata)

  • Support and provide best practices on data mapping

  • Establish a multi-zone / Medallion architecture to drive data and cost optimizations:

      • Bronze (raw telemetry)

      • Silver (cleaned/normalized)

      • Gold (aggregated/KPIs)
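The Bronze / Silver / Gold zones above follow the standard Medallion pattern: raw data lands untouched, is cleaned and normalized, then aggregated into KPIs. A minimal pure-Python sketch of the zone semantics follows; in Databricks these zones would be Delta tables, and the sample records, field names, and KPI (average CPU per host) here are illustrative assumptions, not part of the role description.

```python
from collections import defaultdict

# Bronze: raw telemetry exactly as ingested -- never modified (sample records are illustrative)
bronze = [
    {"host": "web-01", "cpu_pct": "87.5", "ts": "2024-01-01T00:00:00Z"},
    {"host": "web-01", "cpu_pct": "62.0", "ts": "2024-01-01T00:01:00Z"},
    {"host": "db-01",  "cpu_pct": "bad",  "ts": "2024-01-01T00:00:30Z"},  # malformed row
]

def to_silver(records):
    """Silver: cleaned/normalized -- enforce types, drop rows that fail validation."""
    out = []
    for r in records:
        try:
            out.append({"host": r["host"], "cpu_pct": float(r["cpu_pct"]), "ts": r["ts"]})
        except (KeyError, ValueError):
            continue  # quarantine malformed telemetry instead of propagating it
    return out

def to_gold(records):
    """Gold: aggregated KPIs -- here, average CPU utilization per host."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in records:
        sums[r["host"]] += r["cpu_pct"]
        counts[r["host"]] += 1
    return {host: sums[host] / counts[host] for host in sums}

silver = to_silver(bronze)   # malformed db-01 row is dropped
gold = to_gold(silver)
print(gold)                  # {'web-01': 74.75}
```

The key design point the pattern encodes: each zone is derived from the one before it, so cleaning and aggregation logic can be re-run over Bronze at any time without re-ingesting from source systems.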

  • Design for 500TB+/day ingestion scale

  • Define standards for:

      • Delta Lake usage, including Delta tables and Delta Live Tables (DLT)

      • Table optimization (Z-ordering, partitioning)

      • Data lifecycle management

      • User workflows and use cases across various areas, including lines of business and IT

  • Knowledge of various Databricks capabilities, including data engineering tools, Mosaic AI (AI/ML tools), Auto Loader, Unity Catalog, Delta tables / Delta Live Tables (DLT), the query builder, workspace / schema / table structures, LakeFlow, Genie, Databricks Workflows/Jobs, and additional Databricks components

  • Support FinOps-related activities (usage and capability cost controls), including management and optimization of compute, storage, and DBU usage

  • Support Unity Catalog buildout including IAM and RBAC

  • Provide and lead subject-matter expertise

  • Support user-related best practices, including use cases across various stakeholder roles, governance, user support, SLO/SLA development, predictive alerting, and anomaly detection

  • Support pattern development and optimization for data ingestion, including streaming, batch, and incremental loads

  • Knowledge of and expertise in various data pipeline approaches and platforms to ensure data quality, data optimization and reduction, ETL functions, data protection, and high-throughput, low-latency delivery

  • Support and provide expertise on semantic models

  • Support database schema design

Requirements

  • Experience working across different functional domains (application, infrastructure, security, compliance/audit, operations, and business)

  • Strong communication and organizational skills

  • Experience supporting delivery and management of enterprise lakehouse architecture and implementation on large-scale cloud data platforms (Databricks)

  • Experience with Databricks usage in hyperscaler environments (Azure and Google Cloud Platform)

  • Support and lead implementation of best-practice standards for SQL/PySpark development and usage

Apply for this position