TELECOMMUTE Databricks Manager

SumasEdge Corporation
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Remote

Tech stack

Artificial Intelligence
Azure
Data Architecture
ETL
Data Mapping
Database Schema
Identity and Access Management
Information Lifecycle Management
Role-Based Access Control
Software Tools
Cloud Services
SQL Databases
Google Cloud Platform
Data Ingestion
Data Lake
PySpark
Low Latency
Machine Learning Operations
Data Pipelines
Databricks

Job description

  • Standardize data using industry frameworks to ensure alignment of IT-related data (infrastructure information, infrastructure capacity, security data, application runtime data, IT monitoring information, and additional metadata)

  • Support and provide best practices on data mapping

  • Establish a multi-zone / Medallion architecture to drive data and cost optimizations:

      • Bronze (raw telemetry)

      • Silver (cleaned/normalized)

      • Gold (aggregated/KPIs)
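The Bronze / Silver / Gold zones above follow the standard Medallion pattern: raw data lands untouched, is cleaned and normalized, then aggregated into KPIs. A minimal pure-Python sketch of the zone semantics follows; in Databricks these zones would be Delta tables, and the sample records, field names, and KPI (average CPU per host) here are illustrative assumptions, not part of the role description.

```python
from collections import defaultdict

# Bronze: raw telemetry exactly as ingested -- never modified (sample records are illustrative)
bronze = [
    {"host": "web-01", "cpu_pct": "87.5", "ts": "2024-01-01T00:00:00Z"},
    {"host": "web-01", "cpu_pct": "62.0", "ts": "2024-01-01T00:01:00Z"},
    {"host": "db-01",  "cpu_pct": "bad",  "ts": "2024-01-01T00:00:30Z"},  # malformed row
]

def to_silver(records):
    """Silver: cleaned/normalized -- enforce types, drop rows that fail validation."""
    out = []
    for r in records:
        try:
            out.append({"host": r["host"], "cpu_pct": float(r["cpu_pct"]), "ts": r["ts"]})
        except (KeyError, ValueError):
            continue  # quarantine malformed telemetry instead of propagating it
    return out

def to_gold(records):
    """Gold: aggregated KPIs -- here, average CPU utilization per host."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in records:
        sums[r["host"]] += r["cpu_pct"]
        counts[r["host"]] += 1
    return {host: sums[host] / counts[host] for host in sums}

silver = to_silver(bronze)   # malformed db-01 row is dropped
gold = to_gold(silver)
print(gold)                  # {'web-01': 74.75}
```

The key design point the pattern encodes: each zone is derived from the one before it, so cleaning and aggregation logic can be re-run over Bronze at any time without re-ingesting from source systems.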

  • Design for 500TB+/day ingestion scale

  • Define standards for:

      • Delta Lake usage, including Delta tables and Delta Live Tables (DLT)

      • Table optimization (Z-ordering, partitioning)

      • Data lifecycle management

      • User workflows and use cases across various areas, including lines of business and IT

  • Knowledge of various Databricks capabilities, including data engineering tools, Mosaic AI (AI/ML tools), Auto Loader, Unity Catalog, Delta tables / Delta Live Tables (DLT), the query builder, workspace / schema / table structures, LakeFlow, Genie, Databricks Workflows/Jobs, and additional Databricks components

  • Support FinOps-related activities (usage and capability cost controls), including management and optimization of compute, storage, and DBU usage

  • Support Unity Catalog buildout including IAM and RBAC

  • Provide and lead subject-matter expertise

  • Support user-related best practices, including use cases across various stakeholder roles, governance, user support, SLO/SLA development, predictive alerting, and anomaly detection

  • Support pattern development and optimization for data ingestion, including streaming, batch, and incremental loads

  • Knowledge of and expertise in various data pipeline approaches and platforms to ensure data quality, data optimization and reduction, ETL functions, data protection, and high-throughput, low-latency delivery

  • Support and provide expertise on semantic models

  • Support database schema design

Requirements

  • Experience working across different functional domains (application, infrastructure, security, compliance/audit, operations, and business)

  • Strong communication and organizational skills

  • Experience supporting delivery and management of enterprise lakehouse architecture and implementation on large-scale cloud data platforms (Databricks)

  • Experience with Databricks usage in hyperscaler environments (Azure and Google Cloud Platform)

  • Support and lead implementation of best-practice standards for SQL/PySpark development and usage

Apply for this position