Data Platform Engineer
Job description
The Contractor shall deliver, including but not limited to, the following:
- Administer Databricks account and workspaces across SDLC environments; standardize
configuration, naming, and operational patterns.
- Configure and maintain clusters/compute, job compute, SQL warehouses, runtime
versions, libraries, repos, and workspace settings.
- Implement platform monitoring/alerting, operational dashboards, and health checks;
maintain runbooks and operational procedures.
- Provide Tier 2/3 operational support: troubleshoot incidents, perform root-cause analysis,
and drive remediation and preventive actions.
- Manage change control for upgrades, feature rollouts, configuration changes, and
integration changes; document impacts and rollback plans.
- Enforce least privilege across platform resources (workspaces, jobs, clusters, SQL
warehouses, repos, secrets) using role/group-based access patterns.
- Configure and manage secrets and secure credential handling (secret scopes / key
management integrations) for platform and data connectivity.
- Enable and maintain audit logging and access/event visibility; support security reviews
and evidence requests.
- Administer Unity Catalog governance: metastores, catalogs/schemas/tables, ownership,
grants, and environment/domain patterns.
- Configure and manage external locations, storage credentials, and governed access to
cloud object storage.
- Partner with governance stakeholders to support metadata/lineage integration,
classification/tagging, and retention controls where applicable.
- Coordinate secure connectivity and guardrails with cloud/network teams: private
connectivity patterns, egress controls, firewall/proxy needs.
- Configure cloud integrations required for governed data access and service connectivity
(roles/permissions, endpoints, storage integrations).
- Implement cost guardrails: cluster policies, auto-termination, scheduling, workload sizing
standards, and capacity planning.
- Produce usage/cost insights and optimization recommendations; address waste drivers
(idle compute, oversized clusters, inefficient jobs).
- Automate administration and configuration using APIs/CLI/IaC (e.g., Terraform) to
reduce manual drift and improve repeatability.
- Maintain platform documentation: configuration baselines, security/governance
standards, onboarding guides, and troubleshooting references.
- Design and implement backup and disaster recovery procedures for workspace
configurations, notebooks, Unity Catalog metadata, and job definitions; maintain
recovery runbooks and perform periodic DR testing aligned to RTO/RPO objectives.
- Monitor and optimize platform performance, including SQL warehouse query tuning,
cluster autoscaling configuration, Photon enablement, and Delta Lake optimization
guidance (OPTIMIZE, VACUUM, Z-ordering strategies).
- Administer Delta Live Tables (DLT) pipelines and coordinate with data engineering
teams on pipeline health, data quality monitoring, failed job remediation, and pipeline
configuration best practices.
- Manage third-party integrations and ecosystem connectivity, including BI tool
integrations (e.g., Power BI), and external metadata catalog integrations.
- Implement Databricks Asset Bundles (DABs) for standardized deployment patterns;
automate workspace resource deployment (jobs, pipelines, dashboards) across SDLC
environments using bundle-based CI/CD workflows.
- Conduct capacity planning and scalability analysis, including forecasting concurrent
user/workload growth, platform scaling strategies, and proactive resource allocation
during peak usage periods.
- Facilitate user onboarding and enablement, including new user/team onboarding procedures, training coordination, workspace access provisioning, and creation of self-
Requirements
- Hands-on experience administering Databricks (workspace administration,
clusters/compute policies, jobs, SQL warehouses, repos, runtime management) and
expertise using the Databricks CLI.
- Strong Unity Catalog administration: metastores; catalogs/schemas; grants; service
principals; external locations; storage credentials; governed storage access.
- Identity & Access Management proficiency: SSO concepts, SCIM provisioning,
group-based RBAC, service principals, least-privilege patterns.
- Security fundamentals: secrets management, secure connectivity, audit logging, access
monitoring, and evidence-ready operations.
- Cloud platform expertise (AWS): IAM roles/policies, object storage security patterns,
networking basics (VPC concepts), logging/monitoring integration.
- Automation skills: scripting and/or IaC using Terraform/CLI/REST APIs for repeatable
configuration and environment promotion.
- Experience implementing data governance controls (classification/tagging,
lineage/metadata integrations) in partnership with governance teams.
- CI/CD practices for jobs/notebooks/config promotion across SDLC environments.
- Understanding of lakehouse concepts (e.g., Delta, table lifecycle management, separation
of storage/compute).
- SQL proficiency and data engineering fundamentals for troubleshooting query
performance issues, understanding ETL/ELT workflow patterns, and debugging data
pipeline failures; basic Python/Scala familiarity for notebook/code issue diagnosis.
- Experience with compliance and regulatory frameworks (FedRAMP, HIPAA, SOC 2, or
similar) including implementation of data residency requirements, retention policies, and
audit-ready evidence collection.
- Hands-on experience with AWS security and networking services including PrivateLink,
Secrets Manager/Systems Manager integration, CloudWatch/CloudTrail integration, S3
bucket policies, cross-account access patterns, and KMS encryption key management.
- Experience administering Databricks serverless compute, Workspace Git integrations
(GitLab), Databricks Asset Bundles (DABs) for deployment automation, and modern
workspace features supporting DevOps workflows.
- SLA/SLO management and stakeholder communication skills; ability to define platform
service levels, produce operational reports, translate technical issues to business
stakeholders, and manage vendor relationships (Databricks account teams).
Education / Experience / Certifications / Accreditations
- Bachelor's degree in a related field or equivalent practical experience.
- 7+ years in cloud/data platform administration and operations, including 4+ years
supporting Databricks or similar platforms.
- Databricks Platform Administrator/Databricks AWS Platform Architect
- Databricks Certified Data Engineer Associate/Professional
- AWS Certified Solutions Architect Associate or Professional