Data Engineer
Job description
We are essentially looking for someone who has been using Databricks as a data developer: 5 years in Databricks, PySpark, Snowflake, Python, Spark (tuning and performance management), SQL, and APIs, plus exposure to AWS.
Interviews: 2-3 rounds, virtual (video): 1 prescreen (30 minutes with the lead), live coding, whiteboarding, and technical/behavioral questions.
This isn't the usual data analytics engineering role; it is solely focused on Databricks development.
Data modeling in Databricks: Databricks Lakehouse architecture and how the modeling fits into place.
Medallion Architecture: Bronze + Silver with little deviation.
APIs from social media: TikTok, Facebook, Instagram, etc.
Contribute to maintaining, updating, and expanding existing Core Data platform data pipelines.
Build tools and services to support data discovery, lineage, governance, and privacy.
Collaborate with other software/data engineers and cross-functional teams.
Tech stack includes Airflow, Spark, Databricks, Delta Lake, Kubernetes, and AWS.
Collaborate with product managers, architects, and other engineers to drive the success of the Core Data platform.
Contribute to developing and documenting both internal and external standards and best practices for pipeline configurations, naming conventions, and more.
Ensure high operational efficiency and quality of the Core Data platform datasets so that our solutions meet SLAs and project reliability and accuracy to all our stakeholders (Engineering, Data Science, Operations, and Analytics teams).
Be an active participant in and advocate of agile/scrum ceremonies to collaborate and improve processes for our team.
Engage with and understand our customers, forming relationships that allow us to understand and prioritize both innovative new offerings and incremental platform improvements.
Maintain detailed documentation of your work and changes to support data quality and data governance requirements.
Requirements
Marketing experience would be helpful but is not required, as long as the candidate has strong experience with APIs.
Education: Bachelor's degree in Science, Engineering (required).
7+ years of data engineering experience developing large data pipelines.
Strong SQL skills and ability to create queries to analyze complex datasets.
Hands-on production environment experience with distributed processing systems such as Spark.
Experience with Databricks.
Experience with Snowflake a plus.
Deep understanding of AWS or other cloud providers, as well as infrastructure as code.
Familiarity with data modeling techniques and data warehousing standard methodologies and practices.
Strong algorithmic problem-solving expertise.
Excellent written and verbal communication.
Advanced understanding of OLTP vs. OLAP environments.
Willingness and ability to learn and pick up new skill sets.
Self-starting problem solver with an eye for detail and excellent analytical and communication skills.
Strong background in at least one of the following: distributed data proc