Lead Data Engineer
Job description
The Lead Data Engineer plays a critical role in big data development within the data analytics engineering organization of MetLife Data & Analytics. This position is responsible for the architecture and design of data and analytics solutions, and for building ETL, data warehousing, and reusable components using cutting-edge big data and cloud technologies. The role is based in Cary, NC; Tampa, FL; Wilmington, DE; Bridgewater, NJ; or New York, NY, and supports MetLife's commitment to data-driven decision-making and operational excellence.
- Design and execute a large-scale data migration initiative, with a focus on defining the migration architecture, ensuring data quality, and reconciling data from various sources for analytics consumption and reporting.
- Ingest large volumes of data from various platforms for analytics needs and write high-performance, scalable, reliable, and maintainable pipelines in Azure Databricks, Azure Data Factory, and related services.
- Develop reusable frameworks to manage complex data transformations, validations, and data reconciliations.
- Develop quality code with well-considered performance optimizations built in from the development stage.
- Show an appetite for learning new technologies and a readiness to work with cutting-edge cloud technologies.
- Work with teams spread across the globe to drive project delivery, and recommend development and performance improvements.
- Extensive experience with various database types and the knowledge to choose the right one for the need.
- Strong understanding of data tools and the ability to use them to understand the data and generate insights.
- Hands-on experience in designing and building at-scale data lakes, data warehouses, and data stores for analytics consumption on cloud platforms (real-time as well as batch use cases).
- Utilize cloud technologies (preferably Azure Databricks) to enable PaaS-centric enterprise solutions.
- Implement solutions that support dynamic scaling, including throttling and bursting for high-volume data workloads.
- Establish and evangelize modern software development practices, including CI/CD, automated testing, and code quality standards.
- Develop and support an API catalog for data services, ensuring standardization and security.
- Optimize reusable frameworks and Spark jobs for performance and cost efficiency in large-scale environments.
- Interact with business analysts and functional analysts to gather requirements and implement ETL solutions.
Requirements
- Bachelor's or master's degree in information technology, computer science, or a relevant domain.
- Microsoft Azure Certifications and/or Databricks certifications.
- 10+ years of solutions development and delivery experience with 6+ years of recent experience in data engineering.
- Strong analytical skills related to working with unstructured datasets.
- Experience with data architecture, both traditional (for example, SQL Server) and modern (for example, Azure), and knowledge of data architecture patterns.
- Strong experience in Azure Databricks, including Spark, Delta Lake, and data stores for analytics consumption (real-time as well as batch use cases).
- Ability to interact with business analysts and functional analysts to gather requirements and implement ELT solutions.
- Proficiency and extensive experience with Spark/Scala/Python and performance tuning.
- Experience with APIs and web services for data exchange.
- Hands-on expertise in building and implementing data ingestion, curation, and data integration processes using cloud data tools such as Azure Databricks, SQL, Azure Data Factory, Spark (Scala/Python), Delta Lake, etc.
- Performance tuning on Azure Databricks and on dedicated and serverless SQL pools, plus optimization of API loading and consumption.
- Strong problem-solving skills and excellent written and verbal communication skills.
Preferred:
- Experience using data reconciliation frameworks and data migration automation tools.
- Good scripting experience, primarily in shell/Bash/PowerShell.
- Experience in large-scale ERP transformation programs.
- Code versioning experience using Azure DevOps. Working knowledge of Azure DevOps pipelines.
- Prior experience leveraging AI and ML capabilities to automate and optimize complex workflows, with intelligent use of low-code or no-code tools.