Senior Data Engineer

Royal Caribbean International
Miramar, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Miramar, United States of America

Tech stack

Java
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Data analysis
Test Automation
Azure
Big Data
Google BigQuery
Code Review
Databases
Continuous Integration
Couchbase
Data Architecture
Data Cleansing
Data Integration
Data Integrity
ETL
Data Transformation
Data Security
Data Structures
Data Virtualization
Data Warehousing
Relational Databases
Software Design Documents
Software Design Patterns
DevOps
Distributed Computing Environment
System Monitoring
Python
Metadata
Metadata Management
Microsoft SQL Server
MySQL
NoSQL
Oracle Applications
Performance Tuning
Cloud Services
Standard SQL
Scala
PL/SQL
SQL Databases
Talend
T-SQL
Freeform SQL
Data Ingestion
GitHub Copilot
Snowflake
Spark
Generative AI
Data Lake
Information Technology
Kafka
Data Management
Database Replication
Machine Learning Operations
Data Pipelines
Redshift
Databricks

Job description

The Senior Data Engineer is responsible for delivering, managing, and operating scalable, trusted data products and platforms that enable analytics, AI/ML, and Generative AI use cases. This role leads the curation of datasets and data pipelines created by business departments, data scientists, and other technology teams, and uses modern tools, techniques, and architectures to automate the most common, repeatable, and tedious data preparation and integration tasks, minimizing manual, error-prone processes and improving productivity. The Senior Data Engineer develops and improves standards and procedures to support quality development, testing, and production support, and acts as an innovation catalyst: rapidly prototyping new approaches (e.g., automation, metadata-driven pipelines, and AI-enabled data experiences) and turning the best ideas into production-grade capabilities.

  • Designs and develops durable, flexible, and scalable data pipelines, data load processes, and frameworks to automate the ingestion, processing, and delivery of both structured and unstructured batch and real-time streaming data (a minimal orchestration sketch follows this list).
  • Develop reusable data products and curated datasets aligned to enterprise domains.
  • Implement modern ELT and distributed data processing patterns.
  • Conduct performance tuning of ETL processes for large volumes of data; develop and oversee monitoring systems to ensure data loads complete on schedule and data is accurate.
  • Performs data analysis to troubleshoot and help resolve data issues.
  • Identifies ways to improve data reliability, efficiency and quality.
  • Creates and maintains technical design documentation.
  • Assists with requirements gathering.
  • Enable AI/ML and GenAI: Deliver governed training/inference datasets and feature foundations; partner with ML/AI engineers on data access patterns that support ML pipelines and production ML deployments.
  • Identify opportunities to simplify architectures, automate manual processes, improve developer experience, and evaluate new tools/techniques through controlled prototypes.
  • Participates in planning, applies design patterns, and performs code reviews.
  • Follows standards, processes, and methodologies to develop each phase of data architecture (e.g., data manipulation processes, database technology generation processes).
  • Mentors junior engineers, raises the bar on best practices, provides guidance, and leads technical initiatives across teams.
  • Helps resolve issues regarding the implementation of data architecture components.
  • Applies DevOps principles to data pipelines to improve cost, communication, integration, reuse, and automation.
  • Responsible for production support, including analyzing root cause and developing fixes to restore ETL and data operational readiness, planning and coordinating maintenance, conducting audits and validating jobs and data.
  • Position requires on-call and off-hours support.
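
As an illustration of the metadata-driven pipeline and orchestration work described above, the sketch below uses Airflow's TaskFlow API with dynamic task mapping. The table names and the ingest/validation steps are hypothetical placeholders, not a prescribed implementation for this role.

```python
# Minimal, illustrative metadata-driven ingestion DAG (Airflow TaskFlow API).
# Table names and load logic are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def metadata_driven_ingestion():
    @task
    def list_tables() -> list[str]:
        # In practice, read the table list from a metadata store or catalog.
        return ["bookings", "itineraries", "guest_profiles"]

    @task
    def ingest(table: str) -> str:
        # Placeholder for the actual extract/load step (e.g., object store -> warehouse).
        print(f"Ingesting {table}")
        return table

    @task
    def validate(tables: list[str]) -> None:
        # Placeholder data-quality check across all ingested tables.
        for table in tables:
            print(f"Validated {table}")

    # One ingest task is mapped per table discovered in the metadata step.
    validate(ingest.expand(table=list_tables()))


metadata_driven_ingestion()
```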

Requirements

  • Bachelor's or Master's degree in Engineering, Computer Science, Information Technology, or equivalent
  • 6+ years of experience in Data Warehouse design and data modeling patterns (relational and dimensional)
  • 6+ years of experience developing with ETL tools such as Talend or Azure Data Factory (ADF)
  • Must have strong analytical skills for effective problem solving
  • Ability to work independently, handle multiple tasks simultaneously, and adapt quickly to change while working with a variety of people and work styles.
  • Must be capable of articulating technical concepts concisely to non-technical audiences.
  • Hands-on experience with at least one major cloud (AWS/Azure/GCP) and one warehouse/lakehouse technology (e.g., Snowflake, BigQuery, Redshift, Databricks/Lakehouse)
  • Strong proficiency in Python and/or Java/Scala; ability to build maintainable services and libraries
  • Experience with GitHub Copilot and Databricks Assistant a plus
  • Experience building or operating streaming pipelines using Kafka, Kinesis, or Pub/Sub
  • Experience with Spark (or equivalent) and a workflow orchestrator (e.g., Airflow), plus familiarity with CI/CD and automated testing (a minimal Spark sketch follows this list)
  • Experience partnering with data science/ML teams, supplying training-ready datasets/features, and designing data products that support ML in production
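
As a concrete example of the Spark and warehouse/lakehouse experience listed above, the following minimal PySpark sketch shows a simple batch cleanse-and-curate step. The paths, column names, and formats are hypothetical placeholders under assumed conventions, not part of the actual environment.

```python
# Minimal PySpark sketch of a batch cleanse-and-curate step.
# Paths, column names, and formats are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("bookings_cleanse").getOrCreate()

# Read raw, semi-structured source data.
raw = spark.read.json("s3://example-bucket/raw/bookings/")

cleansed = (
    raw.dropDuplicates(["booking_id"])                          # remove duplicate records
       .withColumn("booking_date", F.to_date("booking_date"))   # normalize types
       .filter(F.col("booking_id").isNotNull())                 # basic integrity check
)

# Write the curated dataset for downstream consumers.
cleansed.write.mode("overwrite").parquet("s3://example-bucket/curated/bookings/")
```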

Knowledge and Skills:

  • Strong ability to design, build and manage data pipelines for data structures encompassing data transformation, data models, schemas, metadata and workload management.
  • Strong experience with popular database programming languages for relational databases, including SQL, PL/SQL, and T-SQL
  • Strong experience in one of the following tools: ADF or Talend
  • Strong experience with relational SQL databases (Oracle, Microsoft SQL Server, MySQL) and NoSQL databases such as Couchbase
  • Strong experience with various Data Management architectures like Data Warehouse, Data Lake and the supporting processes like Data Integration, Governance, Metadata Management
  • Strong experience working with large, heterogeneous datasets to build and optimize data pipelines, pipeline architectures, and integrated datasets using traditional data integration technologies. These should include ETL/ELT, data replication/CDC, and message-oriented data movement, as well as data ingestion and integration technologies such as stream data integration and data virtualization.
  • Strong experience working with and optimizing existing ETL processes, data integration, and data preparation flows, and helping to move them into production
  • Strong experience writing and optimizing advanced SQL queries in a business environment with large-scale, complex datasets
  • Strong knowledge of data warehousing and data lake best practices within the industry
  • Experience with cloud data platforms such as Databricks, Snowflake, BigQuery, or Redshift.
  • Strong hands-on experience with scripting languages such as Python, Scala, and Java.
  • Working knowledge of relational and dimensional data modeling patterns.
  • Working knowledge of the essential elements of data architecture, platforms and products.
  • Working knowledge of building and launching new data models
  • Nice to have: Experience with unstructured document ingestion, chunking, embeddings, vector databases, and retrieval patterns (a minimal chunking sketch follows this list)
  • Addresses stakeholder concerns by using business data modeling, including data entities, attributes, and their relationships.
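
For the retrieval-pattern item above, a minimal chunking sketch might look like the following. The chunk sizes are arbitrary, and the embedding/vector-store step is only indicated in comments; this is an illustrative sketch, not a prescribed approach.

```python
# Minimal sketch of fixed-size text chunking with overlap, a common first step
# before embedding chunks and loading them into a vector database.
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks


# Each chunk would then be embedded and written to a vector store
# (for example, via a warehouse's vector functions or a dedicated service).
sample_document = "example text " * 500  # hypothetical unstructured document text
print(len(chunk_text(sample_document)))
```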

Benefits & conditions

Journey with us! Combine your career goals and sense of adventure by joining our exciting team of employees. Royal Caribbean Group is pleased to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world.

Apply for this position