Lead Data Engineer
Job description
Job Description Summary
We are seeking a Lead Data Engineer with solid experience, typically gained over a minimum of 5 years in large multinational companies within the energy sector or related industrial domains such as smart infrastructure or industrial automation, and a strong track record of building robust data infrastructures for AI/ML initiatives.
In this position, you will be responsible for designing and optimizing data pipelines and platforms that power AI solutions at the edge and in the cloud. You will collaborate closely with R&D, Grid Automation, and business units to deliver impactful, sustainable solutions across complex energy and industrial systems.
Design and maintain database structures, schemas, and data models.
Apply appropriate storage technologies (Relational, NoSQL, Data Lakes, etc.) to ensure secure and efficient data management.
Build and manage scalable, reliable data pipelines for data cleaning, transformation, feature extraction, and processing of both structured and unstructured data.
Integrate data from internal and external APIs, ensuring seamless and automated data flows.
Identify and onboard new datasets that enhance our AI/ML capabilities and support product development.
Automate data integration processes and standardize data transformations based on business-specific needs.
Monitor and optimize pipeline performance to ensure scalability and efficiency.
Implement data quality checks and adhere to data governance best practices.
Collaborate closely with Data Scientists and ML Engineers to ensure delivery of high-quality, relevant data.
Work cross-functionally with Product Management, R&D, and Engineering to translate business needs into technical data solutions.
Requirements
Experience typically gained over 5+ years in large multinational companies within the energy sector or related industrial domains such as smart infrastructure or industrial automation.
Bachelor's, Master's, or PhD in Computer Science, Electrical/Computer Engineering, or a related field with a focus on data engineering or electric power systems.
Hands-on experience building and managing production-grade data pipelines.
Proficiency in Python, SQL, and one additional language (e.g., Scala, Java).
Strong knowledge of relational databases (e.g., PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra).
Experience working with cloud platforms like AWS, Azure, or GCP for deploying data systems.
Solid understanding and hands-on experience with ETL/ELT processes and workflow automation.
Experience with data architectures supporting GenAI models.
Strong communication and collaboration skills; able to work cross-functionally in fast-paced environments.
Nice-to-Have Skills
Familiarity with big data technologies like Apache Spark, Kafka, or Hadoop.
Experience with data visualization tools (e.g., Tableau, Power BI) for reporting and dashboard creation.
Knowledge of graph databases and cloud-based data warehousing solutions (e.g., Snowflake, Redshift).
Data storytelling and the ability to translate insights into actionable business recommendations.