Data Engineer
The Rose
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Junior
Job location: Remote
Tech stack
Java
Airflow
Amazon Web Services (AWS)
Azure
Cloud Computing
Databases
Data Validation
Data Infrastructure
ETL
Data Mining
Data Structures
Data Stores
Data Warehousing
Database Queries
Software Debugging
Document-Oriented Databases
Hadoop
Python
PostgreSQL
MySQL
Standard SQL
SQL Databases
Workflow Management Systems
Data Processing
Google Cloud Platform
Data Storage Technologies
Data Ingestion
Snowflake
Spark
Data Lake
Google BigQuery
Kafka
Data Management
Video Streaming
Data Pipelines
Docker
Redshift
Job description
A Junior Data Engineer is responsible for building, maintaining, and optimizing data pipelines and data infrastructure. This role focuses on collecting, transforming, and storing data to support analytics, reporting, and business intelligence.
Responsibilities
- Develop and maintain data pipelines for data ingestion and processing
- Extract, transform, and load (ETL) data from multiple sources
- Design and manage data storage solutions (data warehouses, data lakes)
- Write efficient SQL queries for data extraction and transformation
- Ensure data quality, integrity, and consistency across systems
- Monitor and troubleshoot data pipeline issues
- Collaborate with data analysts and data scientists to meet data needs
- Optimize data workflows for performance and scalability
- Implement data validation and error-handling mechanisms
- Document data processes and pipeline architectures
Requirements
- Strong knowledge of SQL for data querying and transformation
- Proficiency in Python or Java for data processing
- Understanding of ETL processes and data pipelines
- Familiarity with databases (MySQL, PostgreSQL)
- Basic knowledge of data warehousing concepts
- Understanding of data structures and algorithms
- Problem-solving and debugging skills
Nice to have
- Experience with big data tools (Apache Spark, Hadoop)
- Familiarity with cloud platforms (AWS, Azure, Google Cloud Platform)
- Knowledge of data warehouse tools (Amazon Redshift, Google BigQuery, Snowflake)
- Exposure to streaming technologies (Kafka)
- Understanding of workflow orchestration tools (Apache Airflow)
- Basic knowledge of Docker and CI/CD pipelines