Data Engineer
Job description
Seeking a skilled Data Engineer with a strong background in PySpark and extensive experience with AWS services, including Athena and EMR. The ideal candidate will design, develop, and optimize large-scale data processing systems, ensuring efficient and reliable data flow and transformation.

Responsibilities
- Data Pipeline Development: Design, develop, and maintain scalable data pipelines using PySpark to process and transform large datasets (a brief illustrative sketch follows this list).
- AWS Integration: Use AWS services, including Athena and EMR, to manage and optimize data workflows and storage solutions.
- Data Management: Implement data quality, governance, and security best practices to ensure the integrity and confidentiality of data.
- Performance Optimization: Optimize and troubleshoot data processing workflows for performance, reliability, and scalability.
- Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business needs.
- Documentation: Create and maintain comprehensive documentation of data pipelines, ETL processes, and data architecture.
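To give a concrete flavor of the pipeline work described above, here is a minimal, illustrative PySpark sketch. The bucket paths, column names, and aggregation are hypothetical placeholders, not details of our actual systems.

# Minimal PySpark pipeline sketch. Paths and column names below are
# hypothetical placeholders for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-events-pipeline").getOrCreate()

# Read raw JSON events from S3 (placeholder location).
events = spark.read.json("s3://example-raw-bucket/events/2024-01-01/")

# Clean and transform: drop malformed rows, derive a date column,
# and aggregate event counts per user per day.
daily_counts = (
    events
    .filter(F.col("user_id").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("user_id", "event_date")
    .count()
)

# Write partitioned Parquet back to S3 for downstream querying,
# e.g. via an Athena table defined over this location.
(daily_counts
    .write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-curated-bucket/daily_event_counts/"))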
Requirements
- Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Experience: 5+ years as a Data Engineer or in a similar role, with a strong emphasis on PySpark.
- Technical Expertise:
  - Proficient in PySpark for data processing and transformation.
  - Extensive experience with AWS services, specifically Athena and EMR (see the query sketch after this list).
  - Strong knowledge of SQL and database technologies.
  - Experience with Apache Airflow is a plus (a brief DAG sketch appears at the end of this posting).
  - Familiarity with other AWS services such as S3, Lambda, and Redshift.
- Programming: Proficiency in Python; experience with other programming languages is a plus.
- Problem-Solving: Excellent analytical and problem-solving skills with attention to detail.
- Communication: Strong verbal and written communication skills to collaborate effectively with team members and stakeholders.
- Agility: Ability to work in a fast-paced, dynamic environment and adapt to changing priorities.
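As an illustration of the Athena and SQL experience we look for, below is a minimal sketch of submitting and polling a query from Python with boto3. The database name, table, and output location are hypothetical placeholders.

# Minimal sketch of running an Athena query via boto3. Database,
# table, region, and output-location names are hypothetical.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Submit the query; Athena executes it asynchronously.
response = athena.start_query_execution(
    QueryString=(
        "SELECT user_id, COUNT(*) AS events "
        "FROM daily_event_counts GROUP BY user_id LIMIT 10"
    ),
    QueryExecutionContext={"Database": "example_analytics_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = response["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

# Print result rows on success.
if state == "SUCCEEDED":
    result = athena.get_query_results(QueryExecutionId=query_id)
    for row in result["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])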
Preferred Qualifications
- Experience with data warehousing solutions and BI tools.
- Knowledge of other big data technologies such as Hadoop, Hive, and Kafka.
- Understanding of data modeling, ETL processes, and data warehousing concepts (an orchestration sketch follows this list).
- Experience with DevOps practices and tools for CI/CD.
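Since Apache Airflow experience is called out as a plus and ETL orchestration runs through this role, here is a minimal, hypothetical DAG sketch in the Airflow 2.x style. The DAG id, schedule, and transform function are illustrative assumptions, not part of our codebase.

# Minimal Apache Airflow DAG sketch (Airflow 2.x style; the "schedule"
# argument assumes Airflow 2.4+). All names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def transform_events():
    # Placeholder for a real ETL step, e.g. submitting a PySpark
    # job to EMR or running a transformation in-process.
    print("transforming daily events")


with DAG(
    dag_id="daily_events_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    transform = PythonOperator(
        task_id="transform_events",
        python_callable=transform_events,
    )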