Data Engineer

Axiom Software Solutions
Barcelona, Spain
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Barcelona, Spain

Tech stack

Airflow
Amazon Web Services (AWS)
Big Data
Computer Programming
Databases
Continuous Integration
Data Architecture
Data Governance
ETL
Data Security
Data Warehousing
DevOps
Hadoop
Hive
Python
Performance Tuning
SQL Databases
Data Streaming
Data Processing
PySpark
Information Technology
Kafka
Data Management
Data Pipelines
Programming Languages

Job description

Seeking a skilled Data Engineer with a robust background in PySpark and extensive experience with AWS services, including Athena and EMR. The ideal candidate will be responsible for designing, developing, and optimizing large-scale data processing systems, ensuring efficient and reliable data flow and transformation.

Responsibilities

  • Data Pipeline Development: Design, develop, and maintain scalable data pipelines using PySpark to process and transform large datasets.

  • AWS Integration: Utilize AWS services, including Athena and EMR, to manage and optimize data workflows and storage solutions.

  • Data Management: Implement data quality, data governance, and data security best practices to ensure the integrity and confidentiality of data.

  • Performance Optimization: Optimize and troubleshoot data processing workflows for performance, reliability, and scalability.

  • Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business needs.

  • Documentation: Create and maintain comprehensive documentation of data pipelines, ETL processes, and data architecture.

Requirements

  • Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

  • Experience: 5+ years of experience as a Data Engineer or in a similar role, with a strong emphasis on PySpark.

  • Technical Expertise:

    o Proficient in PySpark for data processing and transformation.

    o Extensive experience with AWS services, specifically Athena and EMR.

    o Strong knowledge of SQL and database technologies.

    o Experience with Apache Airflow is a plus.

    o Familiarity with other AWS services such as S3, Lambda, and Redshift.

  • Programming: Proficiency in Python; experience with other programming languages is a plus.

  • Problem-Solving: Excellent analytical and problem-solving skills with attention to detail.

  • Communication: Strong verbal and written communication skills to effectively collaborate with team members and stakeholders.

  • Agility: Ability to work in a fast-paced, dynamic environment and adapt to changing priorities.

Preferred Qualifications

  • Experience with data warehousing solutions and BI tools.

  • Knowledge of other big data technologies such as Hadoop, Hive, and Kafka.

  • Understanding of data modeling, ETL processes, and data warehousing concepts.

  • Experience with DevOps practices and tools for CI/CD.
