Data Engineer - Hadoop OzoneCH

Mphasis

Fanwood, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Fanwood, United States of America

Tech stack

Java

Apache HTTP Server

Big Data

Business Process Modeling

Cloud Computing

Data Validation

Information Engineering

ETL

Data Transformation

Data Security

Data Systems

Distributed Computing Environment

Fault Tolerance

Hadoop

Hadoop Distributed File System

MapReduce

HBase

Hive

Python

Shell Script

Software Engineering

SQL Databases

Data Streaming

Workflow Management Systems

Enterprise Software Applications

Apache Yarn

Spark

Documentation System

Containerization

Kubernetes

Information Technology

Apache Flink

Real Time Data

Kafka

Docker

Job description

We are seeking a highly skilled Big Data Engineer with strong experience in Apache Spark, the Hadoop ecosystem, and Apache Ozone. The ideal candidate will design, develop, and optimize large-scale data processing systems, ensuring high performance, scalability, and reliability for enterprise-level applications.

Design and implement distributed data processing solutions using Apache Spark, Hadoop, and Flink.
Develop and maintain Spark applications for data transformation, aggregation, and ETL processes using Scala, Java, or Python.
Utilize Apache Ozone for storing large-scale datasets, ensuring efficient data access and management in a distributed environment.
Manage and optimize HDFS, Apache Ozone, and Kafka for scalable and fault-tolerant storage.
Develop ETL pipelines for batch and real-time data ingestion and transformation.
Implement and ensure data validation, data security, integrity, and compliance across big data platforms.
Monitor and troubleshoot performance issues in large-scale clusters.
Collaborate with data scientists, analysts, and application teams to deliver high-quality data solutions.
Automate workflows and improve operational efficiency using scripting and orchestration tools.

Requirements

Strong expertise in Apache Spark (Core, SQL, Streaming).
Hands-on experience with the Hadoop ecosystem (HDFS, YARN, MapReduce).
Proficiency in Apache Ozone for object storage and integration with Hadoop.
Solid programming skills in Java, Scala, or Python.
Experience with Hive, HBase, and Kafka is a plus.
Knowledge of cluster management and resource optimization.
Familiarity with Linux/Unix environments and shell scripting.
Understanding of data security, governance, and compliance standards.
Experience with cloud-based big data platforms.
Exposure to containerization (Docker, Kubernetes) for big data workloads.
Knowledge of CI/CD pipelines for data engineering projects.

Behavioral Skills:

Good communication skills
5 days work from office at Berkeley Heights, NJ
Team player
Ability to work in a changing environment
Strong problem solving and analytical skills
Ability to work independently or within a team
Manage day-to-day challenges and communicate developmental risks with the technical team

Qualifications:

Bachelor's degree in Computer Science, Software Engineering, or a related field.
Proficiency in business process modeling and documentation tools.
Product implementation experience is preferred.
