Big Data / Real-Time Data Engineer
NEXXORA INC
Westlake, United States of America
5 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Westlake, United States of America
Tech stack
Java
Airflow
Amazon Web Services (AWS)
Azure
Big Data
Data as a Service
Data Architecture
Data Governance
Data Integration
ETL
Data Systems
Distributed Systems
Hadoop
Hadoop Distributed File System
Hive
Python
NoSQL
Performance Tuning
SQL Databases
Data Processing
Google Cloud Platform
Cloud Platform System
Spark
Containerization
Data Lake
Kubernetes
Apache Flink
Data Analytics
Kafka
Spark Streaming
Machine Learning Operations
Video Streaming
Stream Processing
Data Pipelines
Docker
Job description
We are seeking a highly skilled Senior Big Data Developer with strong expertise in building real-time data pipelines, advanced analytics, and data science platforms. The ideal candidate will have deep experience with big data ecosystems (including Hadoop-based frameworks) and modern streaming technologies, enabling scalable, high-performance data solutions.
Responsibilities
- Design, develop, and maintain real-time and batch data pipelines
- Work with large-scale distributed systems using Big Data frameworks (e.g., Hadoop ecosystem)
- Build and optimize data processing solutions using streaming technologies such as Kafka, Spark Streaming, or Flink
- Collaborate with data scientists and analysts to enable advanced analytics and machine learning workflows
- Develop and maintain data models, ETL/ELT pipelines, and data integration solutions
- Ensure data quality, performance optimization, and scalability
- Work with cloud-based data platforms (AWS, Azure, or Google Cloud Platform)
- Implement best practices for data governance, security, and compliance
Requirements
- Strong experience with Big Data frameworks (Hadoop, Spark, Hive, HDFS)
- Hands-on expertise in real-time data processing (Kafka, Spark Streaming, Flink, etc.)
- Proficiency in programming languages such as Python, Java, or Scala
- Experience with data analytics and data science platforms
- Solid understanding of distributed computing and data architecture
- Experience with SQL and NoSQL databases
- Familiarity with ETL tools and data pipeline orchestration (Airflow, etc.)
- Experience with cloud platforms (AWS, Azure, Google Cloud Platform) and their data services
- Knowledge of machine learning workflows and MLOps
- Experience with containerization (Docker, Kubernetes)
- Exposure to Data Lake / Lakehouse architectures
- Strong problem-solving and communication skills