AI Big Data Engineer

Manpower

Boone Township, United States of America

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Boone Township, United States of America

Tech stack

Java

Agile Methodologies

Artificial Intelligence

Amazon Web Services (AWS)

Data analysis

Build Automation

Automation of Tests

Big Data

Configuration Management

Information Systems

Databases

Continuous Integration

Data Architecture

Data Integration

ETL

Database Queries

Distributed Systems

Hadoop

Hive

Python

Object-Oriented Software Development

Operational Databases

Performance Tuning

Scrum

Standard SQL

Scala

Simple Data Format

Software Engineering

SQL Databases

System Testing

Test Case Design

Workflow Management Systems

Data Ingestion

GitHub Copilot

Prompt Engineering

Spark

Caching

Information Technology

Functional Programming

GPT

Data Pipelines

Serverless Computing

Job description

We are seeking a highly skilled and experienced AI Big Data Engineer to design, develop, and optimize large-scale data processing systems.

This position requires three days per week onsite in Rockville, MD or Tysons Corner, VA.

In this role, you will work closely with cross-functional teams to architect data pipelines, implement data integration solutions, and ensure the performance, scalability, and reliability of big data platforms.

The ideal candidate will have deep expertise in distributed systems, cloud platforms, and modern big data technologies such as Hadoop and Spark.

Responsibilities:

Design, develop, and maintain large-scale data processing pipelines using Big Data technologies (e.g., Hadoop, Spark, Python, Scala).
Implement data ingestion, storage, transformation, and analysis solutions that are scalable, efficient, and reliable.
Stay current with industry trends and emerging Big Data technologies to continuously improve the data architecture.
Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.
Optimize and enhance existing data pipelines for performance, scalability, and reliability.
Develop automated testing frameworks and implement continuous testing for data quality assurance.
Conduct unit, integration, and system testing to ensure the robustness and accuracy of data pipelines.
Work with data scientists and analysts to support data-driven decision-making across the organization.
Write and maintain automated unit, integration, and end-to-end tests.
Monitor and troubleshoot data pipelines in production environments to identify and resolve issues.

Requirements

Bachelor's degree in Computer Science, Information Systems, or a related discipline with at least five (5) years of related experience, or equivalent training and/or work experience.
Master's degree and past Financial Services industry experience preferred.

Experience Requirements:

Demonstrated technical expertise in object-oriented and database technologies and concepts, resulting in the deployment of enterprise-quality solutions.
Experience developing enterprise-quality solutions in an iterative or Agile environment.
Extensive knowledge of industry-leading software engineering approaches, including Test Automation, Build Automation, and Configuration Management frameworks.
Strong written and verbal technical communication skills.
Demonstrated ability to develop effective working relationships that improved the quality of work products.
Should be well organized, thorough, and able to handle competing priorities.
Ability to maintain focus and develop proficiency in new skills rapidly.
Ability to work in a fast-paced environment.
Experience with object-oriented programming languages such as Java, Scala or Python.

Essential Technical Skills:

AI Tool Proficiency: Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.).
Technical Background: Strong software development background with the ability to contribute to technical discussions.
Agile Methodology: Extensive experience with Scrum, Kanban, and continuous improvement practices.

Big Data Technologies:

Experience with big data technologies such as Hadoop, Spark, Hive, and Trino.
Understanding of common issues such as:
Data skew and strategies to mitigate it.
Working with massive data volumes at petabyte scale.
Troubleshooting job failures caused by resource limitations, bad data, or scalability challenges.

AI Skills:

Prompt Engineering: Proficiency in crafting effective prompts for AI coding assistants and analysis tools.
AI Workflow Design: Experience redesigning development processes to leverage AI capabilities.
Data Analysis: Ability to interpret AI-generated insights and translate them into actionable team improvements.
Change Management: Experience leading teams through AI adoption and workflow transformation.

SQL Skills (Window Functions, Joins, Complex Queries):

SQL window functions, multi-table joins, aggregations.
Ability to write and optimize SQL queries on the spot.

Apache Spark (Development, Internals & Tuning):

Understanding of Spark's core architecture: executors, tasks, stages, and the DAG.
Spark performance tuning techniques: partitioning, caching, broadcast joins, etc.
Experience optimizing Spark jobs for large-scale datasets.

Cloud Technologies - AWS:

Exposure to AWS services like S3, EMR, Glue, Lambda, Athena, etc.
S3 with Spark (e.g., dealing with file formats, consistency issues).
Knowledge of EKS, serverless computing, etc.

Programming - Python or Scala:

Ability to write clean, modular, and performant code.
Functional programming concepts (e.g., immutability, higher-order functions).

Good to Have:

Experience managing production data pipelines/ETL systems
Experience with CI/CD
Experience writing test cases
AWS certifications
