Sr. AWS Data Engineer

Cognizant Technology Solutions Corporation
Charlotte, United States of America
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Charlotte, United States of America

Tech stack

Airflow
Amazon Web Services (AWS)
Big Data
Code Review
Databases
Information Engineering
ETL
Relational Databases
Software Design Patterns
Amazon DynamoDB
Identity and Access Management
Python
PostgreSQL
Metadata
Microsoft SQL Server
NoSQL
Performance Tuning
Query Optimization
SQL Stored Procedures
SQL Databases
SQL Server Integration Services
Systems Integration
Test Case Design
Management of Software Versions
Data Logging
Data Processing
Scripting (Bash/Python/Go/Ruby)
Informatica PowerCenter
Spark
State Machines
AWS Lambda
Change Data Capture
Git
Pytest
Data Lake
PySpark
Integration Tests
Apache Flink
Kafka
Data Management
CloudWatch
Software Version Control
Data Pipelines
Serverless Computing

Job description

Design, build, and operate scalable, cloud-native data platforms supporting batch and streaming use cases, with a strong focus on governance, performance, and reliability.

  • Data quality and validation: Implementing data quality checks, reconciliation logic, and exception handling within pipelines
  • Metadata-driven frameworks: Building configurable pipelines driven by metadata stored in Aurora or DynamoDB
  • Logging and observability: Integrating CloudWatch logging, custom metrics, and alerting into data pipelines
  • Unit and integration testing: Writing test cases for ETL logic using frameworks such as pytest (see the sketch after this list)
  • Version control: Proficiency with Git for source code management, branching strategies, and code reviews
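
As an illustration of the unit-testing bullet above, here is a minimal pytest sketch; the normalize_amounts transform and its column names are hypothetical and stand in for real ETL logic.

  # test_transforms.py - minimal pytest sketch; normalize_amounts is hypothetical.
  import pytest

  def normalize_amounts(rows):
      """Convert string amounts to floats; reject negative values."""
      out = []
      for row in rows:
          amount = float(row["amount"])
          if amount < 0:
              raise ValueError(f"negative amount: {amount}")
          out.append({**row, "amount": amount})
      return out

  def test_normalize_amounts_converts_strings():
      rows = [{"id": 1, "amount": "10.50"}]
      assert normalize_amounts(rows) == [{"id": 1, "amount": 10.5}]

  def test_normalize_amounts_rejects_negatives():
      with pytest.raises(ValueError):
          normalize_amounts([{"id": 2, "amount": "-1"}])

Running `pytest test_transforms.py` executes both cases; the same pattern extends to integration tests run against test databases.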

Requirements

  • Python: Strong hands-on experience with Python for data engineering tasks, including scripting, automation, and ETL logic development
  • PySpark: Proficiency in writing and optimizing PySpark jobs for large-scale data transformations
  • SQL: Advanced SQL skills for data querying, transformation logic, and stored procedure conversion from SQL Server

Big Data Processing Frameworks

  • Apache Spark: Strong experience with Spark core concepts - RDDs, DataFrames, Datasets, partitioning, and performance tuning
  • Data partitioning and optimization: Experience with data skew handling, broadcast joins, caching strategies, and Spark tuning (see the sketch after this list)
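
A minimal PySpark sketch of the tuning techniques named above: a broadcast join to avoid shuffling a small lookup table, repartitioning on the write key, and caching a reused result. The S3 paths and column names are placeholders, not details from this role.

  # spark_tuning_sketch.py - illustrative only; paths and columns are assumptions.
  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

  facts = spark.read.parquet("s3://example-bucket/facts/")  # large table
  dims = spark.read.parquet("s3://example-bucket/dims/")    # small lookup table

  # Broadcast the small table so the join avoids a full shuffle of `facts`.
  joined = facts.join(F.broadcast(dims), on="dim_id", how="left")

  # Repartition on the partition column to control file sizes and skew.
  joined = joined.repartition(200, "event_date")

  # Cache only when the result feeds multiple downstream actions.
  joined.cache()
  joined.write.mode("overwrite").partitionBy("event_date").parquet(
      "s3://example-bucket/curated/facts_enriched/"
  )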

AWS Services (Hands-On Experience Required)

  • AWS Glue ETL: Developing and deploying Glue jobs (Python Shell and Spark), job bookmarks, dynamic frames, and custom connectors (a Glue job skeleton follows this list)
  • AWS Glue Data Catalog: Managing databases, tables, crawlers, classifiers, and schema versioning
  • AWS Lake Formation: Configuring data lake permissions, fine-grained access control, and data filtering
  • AWS Step Functions: Designing and implementing state machines for ETL workflow orchestration, error handling, and retry logic
  • AWS Lambda: Writing serverless functions for event-driven triggers, lightweight transformations, and pipeline utilities
  • Amazon Aurora: Working with Aurora PostgreSQL-compatible databases for relational data storage and query optimization
  • Amazon DynamoDB: Designing and querying NoSQL tables
  • Amazon S3: Proficiency in S3 data lake design - partitioning strategies, storage classes, lifecycle policies, and S3 event notifications
  • AWS IAM: Understanding of roles, policies, and least-privilege access patterns relevant to data pipeline security
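
As a sketch of the Glue expectations above, here is a skeleton Glue Spark job that reads a catalog table as a DynamicFrame with bookmark tracking and writes Parquet to S3. The database, table, and bucket names are placeholders.

  # glue_job_sketch.py - Glue job skeleton; catalog and S3 names are placeholders.
  import sys
  from awsglue.context import GlueContext
  from awsglue.dynamicframe import DynamicFrame
  from awsglue.job import Job
  from awsglue.utils import getResolvedOptions
  from pyspark.context import SparkContext

  args = getResolvedOptions(sys.argv, ["JOB_NAME"])
  glue_context = GlueContext(SparkContext.getOrCreate())
  job = Job(glue_context)
  job.init(args["JOB_NAME"], args)  # required for job bookmarks to advance

  # Read from the Glue Data Catalog; transformation_ctx enables bookmark tracking.
  source = glue_context.create_dynamic_frame.from_catalog(
      database="example_db",
      table_name="example_table",
      transformation_ctx="source",
  )

  # Drop duplicates via the Spark DataFrame API, then convert back.
  cleaned = DynamicFrame.fromDF(source.toDF().dropDuplicates(), glue_context, "cleaned")

  glue_context.write_dynamic_frame.from_options(
      frame=cleaned,
      connection_type="s3",
      connection_options={"path": "s3://example-bucket/curated/"},
      format="parquet",
      transformation_ctx="sink",
  )

  job.commit()  # persists the bookmark state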

ETL Development & Migration

  • Informatica PowerCenter (working knowledge): Ability to read and interpret Informatica workflows, sessions, mappings, and transformations to support conversion to AWS Glue
  • ETL framework development: Experience building reusable, configurable ETL frameworks with logging, error handling, retry mechanisms, and metadata-driven execution (see the sketch after this list)
  • Data pipeline design patterns: Familiarity with incremental loads, CDC (Change Data Capture), full loads, and SCD (Slowly Changing Dimensions)
  • SQL Server (working knowledge): Ability to understand SQL Server schemas, stored procedures, and SSIS packages for migration analysis
  • 8+ years in an IT-related role
  • Strong hands-on experience in AWS Cloud, SQL, and Python
  • Good experience with Kafka/Flink, AWS Glue, and Airflow
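
To make the ETL-framework bullet concrete, here is a minimal sketch of metadata-driven execution with logging and retries; the step registry and config records are invented for illustration and could equally live in Aurora or DynamoDB, as the posting suggests.

  # etl_framework_sketch.py - retry + logging + metadata-driven steps; illustrative only.
  import logging
  import time

  logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
  log = logging.getLogger("etl")

  def with_retries(fn, attempts=3, backoff_seconds=5):
      """Run fn, retrying failures with linear backoff; re-raise after the last attempt."""
      for attempt in range(1, attempts + 1):
          try:
              return fn()
          except Exception:
              log.exception("attempt %d/%d failed", attempt, attempts)
              if attempt == attempts:
                  raise
              time.sleep(backoff_seconds * attempt)

  # Hypothetical pipeline metadata; a real framework would fetch rows like these
  # from Aurora or a DynamoDB table instead of hard-coding them.
  PIPELINE_CONFIG = [
      {"step": "extract_orders", "enabled": True},
      {"step": "load_orders", "enabled": True},
  ]

  STEPS = {
      "extract_orders": lambda: log.info("extracting orders..."),
      "load_orders": lambda: log.info("loading orders..."),
  }

  for entry in PIPELINE_CONFIG:
      if entry["enabled"]:
          with_retries(STEPS[entry["step"]])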

About the company

At Cognizant, we strive to provide flexibility wherever possible, and we are here to support a healthy work-life balance through our various wellbeing programs. Based on this role's business requirements, this is an onsite position requiring 5 days a week in a client or Cognizant office in Charlotte, NC. Cognizant is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law.
