Big Data (Python/Scala) Engineer - Assistant Vice...

Citigroup, Inc.
Tampa, United States of America
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$ 145K

Job location

Tampa, United States of America

Tech stack

Agile Methodologies
Airflow
Amazon Web Services (AWS)
Data analysis
Apache HTTP Server
Azure
Big Data
Google BigQuery
Computer Programming
Continuous Integration
Data Integration
ETL
Relational Databases
Database Queries
Dimensional Modeling
Memory Management
Elasticsearch
Fault Tolerance
Github
Hadoop
Hadoop Distributed File System
MapReduce
Hive
IT Management
Job Scheduling
Python
Maven
MongoDB
NumPy
Object-Oriented Software Development
Apache Oozie
Performance Tuning
Scrum
Logstash
Prometheus
Cloudera
Mesos
Shell Script
Software Engineering
SQL Databases
Sqoop
Workflow Management Systems
Apache Zookeeper
Parquet
Data Logging
Data Processing
Google Cloud Platform
Data Storage Technologies
Data Ingestion
Apache Yarn
Grafana
Spark
Hdinsight
Caching
Gitlab
GIT
Pandas
Data Lake
Gitlab-ci
Apache Flume
Kubernetes
Avro
Kafka
Build Tools
Bitbucket
Spark Streaming
Sbt (Software)
Functional Programming
Kibana
Tez (Software)
Software Version Control
Data Pipelines
ELK
Jenkins
Databricks

Job description

  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, and model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas

  • Monitor and control all phases of the development process (analysis, design, construction, testing, and implementation) and provide user and operational support on applications to business users

  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement

  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality

  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems

  • Ensure essential procedures are followed and help define operating standards and processes

  • Serve as advisor or coach to new or lower-level analysts

Requirements

  • 5-8 years of relevant experience

  • In-depth understanding of HDFS architecture, data storage, and fault tolerance mechanisms. Experience with HDFS commands and administration.

  • Solid understanding of YARN resource management and job scheduling.

  • Fundamental understanding of the MapReduce programming paradigm, even if primary development is in Spark/Flink. Knowledge of Zookeeper for distributed coordination services.

  • Strong proficiency in Spark Core, Spark SQL, Spark Streaming, and Spark GraphX (beneficial). Expert-level programming skills in Scala, specifically for developing Spark applications.

  • Experience with Spark performance optimization techniques (e.g., caching, partitioning, shuffle optimizations, memory management). Familiarity with deploying Spark applications on YARN, Mesos, or Kubernetes.

  • Advanced proficiency in writing complex HiveQL queries for data analysis and ETL processes. Understanding of Hive metastore, execution engines (MapReduce, Tez, Spark), and storage formats (ORC, Parquet, Avro). Experience in optimizing Hive queries and table designs for performance.

  • Strong object-oriented and functional programming skills. Experience with Scala build tools (SBT, Maven). Knowledge of common Scala libraries and frameworks.

  • Experience with PySpark for data processing. Familiarity with data manipulation libraries (Pandas, NumPy). Scripting for automation and data orchestration.

  • Complex query writing, subqueries, window functions, and performance tuning. HBase (for real-time access to large datasets within Hadoop). Cassandra, MongoDB, or similar. Familiarity with RDBMS concepts and SQL for data integration.

  • Understanding of dimensional modeling, fact and dimension tables, star/snowflake schemas.

  • Data Ingestion Tools: Apache Sqoop, Apache Flume, Kafka

  • Workflow Orchestration: Apache Oozie, Apache Airflow

  • Experience with AWS (EMR, S3, Glue, Lambda), Azure (HDInsight, Data Lake, Databricks), or Google Cloud Platform (Dataproc, BigQuery).

Tools and Methodologies

  • Version Control: Git (GitHub, GitLab, Bitbucket).

  • CI/CD: Experience with Jenkins, GitLab CI, Azure DevOps, or similar tools.

  • Monitoring and Logging: ELK Stack (Elasticsearch, Logstash, Kibana), Grafana, Prometheus.

  • Agile Development: Familiarity with Agile/Scrum methodologies.

  • Shell Scripting: For automation and system administration tasks.

Education:

  • Bachelor's degree/University degree or equivalent experience

Benefits & conditions

$96,960.00 - $145,440.00

In addition to salary, Citi's offerings may also include, for eligible employees, discretionary and formulaic incentive and retention awards. Citi offers competitive employee benefits, including: medical, dental & vision coverage; 401(k); life, accident, and disability insurance; and wellness programs. Citi also offers paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays. For additional information regarding Citi employee benefits, please visit citibenefits.com. Available offerings may vary by jurisdiction, job level, and date of hire.

About the company

Citi, the leading global bank, has approximately 200 million customer accounts and does business in more than 160 countries and jurisdictions. Citi provides consumers, corporations, governments, and institutions with a broad range of financial products and services, including consumer banking and credit, corporate and investment banking, securities brokerage, transaction services, and wealth management. As a bank with a brain and a soul, Citi creates economic value that is systemically responsible and in our clients' best interests. As a financial institution that touches every region of the world and every sector that shapes your daily life, our Enterprise Operations & Technology teams are charged with a mission that rivals any large tech company. Our technology solutions are the foundations of everything we do from keeping the bank safe, managing global resources, and providing the technical tools our workers need to be successful to designing our digital architecture and ensuring our platforms provide a first-class customer experience. We reimagine client and partner experiences to deliver excellence through secure, reliable, and efficient services. Our commitment to diversity includes a workforce that represents the clients we serve from all walks of life, backgrounds, and origins. We foster an environment where the best people want to work. We value and demand respect for others, promote individuals based on merit, and ensure opportunities for personal development are widely available to all. Ideal candidates are innovators with well-rounded backgrounds who bring their authentic selves to work and complement our culture of delivering results with pride. If you are a problem solver who seeks passion in your work, come join us. We'll enable growth and progress together.
