Big Data (Python/Scala) Engineer - Assistant Vice...
Job description
- Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
- Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
- Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
- Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
- Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
- Ensure essential procedures are followed and help define operating standards and processes
- Serve as advisor or coach to new or lower-level analysts
Requirements
- 5-8 years of relevant experience
- In-depth understanding of HDFS architecture, data storage, and fault tolerance mechanisms. Experience with HDFS commands and administration.
- Solid understanding of YARN resource management and job scheduling.
- Fundamental understanding of MapReduce programming paradigm, even if primary development is in Spark/Flink. Knowledge of Zookeeper for distributed coordination services.
- Strong proficiency in Spark Core, Spark SQL, Spark Streaming, and Spark GraphX (beneficial). Expert-level programming skills in Scala, specifically for developing Spark applications.
- Experience with Spark performance optimization techniques (e.g., caching, partitioning, shuffle optimizations, memory management). Familiarity with deploying Spark applications on YARN, Mesos, or Kubernetes.
- Advanced proficiency in writing complex HiveQL queries for data analysis and ETL processes. Understanding of Hive metastore, execution engines (MapReduce, Tez, Spark), and storage formats (ORC, Parquet, Avro). Experience in optimizing Hive queries and table designs for performance.
- Strong object-oriented and functional programming skills. Experience with Scala build tools (SBT, Maven). Knowledge of common Scala libraries and frameworks.
- Experience with PySpark for data processing. Familiarity with data manipulation libraries (Pandas, NumPy). Scripting for automation and data orchestration.
- Complex query writing, subqueries, window functions, and performance tuning. HBase (for real-time access to large datasets within Hadoop). Cassandra, MongoDB, or similar. Familiarity with RDBMS concepts and SQL for data integration.
- Understanding of dimensional modeling, fact and dimension tables, star/snowflake schemas.
- Data Ingestion Tools: Apache Sqoop, Apache Flume, Kafka
- Workflow Orchestration: Apache Oozie, Apache Airflow
- Experience with AWS (EMR, S3, Glue, Lambda), Azure (HDInsight, Data Lake, Databricks), or Google Cloud Platform (Dataproc, BigQuery).
Tools and Methodologies
- Version Control: Git (GitHub, GitLab, Bitbucket).
- CI/CD: Experience with Jenkins, GitLab CI, Azure DevOps, or similar tools.
- Monitoring and Logging: ELK Stack (Elasticsearch, Logstash, Kibana), Grafana, Prometheus.
- Agile Development: Familiarity with Agile/Scrum methodologies.
- Shell Scripting: For automation and system administration tasks.
Education:
- Bachelor's degree/University degree or equivalent experience
Benefits & conditions
$96,960.00 - $145,440.00
In addition to salary, Citi's offerings may also include, for eligible employees, discretionary and formulaic incentive and retention awards. Citi offers competitive employee benefits, including: medical, dental & vision coverage; 401(k); life, accident, and disability insurance; and wellness programs. Citi also offers paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays. For additional information regarding Citi employee benefits, please visit citibenefits.com. Available offerings may vary by jurisdiction, job level, and date of hire.