Hadoop Architect
Role details
Job location
Tech stack
Job description
-
Define target-state architecture and strategic roadmap for surveillance data processing platforms leveraging Hadoop, Spark, PySpark and Python
-
Establish standard architectural patterns for alert generation, enrichment, aggregation and reporting pipelines.
-
Ensure architecture aligns with enterprise technology standards and surveillance control expectations
-
Solution Design and Delivery Support
-
Translate surveillance business requirements (e.g., market misconduct detection, regulatory coverage) into scalable technical designs
-
Review and approve detailed technical designs, ensuring alignment with functional intent, regulatory requirements, and architectural standards.
-
Provide hands-on architectural guidance to engineering teams during development, testing and implementation
-
Design and standardization of Spark/PySpark frameworks supporting surveillance alert generation and enrichment
-
Modernization and optimization of large-scale Hadoop surveillance workloads to improve performance, stability and control coverage
-
Implementation of enterprise-consistent architecture patterns enabling audit readiness, lineage, and regulatory traceability
-
Support end-to-end delivery across the SDLC, minimizing rework and technical debt
-
Architecture Governance & Leadership
-
Participate in and lead architecture review forums and design walkthroughs
-
Mentor senior engineers and promote adoption of standard frameworks and best practices
-
Influence enterprise surveillance platform strategy through architecture governance
-
Documentation & Communication
-
Maintain high-quality documentation, including business requirements, data definitions and process flows
-
Proactively communicate risks, dependencies and potential timeline impact to stakeholders
-
Non-Functional Requirements and Product Readiness
Requirements
Must Have Technical/Functional Skills
Primary Skill: Experience in data engineering, platform engineering or architecture roles. Deep expertise in PySpark.
Experience: 10+ years
Roles & Responsibilities
Bachelor's or master's degree in computer science or a related field.
-
Required Hard and Soft Skills / Experience
-
Deep expertise in PySpark, including performance tuning and optimization
-
Strong Python development experience in large-scale distributed environments
-
Solid knowledge of the Hadoop ecosystem (HDFS, Hive/Impala, YARN)
-
Proven experience designing and governing enterprise, regulatory-facing data platforms.
-
Expertise in designing data lakes, ELT/ETL pipelines, and batch and real-time data processing solutions
-
Proficiency in programming languages such as Java, Scala and SQL
-
Strong understanding of non-functional requirements and production support models
-
Clear written and verbal communication skills with ability to influence across organizations
-
Preferred Skills / Experience
-
Financial services experience, particularly in market surveillance, AML, fraud or risk technology
-
Experience supporting regulatory- or audit-facing platforms
-
Exposure to Kafka and Spark Structured Streaming
-
Familiarity with orchestration tools (Airflow, Control-M, Oozie)
-
Knowledge of data governance, lineage, and data quality controls