Lead Big Data Software Engineer (Databricks + AWS)
SoftServe, Inc.
Posted: yesterday
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Remote
Tech stack
Java
Airflow
Amazon Web Services (AWS)
Business Analytics Applications
Big Data
Data Warehousing
Database Queries
Github
Python
SQL Databases
Data Streaming
Workflow Management Systems
Data Storage Technologies
Data Ingestion
Snowflake
Spark
Data Lake
Apache Flink
Avro
Kafka
Data Management
Stream Processing
Data Pipelines
Redshift
Databricks
Job description
In this role, you will lead the design and development of scalable data platforms on AWS, with a strong focus on the Databricks ecosystem. You will guide a team of engineers, shape architectural decisions, and ensure high-quality delivery of both batch and streaming data solutions. You will work closely with business and technical stakeholders, contributing across the full project lifecycle, from discovery and design to production implementation, within a collaborative and innovation-driven environment.
Responsibilities
- Design and implement scalable data solutions using Databricks on AWS, including Lakehouse architectures based on Delta Lake and Unity Catalog
- Lead the development of batch and streaming data pipelines using technologies such as Apache Spark or Flink
- Define and own source-to-target mappings, supporting data ingestion and integration from multiple sources
- Drive architecture decisions and contribute to scaling the data platform and data models
- Collaborate with stakeholders to gather requirements and translate them into technical solutions and implementation plans
- Guide and support a team of data engineers, defining scope, priorities, and best practices
- Develop and manage workflows using orchestration tools such as Databricks Workflows, Apache Airflow, or MWAA
- Work with streaming platforms such as Apache Kafka, Amazon MSK, or Kinesis for real-time data processing
- Leverage data storage and analytics solutions such as Snowflake or Amazon Redshift
- Ensure proper documentation of data models, schemas, and architecture decisions
- Participate in the full project lifecycle, from PoC and MVP to full-scale implementation
- Explore new technologies, build prototypes, and contribute to knowledge sharing within the engineering community
Requirements
- Proven experience as a Lead Data Engineer with a strong focus on data pipeline design
- Hands-on experience with both batch and streaming data processing
- Strong expertise in AWS cloud platform and Databricks (including Delta Lake, Unity Catalog, Workflows, and Jobs)
- Proficiency in Python (preferred), Scala, or Java, along with strong SQL skills
- Solid knowledge of big data technologies such as Apache Spark or Flink
- Experience with workflow orchestration tools such as Databricks Workflows, Apache Airflow, or MWAA
- Practical experience with streaming platforms such as Apache Kafka, Amazon MSK, or Kinesis
- Familiarity with data warehousing solutions such as Snowflake or Amazon Redshift
- Experience with data formats such as Avro, version control via GitHub, and SQL-based systems
- Ability to translate business requirements into technical solutions and guide teams toward delivery
- Strong leadership and communication skills, with experience working with cross-functional stakeholders
- Upper-intermediate or higher level of English