Data Engineer - GCP/Spark/Scala
Role details
Job location
Tech stack
Job description
In the role of Technology Consultant 2, your areas of responsibility will be:
- Contribute to the requirements elicitation process by documenting assigned parts of business requirements, in line with guidance provided.
- Facilitate software application design discussions and document design decisions to guide the technical team in building software solutions.
- Participate in coding and integrate new features or updates into existing applications, with a focus on maintaining system stability.
- Conduct code reviews, make changes to the codebase, and maintain code repositories.
- Implement test strategies, analyse results, and coordinate bug fixes to uphold software quality standards.
- Develop user training programs, documentation, and support frameworks to ensure a smooth transition to new software applications.
- Actively participate in resolving production issues and recommend preventive strategies to enhance system reliability.
- Maintain detailed records of code, testing techniques, and support activities to enrich the knowledge base and assist similar projects.
- Design, develop, and maintain scalable data pipelines on GCP.
- Build and optimize data processing workflows using BigQuery, Spark, and GCS.
- Develop and maintain ETL/ELT pipelines using Scala and Python.
- Orchestrate and schedule data workflows using Apache Airflow.
- Write complex, optimized SQL queries for large-scale datasets.
- Integrate and process data from multiple sources, ensuring data quality and reliability.
- Implement and maintain CI/CD pipelines for automated deployment of data engineering workflows.
- Troubleshoot performance issues and optimize data processing jobs.
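The ETL duties above can be illustrated with a minimal, self-contained sketch. This is plain Python standing in for the Spark/BigQuery stack described in the posting; the source extracts, field names, and quality rule are all hypothetical:

```python
import csv
import io

# Hypothetical raw extracts from two sources (stand-ins for GCS objects / BigQuery tables).
ORDERS_CSV = """order_id,customer_id,amount
1,alice,120.50
2,bob,
3,alice,75.00
"""

CUSTOMERS_CSV = """customer_id,region
alice,us-east
bob,eu-west
"""

def extract(raw: str) -> list[dict]:
    """Parse a CSV extract into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(orders: list[dict], customers: list[dict]) -> list[dict]:
    """Join orders to customer regions, dropping rows that fail a basic quality check."""
    regions = {c["customer_id"]: c["region"] for c in customers}
    out = []
    for o in orders:
        if not o["amount"]:  # data-quality rule: amount must be present
            continue
        out.append({
            "order_id": int(o["order_id"]),
            "region": regions.get(o["customer_id"], "unknown"),
            "amount": float(o["amount"]),
        })
    return out

rows = transform(extract(ORDERS_CSV), extract(CUSTOMERS_CSV))
print(rows)
# Order 2 is dropped by the quality check; orders 1 and 3 survive with regions attached.
```

In the actual role, equivalent logic would live in Spark jobs written in Scala or Python, reading from GCS, writing to BigQuery, and scheduled by Airflow rather than run inline.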
Requirements
- A collaborative spirit and excellent communication skills.
- The ability to handle end-to-end SDLC phases, from requirement gathering to implementation.
- A knack for translating complex requirements into actionable development tasks.
- A passion for design and hands-on coding experience.
- A proactive approach to testing, troubleshooting, and refining our applications.
- The ability to work with cross-functional teams and perform software integration.
- Strong hands-on experience with GCP.
- Expertise in BigQuery and Google Cloud Storage (GCS)
- Proficiency in Scala and/or Python for data engineering workflows.
- Strong experience with Apache Spark for large-scale data processing.
- Experience with Apache Airflow for workflow orchestration.
- Advanced SQL skills for data analysis and transformation.
- Experience implementing CI/CD pipelines.
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration abilities.
- Microsoft certifications (e.g., Power BI Data Analyst Associate, Fabric Analytics Engineer) are a plus.
- Bachelor's degree or foreign equivalent from an accredited institution required. Three years of progressive experience in the specialty will also be considered in lieu of every year of education.
- This position may require relocation and/or travel to work/project location.
- Candidates authorized to work for any employer in the United States without employer-based visa sponsorship are welcome to apply. Infosys is unable to provide immigration sponsorship for this role now or in the future.
Benefits & conditions
Along with competitive pay, as a full-time Infosys employee you are also eligible for the following benefits:
- Medical/Dental/Vision/Life Insurance
- Long-term/Short-term Disability
- Health and Dependent Care Reimbursement Accounts
- Insurance (Accident, Critical Illness, Hospital Indemnity, Legal)
- 401(k) plan and contributions dependent on salary level
- Paid holidays plus Paid Time Off