Site Reliability Engineer
Job description
CGI is looking to hire a Site Reliability Engineer to ensure the stability, scalability, and resilience of our critical platforms, spanning mainframe, ETL pipelines, databases, and distributed systems. This role will be key to minimizing downtime, improving performance, and enabling reliable delivery of business-critical services.

This role will be performed from the client site 5 days a week in one of the following locations only: Pittsburgh, PA or Cleveland, OH.

Future duties and responsibilities
- Support and maintain mission-critical applications developed in COBOL, DB2, Pega, VB .NET, and Java, including diagnosing and resolving application and database performance issues.
- Monitor and maintain the health, performance, and reliability of large-scale Hadoop clusters and big data environments, ensuring optimal resource utilization and uptime.
- Develop, automate, and optimize data pipelines using SQL, Python, and PySpark for efficient data ingestion, transformation, and processing.
- Troubleshoot and resolve complex issues related to Informatica ETL processes, ensuring data quality, consistency, and timely delivery.
- Implement and enforce best practices for site reliability, including automated monitoring, alerting, and incident response for both big data platforms and legacy systems.
- Collaborate with development, QA, and infrastructure teams to support application deployments, upgrades, and integration across diverse technologies.
- Document operational procedures, incident reports, and system configurations to support knowledge sharing and business continuity.
- Continuously evaluate and recommend improvements for system scalability, security, and reliability in both big data and legacy application environments.
- Ensure data security, governance, and compliance standards are met within all data engineering processes.
Requirements
- Bachelor's degree or higher in Computer Science or equivalent preferred but not required.
- Strong understanding of Site Reliability Engineering principles (SLIs, SLOs, SLAs)
- Strong understanding of incident management and root cause analysis (RCA)
- Experience in monitoring, alerting, and observability
- Basic to intermediate understanding of mainframe environments (z/OS, JCL, batch processing)
- Ability to work with mainframe teams and understand batch job dependencies and failures
- Working knowledge of relational databases (Oracle, SQL Server, DB2, Teradata, Hive, etc.)
- Experience with performance monitoring tools like Dynatrace, Splunk/ELK

Desired Skills:
- Experience with Medallion Architecture a plus
- Experience in containerization (Kubernetes / Docker) a plus
- Experience with marketing applications (Pega, Salesforce, Adobe, Zafin, Naehas) a plus
- Experience with Visual Basic (.NET) a plus
- Experience with SaaS-based solutions a plus
- Financial Services industry knowledge

- Agile Delivery Management
- Incident Management
Benefits & conditions
CGI is required by law in some jurisdictions to include a reasonable estimate of the compensation range for this role. The determination of this range includes various factors not limited to skill set, level, experience, relevant training, and licensure and certifications. To support the ability to reward for merit-based performance, CGI typically does not hire individuals at or near the top of the range for their role. Compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range for this role in the U.S. is $58,800.00 - $156,700.00.

CGI's benefits are offered to eligible professionals on their first day of employment to include:
- Competitive compensation
- Comprehensive insurance options
- Matching contributions through the 401(k) plan and the share purchase plan
- Paid time off for vacation, holidays, and sick time
- Paid parental leave
- Learning opportunities and tuition assistance
- Wellness and Well-being programs