Production Support Engineer III
Truist Inc
Atlanta, United States of America
2 days ago
Role details
Contract type: Temporary contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Atlanta, United States of America
Tech stack
IBM AIX
Amazon Web Services (AWS)
Software Applications
Azure
Computer Engineering
IBM DB2
Linux
Microsoft SQL Server
OpenShift
Oracle Applications
Ansible
SharePoint
Shell Script
Software Engineering
Cloud Platform System
GitLab
Kubernetes
Information Technology
Performance Monitor
CloudWatch
Splunk
Dynatrace
Docker
ServiceNow
Job description
- Ensure the operational integrity, availability, and performance of mission-critical systems.
- Manage technical incidents, troubleshoot recurring issues, and implement permanent solutions to maintain system stability.
- Collaborate with cross-functional teams to resolve incidents efficiently and improve system resiliency through proactive monitoring and automation.
- Handle the identification, triage, and resolution of medium-to-high priority incidents with minimal supervision to ensure business operations are minimally impacted.
- Collaborate with development teams, business partners, and other stakeholders to diagnose and resolve technical issues, implementing long-term fixes to prevent incident recurrence.
- Use monitoring tools (e.g., Splunk, Dynatrace, CloudWatch) to detect performance issues and execute corrective actions promptly.
- Enhance system observability to proactively detect issues and improve overall system performance and stability.
- Develop and maintain automation scripts to streamline routine production support tasks, reducing manual interventions.
- Implement automation strategies to improve production stability and minimize downtime.
- Maintain clear and detailed documentation of troubleshooting procedures, contributing to the shared knowledge base.
- Provide assistance in improving the incident, problem, and change management processes, following ITIL best practices.
- Participate in root cause analysis and suggest process improvements to enhance system stability and performance.
- Collaborate with cross-functional teams in resolving recurring production support issues and optimizing workflows.
- Actively mentor junior support engineers, fostering technical growth within the team.
Requirements
- Must have a Bachelor's degree in Computer Science, Computer Engineering, CIS, or a related technical field.
- Must have 6 years of progressive experience in production support positions performing the following:
  - Performing incident management, triage, and production support functions for both on-premises and cloud environments.
  - Demonstrating proficiency with IT Service Management (ITSM) tools such as ServiceNow, and familiarity with incident, problem, and change management processes.
  - Demonstrating an understanding of infrastructure, application technology stacks, and the software development lifecycle.
  - Working with Dynatrace, Splunk, CloudWatch, DB2, SQL Server, Oracle, Microsoft Azure, SharePoint Development, AWS, OpenShift, Kubernetes, GitLab, Ansible, Shell script, Linux & AIX, IBM PowerHA, and Docker.
- Position may be eligible for hybrid or remote work but is based out of and reports to Truist offices in Atlanta, GA. Must be available to travel to Atlanta, GA regularly, with 24 hours' notice, for meetings and reviews with manager and project teams.