Senior Systems Engineer (Production Support)
Job description
Location: Remote
Department: Veterans Affairs (VA)
Type: Full Time
Min. Experience: Experienced
Security Clearance Level: Public Trust (MBI)
Military Veterans are highly encouraged to apply!
Essential Duties and Responsibilities
-
Perform high-level, day-to-day operational support of complex cloud application systems
-
Develop solutions to routine technical problems of limited scope
-
Follow standard practices and procedures in analyzing situations or data from which answers can be readily obtained
-
Detect, isolate, document, rapidly report, and resolve system outages or problems encountered during operation of the scientific workstations, including collecting diagnostic data, restoring system operation, developing workarounds, and performing other activities necessary to recover a system
-
Accurately document problems in logging and discrepancy reporting tools
-
Work directly with the customer in most aspects of the day-to-day activities
-
Respond to user calls regarding hardware and software problems, correcting the problems or ensuring that they are escalated when required
-
Communicate the status of key problems to users and senior management
-
Perform maintenance/installation of computing infrastructure
-
Implement continuous improvement methodology through the use of IT systems or procedures
-
Maintain inventory of system assets
-
Ensure compliance with VA standards and security policies
-
Provide documentation, training, and additional duties as assigned
Requirements
-
Experience with AWS and/or Azure cloud services, particularly S3, Step Functions, Batch jobs, and CloudWatch
-
Proven track record of managing and maintaining large-scale production systems
-
Strong proficiency in Linux/Unix and Windows operating systems
-
Experience with system administration tasks including user management, permissions, and system monitoring
-
Proficiency in scripting languages such as Python, Bash, or PowerShell for automation and configuration management
-
Experience with automation tools like Ansible, Puppet, Chef, or SaltStack
-
Knowledge of cloud-native technologies and infrastructure-as-code (IaC) tools such as CloudFormation
-
Experience with monitoring tools like Dynatrace, ScienceLogic, and CloudWatch