Platform Operations 1
Job description
At HDR, our employee-owners are fully engaged in creating a welcoming environment where each of us is valued and respected, a place where everyone is empowered to bring their authentic selves and novel ideas to work every day. As we foster a culture of inclusion throughout our company and within our communities, we constantly ask ourselves: What is our impact on the world?
Watch Our Story: https://www.hdrinc.com/our-story
Build, update, and maintain operational dashboards and alerting configurations for VCF platform services.
Monitor platform health, events, and service indicators to identify issues requiring escalation.
Support incident response activities by gathering logs, alerts, and system context for troubleshooting.
Assist with defining and tracking service performance metrics such as availability, latency, and incident response measures.
Maintain documentation for monitoring standards, alert thresholds, escalation procedures, and operational runbooks.
Participate in post-incident reviews and help document findings and corrective actions.
Support baseline performance reporting and assist with capacity trend analysis.
Help integrate observability and monitoring tools with enterprise platforms such as ITSM and reporting systems.
Follow operational, security, and compliance standards established for the platform environment.
Schedule & Presence: This on-site role supports 24/7 operations through real-time collaboration. Standard shifts occur within a 6:00 AM - 6:00 PM window, Monday through Friday. This position also requires scheduled on-call flexibility and the ability to remain reasonably reachable during off-hours for critical business continuity.
Requirements
Exposure to VMware Cloud Foundation, vRealize Operations, or VCF Operations.
Familiarity with Dynatrace, Splunk, Grafana, Azure Monitor, or similar observability tools.
Exposure to ServiceNow or other ITSM platforms.
Understanding of service level concepts such as SLOs, SLAs, and MTTR.
Awareness of security monitoring, access controls, and audit requirements in enterprise infrastructure.
Required Qualifications
Bachelor's degree in Information Technology, Computer Science, Engineering, or related field, or equivalent practical experience.
Minimum 1 year of experience in infrastructure operations, monitoring, systems administration, or platform support.
Foundational experience with infrastructure monitoring, alerting, or observability tools.
Basic understanding of incident management concepts and service support processes.
Exposure to VMware vSphere administration or virtualized infrastructure environments.
Basic understanding of scripting or automation concepts using PowerShell, Python, or similar tools.
If you are required to drive for us, we require a valid driver's license and compliance with our vehicle policy.