Infrastructure Operations Engineer

BridgePhase, LLC
Portland, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
Portland, United States of America

Tech stack

Agile Methodologies
Amazon Web Services (AWS)
Command-Line Interface
Cloud Computing Security
Cloud Engineering
Configuration Management
Computer Networks
Continuous Integration
Disaster Recovery
DNS
Drupal
Monitoring of Systems
Identity and Access Management
Subnetting
Virtual Private Networks (VPN)
Routing
Performance Tuning
Ansible
Software Vulnerability Management
Web Applications
Datadog
Data Logging
Delivery Pipeline
CloudFormation
Containerization
GitLab CI
Kubernetes
Route53
CloudWatch
Terraform
Splunk
New Relic (SaaS)
DevSecOps
Docker
ELK
Jenkins

Job description

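BridgePhase is seeking an Infrastructure Operations Engineer to join our team supporting the Department of Homeland Security (DHS). This role provides Tier 3 operational support across both AWS cloud infrastructure and an enterprise Drupal-based information sharing platform, combining deep infrastructure operations expertise with hands-on application and platform troubleshooting. The ideal candidate thrives in production environments, excels at incident response, and partners closely with development, DevSecOps, and security teams to maintain system stability, performance, and compliance. This is a remote position.

In this position, you can expect to: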

  • Provide Tier 3 support for complex infrastructure and application-related incidents
  • Monitor system health, performance metrics, application logs, and infrastructure telemetry
  • Troubleshoot and resolve production issues across AWS infrastructure and Drupal-based platforms
  • Support AWS cloud services including compute, storage, networking, and security components
  • Investigate and diagnose performance bottlenecks, resource constraints, and configuration issues
  • Support CI/CD pipeline operations and troubleshoot deployment or release failures
  • Perform root cause analysis for recurring incidents and implement preventive measures
  • Coordinate incident response and resolution with development, DevSecOps, security, and infrastructure teams
  • Execute routine maintenance tasks including patching, scaling, backups, and system updates
  • Support deployment activities and release verification in production environments
  • Manage user support tickets and ensure timely resolution within SLA requirements
  • Maintain and update technical documentation for operational procedures and known issues
  • Implement and maintain monitoring alerts, logging, and automated health checks
  • Support disaster recovery testing and business continuity planning
  • Ensure compliance with federal security requirements and audit controls
  • Interface with federal stakeholders on operational status, issue escalation, and resolution
  • Collaborate with AWS support and third-party vendors for escalated technical issues

Requirements

As with any technical environment, the exact role responsibilities will evolve with the changing needs of our client. We are seeking versatile candidates who thrive on new challenges and can readily adapt to additional responsibilities beyond those listed above.

Preferred Experience and Qualifications:

  • At least 8 years of total professional experience with 5+ years in infrastructure operations, cloud engineering, or production support roles
  • Prior or current experience supporting government programs (GovCon experience required)

Strong technical knowledge and expertise in:

  • AWS core services (EC2, S3, RDS, VPC, ELB/ALB, CloudFront, Route53)
  • Cloud security services (IAM, Security Groups, KMS, CloudTrail, GuardDuty)
  • Infrastructure monitoring and observability (CloudWatch, Datadog, New Relic, or similar)
  • Infrastructure as Code (Terraform, CloudFormation, Ansible)
  • CI/CD pipeline operations (Jenkins, GitLab CI, AWS CodePipeline)
  • Linux/Unix system administration and command-line tools
  • Networking concepts (VPCs, subnets, routing, VPNs, DNS)
  • Log aggregation and analysis (CloudWatch Logs, ELK stack, Splunk)
  • Container technologies (Docker, ECS, EKS, Kubernetes)

Demonstrated ability to:

  • Manage and escalate production incidents
  • Troubleshoot complex issues under pressure in live environments
  • Perform root cause analysis and implement long-term fixes
  • Support security incident response and vulnerability remediation
  • Execute change management and configuration control
  • Maintain clear, accurate technical documentation
  • Work within ITIL or similar service management frameworks

Working knowledge and familiarity with:

  • Federal security and compliance requirements (FedRAMP, FISMA, NIST)
  • DevSecOps practices and automation tooling
  • Backup, recovery, and disaster recovery procedures
  • Web application architecture and performance optimization
  • Database operations, backup/restore, and performance tuning
  • Agile development and operations methodologies
  • SLA management, KPIs, and operational reporting

Nice to Have (Strong Plus):

  • Hands-on Drupal experience, including operational support, troubleshooting, or performance optimization
  • AWS infrastructure engineering experience beyond operations (design, modernization, or large-scale cloud migrations)
  • Experience supporting enterprise-scale or mission-critical information sharing platforms
  • AWS certifications (Solutions Architect, SysOps Administrator, Security Specialty)

Benefits & conditions

While we've outlined our ideal candidate, we recognize that talent comes in many forms. If you don't check every box but possess a strong design aptitude, a passion for creating exceptional user experiences, and a drive to learn and grow, we strongly encourage you to apply. We value designers who demonstrate curiosity, adaptability, and a solid foundation in user-centered design principles. If you're excited about the challenge of enhancing digital experiences for government agencies and are willing to dive into new technologies and methodologies, we want to hear from you. Our team thrives on diverse perspectives and experiences.

About Our Company:

At BridgePhase, our values shape our culture and guide our actions. We act with integrity, honesty, and respect, earning trust and fostering collective success. We are critical thinkers and problem solvers, driving innovation and positive disruption to solve hard challenges at speed and scale. Our work is characterized by courage, compassion, commitment, and teamwork. We apply disciplined engineering principles and a proven agile approach to deliver flexible, simplified, durable, and high-performing solutions with lasting impact. Additionally, we invest in our communities through strategic charitable initiatives, empowering our employees to make meaningful contributions to causes they are passionate about.

Our Benefits:

We pride ourselves on providing top-tier benefits that rival those found in larger organizations. Some of the perks our team enjoys include:

  • Competitive compensation that reflects your skills and impact

  • Multiple bonus programs rewarding performance, company growth, and employee referrals

  • Flexible PTO with 20 days to use when you need them

  • All federal holidays paid to help you truly recharge

  • Paid sick leave because health always comes first

  • 100% paid parental leave

  • 401(k) with 6% match and no vesting period

  • Top-tier medical, dental, and vision plans with low out-of-pocket costs

  • Short- and long-term disability and life insurance included

  • Pet insurance to support your four-legged family

  • Annual professional development budget for training, certifications, and conferences

  • Two paid community service days for causes that matter to you

  • Social pod budget to connect with teammates wherever you live

About the company

BridgePhase is a software engineering company focused on designing, building, securing, and operating cutting-edge software solutions that drive mission success and operational excellence for Federal Government organizations. Our mission is to empower our clients and employees to realize their potential, achieve amazing results, and advance the mission of our Federal Government. We do this by providing an environment that fosters growth, innovation, collaboration, and delivery excellence needed to achieve successful and lasting IT modernization. With BridgePhase, federal agencies gain a trusted partner dedicated to delivering high-performing solutions that advance the nation's most critical objectives.

BridgePhase is seeking an Infrastructure Operations Engineer to join our team supporting the Department of Homeland Security (DHS). This role provides Tier 3 operational support across both AWS cloud infrastructure and an enterprise Drupal-based information sharing platform.
