Senior Network Engineer

STN, Inc.
Oakland, United States of America
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
Oakland, United States of America

Tech stack

Border Gateway Protocol
Computer Clusters
Configuration Management
Complex Networks
Network Congestion
Data Centers
DDoS Mitigation
InfiniBand
Virtual Private Networks (VPN)
Multi-protocol Systems
Python
Network Monitoring
Routing
Open Shortest Path First
Overlay Transport Virtualization
Peering
Ansible
Virtual Local Area Networks
Wide Area Networks
Amazon Web Services (AWS)
Open Network Automation Platform
Cisco networks

Job description

The Senior Network Engineer designs, deploys, and operates the high-performance networking fabric supporting GPU clusters. This includes InfiniBand and RoCE fabrics for training workloads, customer-facing connectivity, and the wide-area network that connects STN sites and customer environments.

  • Design and configure InfiniBand or RoCE fabrics optimized for GPU training and distributed inference
  • Configure and operate switching, routing, and customer VLAN/VRF/VPC architectures
  • Manage BGP peering, public IP space, anycast, and DDoS protection
  • Design customer connectivity including cross-connects, dedicated links, VPN, and SD-WAN
  • Maintain network automation, configuration management, and source-of-truth tooling
  • Coordinate with the NOC on network monitoring, alerting, and runbook authoring
  • Troubleshoot complex network issues across layers 1 through 7
  • Maintain network documentation, diagrams, and operational runbooks
  • Drive network capacity planning aligned to fleet growth and customer commitments
  • Support security and compliance audits including SOC 2 and customer security reviews

Requirements

Do you have experience in Open Shortest Path First (OSPF) implementation?

  • 7+ years in network engineering with data center or service provider experience
  • Deep expertise in InfiniBand or RoCE (RoCEv2), including congestion control and NCCL tuning
  • Strong knowledge of BGP, OSPF, MPLS, VXLAN, and EVPN
  • Hands-on experience with Arista, NVIDIA Mellanox/Spectrum, or Cisco platforms
  • CCIE, JNCIE, NCIE, or equivalent advanced certification strongly preferred
  • GPU cluster networking experience at multi-thousand-GPU scale
  • SDN and automation skills (Ansible, Python, Nautobot, or Netbox)
  • Multi-site WAN and peering experience including IX participation
  • Familiarity with NVIDIA Cumulus, SONiC, or open networking stacks
