Network Architect, AI Infrastructure & Data Center Networks
Job description
We are seeking a highly experienced Senior Network Architect to lead the design, architecture, and evolution of large-scale AI/ML, data center, and backbone network infrastructure. The ideal candidate will have deep expertise in high-performance networking, multi-terabit WAN architectures, EVPN/VXLAN fabrics, network automation, and cloud-scale infrastructure supporting AI workloads.
Key Responsibilities
- Design and architect large-scale AI/ML data center networks and high-capacity WAN infrastructure.
- Lead deployment of EVPN/VXLAN fabrics supporting GPU clusters and AI training environments.
- Drive network scalability, reliability, performance, and automation initiatives across global infrastructure.
- Design and optimize low-latency, high-throughput networks supporting RDMA/RoCE workloads.
- Develop network automation solutions using Python, Ansible, Terraform/OpenTofu, and CI/CD pipelines.
- Define network standards, operational processes, observability frameworks, and reliability best practices.
- Collaborate with infrastructure, cloud, systems, and AI engineering teams on strategic architecture initiatives.
- Lead troubleshooting and performance optimization for large-scale production environments.
- Mentor engineers and contribute to technical leadership, documentation, and architecture reviews.
Requirements
- 15+ years of experience in Network Architecture, Network Engineering, or Network Reliability Engineering.
- Deep expertise with:
  - BGP, OSPF, IS-IS, MPLS
  - EVPN/VXLAN
  - Data Center Networking
  - WAN and Backbone Architecture
  - AI/ML Infrastructure Networking
  - Network Performance and Capacity Planning
- Strong experience with Juniper, Arista, Cisco, and multi-vendor environments.
- Hands-on experience with Linux administration and network automation.
- Strong scripting/programming skills in Python, Go, Bash, or similar languages.
- Experience with Infrastructure-as-Code and automation frameworks (Ansible, Terraform/OpenTofu, Pulumi).
- Experience building highly available, scalable cloud and data center networks.
Preferred Qualifications
- Experience supporting AI training clusters, GPU fabrics, or HPC environments.
- Knowledge of PTP, RDMA, RoCEv2, and low-latency networking technologies.
- Experience with network observability platforms such as Kentik, ThousandEyes, Zabbix, Nagios, or similar.
- Exposure to AWS, Google Cloud Platform, and hybrid cloud networking architectures.
- Experience leading architecture reviews and cross-functional infrastructure programs.
Nice to Have
- Experience with large-scale hyperscaler environments.
- Participation in industry organizations such as NANOG, RIPE, or Internet Society.
- Background supporting multi-terabit AI or research infrastructure environments.