GCP Architect

Net2Source
Sunnyvale, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Remote
Sunnyvale, United States of America

Tech stack

Amazon Web Services (AWS)
Azure
Software as a Service
Cloud Computing
Code Review
Computer Security
Software Design Patterns
Distributed Systems
Elasticsearch
Identity and Access Management
Virtual Private Networks (VPN)
OpenVPN
PCI Data Security Standards
Public Key Infrastructure
Role-Based Access Control
Management of Software Versions
Google Cloud Platform
Cloud Platform System
Istio
Terraform

Requirements

  • 8-12 years of infrastructure / platform engineering, with 3+ years as a principal-level technical authority on a production cloud platform
  • Deep GCP expertise - you have designed GCP organizations, multi-tenant GKE environments, VPC architectures, and IAM models for production workloads; you can defend design decisions in an Org Policy discussion as readily as a Terraform code review
  • Terraform mastery - multi-module design patterns, per-tenant factory modules, complex for_each and dynamic blocks, state isolation strategy, module versioning; you have written Terraform that other engineers build on
  • ArgoCD at scale - ApplicationSets, multi-cluster agent/pull model, promotion gates, RBAC, HA; you have operated ArgoCD across 20+ clusters, not just installed it
  • Multi-tenant networking depth - CIDR management, IPAM tooling, VPC peering/PSC design, CGNAT or equivalent overlapping-address problem solving; you have solved customer CIDR conflicts at scale
  • Security architecture - VPC Service Controls, Binary Authorization, Cloud KMS/CMEK, Workload Identity, IAP zero-trust, least-privilege IAM; you have designed the security model for a compliance-audited SaaS platform
  • Distributed systems intuition - you can evaluate trade-offs between Consul/Vault on VMs vs. containerized, between Elasticsearch and OpenSearch, between service mesh and no service mesh, and produce a written rationale that holds up under scrutiny
  • Strong written communication: architecture documents, decision records, and design reviews are your primary output alongside code

Strong Plus:

  • HA VPN / OpenVPN architecture with per-tenant PKI at scale (cert lifecycle, rotation automation, GCP CA Service)
  • EU Sovereign Cloud experience: GCP Assured Workloads, AWS EU Sovereign, Azure EU, SecNumCloud, BSI C5, GDPR DPA design
  • HYOK/BYOK with external KMS (Thales CipherTrust, HSM) - hands-on architectural experience, not just theoretical knowledge
  • Temporal.io workflow architecture for multi-step provisioning orchestration
  • Experience building agentic or AI-augmented infrastructure pipelines
  • SOC2 Type II, ISO 27001, or PCI-DSS architecture-to-controls mapping (you've been in the audit room)
  • Elasticsearch / OpenSearch cluster architecture at production scale
  • Google Cloud Professional Cloud Architect certification (required within 90 days if not already held)

Benefits & conditions

Own the GCP organization design end-to-end: folder hierarchy (Platform-Infrastructure, Customer-Hosting/Americas/EMEA/APAC, Engineering, OF, SE, PM), project naming conventions, IAM group model (sav-eic-* Google Groups, least-privilege role bindings), and Organization Policy framework (region constraints, external IP restrictions, SA key prevention, domain-restricted sharing, uniform bucket access)

Define and document the per-customer tenant isolation model: dedicated GCP project + VPC + GKE cluster per environment (prod/nonprod) - full billing, permission, and operational isolation. Own trade-off analysis between this model and namespace-level isolation as customer count grows

Resolve the critical open gaps in the current architecture: IPAM tooling selection, ArgoCD sharding strategy at 50-100+ clusters, PKI strategy for SC2, Well-Architected Framework compliance gaps between MVP and production paths

Networking & CGNAT Architecture

Own the CGNAT (RFC 6598, 100.64.0.0/10) per-tenant addressing design: /23 CIDR allocation framework (App /24, Web /25, Mgmt /26, GKE Master /28, PSA /24), IPAM tooling selection and integration into the provisioning pipeline

Design the full Connect 2.0 (SC2) architecture: HA OpenVPN topology (primary + secondary VM per tenant, different zones, CGNAT side), PKI strategy (GCP CA Service Root CA + Issuing CA), per-tenant certificate lifecycle (generation, rotation, expiry alerting, revocation), Firestore tenant config schema, Cloud Function orchestration (connect-health-probe, connect-failover, connect-failback), and .ovpn dual-endpoint bundle design

Define VPC routing logic: custom node tags - active SC2 VM for RFC-1918 ranges, pod-traf
