Staff Network Engineer

Voltage Park

πŸ“Remote - Worldwide

Summary

Join Voltage Park as a Staff Network Engineer and help build and operate the backbone of a high-performance AI infrastructure. You will design, deploy, and support large-scale network systems connecting GPU clusters, storage, and compute environments across data centers, working with Principal Engineers and cross-functional teams to deliver automation-driven, low-latency networking for AI and HPC workloads. Day to day, you will implement and maintain high-throughput, low-latency networks; deploy and troubleshoot network systems; operate and optimize layer 2/3 network services; develop and maintain network automation; monitor network health; participate in incident response; and maintain configuration standards. You will also collaborate on architectural decisions and vendor evaluations.

Requirements

  • 5–8+ years of hands-on experience in large-scale network engineering, data center networks, or service provider infrastructure
  • Strong knowledge of IP networking, BGP, OSPF, EVPN/VXLAN, and L2/L3 design principles
  • Experience configuring and operating Arista, Juniper, or Cisco platforms in production environments
  • Proficiency in scripting or automation (e.g., Python, Bash, Ansible)
  • Solid troubleshooting skills and experience with real-time diagnostics and packet analysis
  • Familiarity with monitoring and telemetry tools (e.g., Prometheus, Grafana, sFlow, InfluxDB)

Responsibilities

  • Implement and maintain high-throughput, low-latency networks supporting AI Factory workloads and distributed training infrastructure
  • Work hands-on to deploy, configure, and troubleshoot routing, switching, optics, and interconnect systems across data centers
  • Operate and optimize layer 2/3 network services: BGP, EVPN/VXLAN, OSPF, MPLS, QoS, and ACLs
  • Work with InfiniBand networking systems and NVIDIA Unified Fabric Manager (UFM)
  • Develop and maintain network automation (e.g., Ansible, Python, Terraform) for provisioning, compliance, and operational workflows
  • Monitor network health and performance using telemetry tools and help scale observability platforms
  • Participate in the incident response rotation and perform root cause analysis on service-impacting events
  • Maintain configuration standards, documentation, and change management in line with infrastructure governance processes
  • Collaborate with the Principal Network Engineer on architectural decisions and vendor evaluations

Preferred Qualifications

  • Experience in AI, HPC, or GPU-based infrastructure
  • Exposure to carrier-grade architectures, DCI, and optical transport systems
  • Exposure to NVIDIA InfiniBand networking systems and components
  • Understanding of network segmentation, security policies, and zero-trust principles
  • Comfort working in 24/7 operational environments and on-call rotations
