HPC NVLink Operations Engineer

CoreWeave

💵 $80k-$120k
📍 Remote - United States

Summary

Join CoreWeave, a leading AI hyperscaler, as an NVLink Operations Engineer supporting large-scale data center deployments. You will be responsible for the deployment and lifecycle management of NVLink systems, diagnosing and resolving performance issues, collaborating with global teams and customers, and taking part in a rotating on-call schedule that provides 24/7 support coverage. The role requires a basic understanding of networking fundamentals, experience troubleshooting network and server hardware, Linux system administration skills, and excellent communication skills. CoreWeave offers a competitive salary and comprehensive benefits, including medical, dental, vision, life, and disability insurance, a 401(k), flexible PTO, and more. The company prioritizes a hybrid work environment, with remote work considered for candidates located far from an office. CoreWeave is committed to fostering an inclusive and supportive workplace.

Requirements

  • Basic understanding of networking fundamentals
  • Experience troubleshooting network and server hardware at the component level
  • Experience with Linux system administration
  • Ability to troubleshoot and debug complex application issues
  • Excellent communication and collaboration skills

Responsibilities

  • Support the deployment of NVLink systems across large data center environments
  • Support the full lifecycle management of NVLink hardware and software components
  • Diagnose and resolve performance, connectivity and stability issues in complex environments
  • Collaborate with internal teams and external customers worldwide
  • Participate in a rotating on-call schedule to ensure 24/7 support coverage

Preferred Qualifications

  • Experience working in large-scale environments (1,000+ switches or nodes)
  • Familiarity with Ansible
  • Understanding of Redfish API for system management
  • Experience with NVUE (NVIDIA User Experience) or a similar network-based CLI
  • Experience with Grafana/PromQL
  • Proficiency in at least one programming language (e.g., Python, Go)

Benefits

  • Medical, dental, and vision insurance - 100% paid for by CoreWeave
  • Company-paid Life Insurance
  • Voluntary supplemental life insurance
  • Short- and long-term disability insurance
  • Flexible Spending Account
  • Health Savings Account
  • Tuition Reimbursement
  • Mental Wellness Benefits through Spring Health
  • Family-Forming support provided by Carrot
  • Paid Parental Leave
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our office and data center locations
  • A casual work environment
  • A work culture focused on innovative disruption
