Infrastructure Support and Systems Operations Manager

CoreWeave
Summary
Join CoreWeave, a leader in specialized cloud infrastructure, as a versatile manager overseeing Infrastructure Support and Systems Operations teams. This key role involves managing critical physical infrastructure supporting high-performance compute needs. You will lead teams ensuring smooth hardware operations, build a new support group for dedicated infrastructure solutions, and oversee incident resolution. Responsibilities include improving operational efficiency, collaborating cross-functionally, managing client communication, and developing team members. The ideal candidate possesses strong leadership and technical skills, experience in physical infrastructure support, and excellent communication abilities. CoreWeave offers a competitive salary and benefits package, including comprehensive health insurance, paid time off, and a hybrid work environment.
Requirements
- 3+ years of experience leading teams focused on physical infrastructure support and incident resolution
- Knowledge of Linux environments and basic networking
- Strong understanding of server hardware, configuration, and troubleshooting
- Excellent communication skills for collaboration across technical and non-technical teams
- Strong organizational and project management skills
- Applicants must have work authorization that does not require sponsorship from the company now or in the future
Responsibilities
- Manage Systems Operations Team: Lead a skilled team responsible for maintaining and optimizing physical infrastructure across multiple client environments
- Build and Lead a Dedicated Infrastructure Support Team: Develop and manage a team focused on supporting key infrastructure, handling escalations, and ensuring smooth hardware operations
- Incident and Escalation Management: Oversee the resolution of infrastructure-related incidents, collaborating with internal teams to deliver effective solutions
- Operational Efficiency: Improve support processes to enhance efficiency and reduce downtime, ensuring the infrastructure meets client expectations
- Cross-functional Collaboration: Work closely with product, infrastructure, and other teams to ensure seamless delivery of infrastructure resources
- Client Communication: Manage client communication during escalations and issue resolution, ensuring transparency and client satisfaction
- Team Leadership and Development: Mentor team members, developing their skills to manage and maintain critical infrastructure effectively
Preferred Qualifications
- Experience managing technical operations or infrastructure support teams in cloud or data center environments
- Familiarity with distributed computing environments, networking, and storage infrastructure
- Experience with NVIDIA GPU technologies
- Knowledge of Kubernetes, Slurm, and Bright Cluster Manager technologies
Benefits
- Medical, dental, and vision insurance - 100% paid for by CoreWeave
- Company-paid Life Insurance
- Voluntary supplemental life insurance
- Short- and long-term disability insurance
- Flexible Spending Account
- Health Savings Account
- Tuition Reimbursement
- Mental Wellness Benefits through Spring Health
- Family-Forming support provided by Carrot
- Paid Parental Leave
- Flexible, full-service childcare support with Kinside
- 401(k) with a generous employer match
- Flexible PTO
- Catered lunch each day in our office and data center locations
- A casual work environment
- A work culture focused on innovative disruption
- A hybrid workplace that gives employees flexibility in how they structure their time between in-office and remote work
