Fleet Monitoring And Analysis Engineer

CoreWeave
Summary
Join CoreWeave's Fleet Monitoring & Analysis Team as an Engineer to contribute to the automated provisioning and management of its expanding hardware fleet. You will design and implement solutions for large-scale server observability, adapt open-source solutions, generate reports and visualizations, and create test plans and automation. This role involves working with a team of engineers focused on managing high-performance hardware at scale and participating in an on-call rotation. The ideal candidate has 2+ years of experience in software or infrastructure engineering, experience with automation and orchestration, and experience implementing metrics collection and alerting. CoreWeave offers a competitive salary, comprehensive benefits, and a hybrid work environment.
Requirements
- Have 2 or more years of experience in software or infrastructure engineering
- Have experience with automation and orchestration workflows, and are knowledgeable about server hardware, components, and related technologies, as well as strategies for managing physical infrastructure at scale
- Have experience implementing metrics collection and alerting on standard platforms
- Believe in the value of automation and will champion practices that drive reliability and prioritize the CoreWeave customer experience
- Have work authorization that does not require sponsorship from the company now or in the future
Responsibilities
- Design and implement solutions for large-scale server observability to continually improve the stability of CoreWeave’s global hardware fleet
- Adapt, extend, and implement open-source solutions to augment the depth and breadth of our visibility into our operating environment
- Generate and maintain custom reports, alarms, and visualizations to help teams understand and respond to our growth and changes
- Create test plans, deployment automation, dashboards, alerts, and insights into our fleet operations, as well as participate in the Fleet Engineering Developers’ on-call rotation
Preferred Qualifications
Bring your own diverse experiences to our teams, even if you aren't a 100% skill or experience match
Benefits
- Medical, dental, and vision insurance - 100% paid for by CoreWeave
- Company-paid life insurance
- Voluntary supplemental life insurance
- Short- and long-term disability insurance
- Flexible Spending Account
- Health Savings Account
- Tuition Reimbursement
- Mental Wellness Benefits through Spring Health
- Family-Forming support provided by Carrot
- Paid Parental Leave
- Flexible, full-service childcare support with Kinside
- 401(k) with a generous employer match
- Flexible PTO
- Catered lunch each day in our office and data center locations
- A casual work environment
- A work culture focused on innovative disruption
- Hybrid workplace, offering employees flexibility in how they structure their time between in-office and remote work