Lead Site Reliability Engineer

MongoDB

💵 $147k-$289k
📍 Remote - United States

Summary

Join MongoDB's Fabric team as a Lead Site Reliability Engineer (SRE) and manage a team that builds and maintains a robust, secure, and efficient multi-cloud network infrastructure. Leverage your expertise in networking, distributed systems, and automation to ensure system resilience, scalability, and reliability. This role requires 10+ years of experience in software and distributed systems, with deep networking expertise, and 2+ years of team management experience. The ideal candidate possesses a customer-focused mindset, values automation, and is familiar with modern cloud infrastructure. The position offers hybrid work accommodations or fully remote options within North America and a comprehensive benefits package.

Requirements

  • Have 10+ years of experience building software and operating distributed systems, with deep expertise in networking fundamentals and a good understanding of how the internet works, e.g. TCP/IP (including IPv6), DNS, TLS/mTLS, BGP, tunnels, overlays, and SDN principles
  • Have 2+ years of experience managing engineering teams, fostering a positive team culture and handling career growth and performance conversations
  • Possess a customer-focused mindset, driving improvements that benefit end-users
  • Value efficiency in processes and operations, and display a strong preference for automation over manual processes (“allergic to ops work”)
  • Be intimately familiar with modern cloud-based infrastructure and the network design primitives of at least one of AWS, Azure, or GCP, e.g. VPCs, subnetting, routing, VPNs, peering, private link / private service connect, and CDNs
  • Have a strong knowledge of service mesh and load-balancing concepts, and be eager to implement these in a multi-cloud environment

Responsibilities

  • Lead a team of engineers, setting direction, removing blockers, and ensuring alignment with organizational goals
  • Oversee the development of a reliable and resilient multi-cloud, globally connected network that is crucial for MongoDB’s services
  • Collaborate with service-owning teams to provide internal support, addressing technical issues and offering guidance on best practices for service-to-service connectivity
  • Participate in a 24/7 on-call rotation to swiftly resolve issues related to network architecture and service-to-service connectivity, ensuring minimal disruption and high availability

Benefits

  • Flexible paid time off
  • 20 weeks of fully paid, gender-neutral parental leave
  • Fertility and adoption assistance
  • 401(k) plan
  • Mental health counseling
  • Access to transgender-inclusive health insurance coverage
  • Health benefits
  • Equity
  • Participation in the employee stock purchase program
