Senior Site Reliability Engineer

Xero
Summary
Join Xero's Product SRE team as a Senior Engineer and contribute to the company's Product SRE strategy and the transformation of its SRE culture. You will work with highly experienced Site Reliability Engineers, build relationships with product engineering teams, and build a culture of continuous improvement to ensure product reliability. Responsibilities include contributing to daily deliverables, building long-term relationships with engineering teams, creating and monitoring quality standards, assisting with training, and participating in a 24/7 on-call roster. The ideal candidate will have a strong software engineering and SRE background, experience mentoring engineers, a passion for customer experience, and broad technical understanding of modern cloud technologies. Xero offers generous paid leave, wellbeing programs, health insurance, life insurance, income protection, parental leave, an employee share plan, flexible working, and career development.
Requirements
- Strong software engineering and hands-on SRE background, with experience leading initiatives in a highly technical team
- Proven experience mentoring engineers in a fast-growing company
- Obsessed with delivering a high-quality and highly stable customer experience. Passion for customer-first thinking, with a strong product mindset that helps you understand and anticipate customer needs
- Broad and deep technical understanding of modern cloud technologies (AWS, Azure, GCP) and of incident and problem management practices, particularly for high-growth, high-availability SaaS-based transactional systems
- Proficiency in one or more object-oriented programming languages (C#, JavaScript, Java, Python, etc.) or experience with infrastructure-as-code (e.g. Terraform, CloudFormation)
- Experience using observability tooling to monitor the health of a highly distributed system
Responsibilities
- Contribute to the completion of the day-to-day deliverables of a dedicated product SRE team
- Build long-term relationships with product engineering teams, ensuring everyone can deliver on system reliability with a theme of continuous improvement
- Build a culture of continuous improvement so that product reliability keeps improving and the impact of issues is reduced; create and actively monitor quality standards for SRE teams and report regularly on adherence to them
- Assist with ongoing training across the business to ensure reliability requirements are well understood and incorporated into product designs
- Participate in a 24/7 global on-call roster, focusing on incident response and remediation
Preferred Qualifications
- Any experience with reliability concepts such as capacity management, autoscaling, safe deployment and releases, software strategies for reliability, fault tolerance, and graceful failure would be highly beneficial
- Understanding of human factors, safety science, and resilience engineering is also valuable
Benefits
- Offering very generous paid leave to use however you'd like (plus statutory holidays!)
- Dedicated paid leave to care for your physical and mental wellbeing as well as an Employee Assistance Program to access mental health care for you and your family
- Health insurance
- Life insurance
- Income protection
- Wellbeing and sports programmes
- Employee resource groups
- 26 weeks of paid parental leave for primary caregivers
- An Employee Share Plan
- Beautiful offices
- Flexible working
- Career development
- And many other benefits that reflect our human value