Lead Site Reliability Engineer

Midnite

📍 Remote - Australia

Summary

Join Midnite as the first Lead Site Reliability Engineer in Australia to establish and lead a new Site Operations team. You will be responsible for the 24/7 operational reliability of Midnite's systems, working closely with the Platform and Backend Engineering teams. This critical leadership role focuses on scaling incident response, on-call capabilities, and site reliability engineering globally: you will define, build, and scale the function so that incidents are detected, responded to, and resolved promptly. It is a high-impact, high-autonomy role in which you lead the charge during incidents and improve the systems behind them. You will be challenged and rewarded for your contributions to Midnite's success.

Requirements

  • Proven experience managing SRE or Ops-focused engineering teams in a high-availability production environment
  • Deep understanding of incident management, including building and leading effective on-call rotations
  • Calm under pressure: when incidents happen, you're the person who leads the charge, not just reacts
  • Strong systems-thinking mindset: you understand how infrastructure, code, and people interact under load
  • You've not only kept complex systems running - you've improved them. You likely build better tools than the ones you inherited, especially in Python
  • You believe Infrastructure as Code isn't optional - it's the default. Declarative, repeatable infrastructure management is in your DNA
  • A strong, informed stance on tooling and architecture - you're opinionated because you've earned it through experience, not guesswork
  • You've moved beyond legacy stacks - Ansible and EC2 are fine, but you know there's better, and you've implemented it
  • You come from the software side of the house - you've written, shipped, and supported production systems, and your approach to SRE is grounded in real engineering experience, not traditional IT ops
  • Experienced with cloud infrastructure (e.g., AWS, ECS, RDS), observability stacks (e.g., Datadog, CloudWatch, Sentry), and CI/CD pipelines
  • A collaborative, async-native communicator who thrives in distributed teams across time zones

Responsibilities

  • Define, build, scale, and lead a Site Operations team of SREs and backend engineers
  • Design and implement a global on-call strategy (possibly a hybrid in-house and contractor mix)
  • Drive improvements in observability, alerting, incident management, and postmortem culture
  • Collaborate closely with the Platform and Backend teams to ensure production readiness, reliability goals, and smooth handoffs
  • Help hire talent in complementary time zones (e.g., Americas or APAC) to enable true 24/7 support
  • Define SLAs/SLOs in partnership with product and platform engineering
  • Own key infrastructure reliability KPIs and lead remediation efforts proactively
  • Lead operational reviews and manage the process for ongoing improvement in incident response

Preferred Qualifications

  • Experience working in betting, gaming, or other regulated real-time systems
  • Experience scaling globally distributed engineering functions

Benefits

  • Shape our future: Play a key role in our team's success, where your voice matters, and you'll have a direct impact on shaping Midnite's future
  • Connect and unwind: Take part in our quarterly gatherings where our community comes together to bond and have fun
  • Comprehensive health coverage: Look after your well-being with our outstanding zero-excess health insurance plan, which includes optical and dental coverage
  • Simplify life: Take advantage of our nursery salary sacrifice scheme, allowing you to conveniently pay your child's nursery fees straight from your paycheck
  • Work-life balance: Enjoy 25 paid holidays a year, plus generous paid maternity, paternity, and adoption leave, supporting you during life's most important moments
  • Productive home office: We provide everything you need for a comfortable and ergonomic home setup, ensuring you're as productive as possible
  • Flexible working: We embrace flexible working, allowing you to adjust your schedule when life's unexpected moments arise
  • Latest tech made easy: With our salary sacrifice schemes, you can upgrade to the latest gadgets, household items, and mobile tech without the upfront cost
  • Exclusive perks: Enjoy a wide range of discounts on retailers, groceries, and subscriptions, making life a little more affordable
  • Grow with us: Expand your skills through internal and external learning opportunities while benefiting from access to mentorship programs that support your development
  • Transparent compensation: We provide competitive pay with clear team bandings and salary grids, ensuring that salary discussions are simple and fair
  • Constructive feedback: We foster a transparent culture, encouraging individual feedback and review sessions to help everyone improve
