Summary

Join Owner.com, a rapidly growing restaurant-commerce platform, as a Senior SRE/DevOps Engineer. You will play a crucial role in ensuring the reliability and scalability of our systems, working on site-reliability engineering and DevOps enablement. This position involves designing for uptime, performance, and resiliency, as well as building tools, CI/CD pipelines, and automation. You will collaborate with various engineering teams and contribute to incident response and post-mortems. Your work will directly impact thousands of restaurants and millions of diners daily. The role offers a competitive salary, comprehensive benefits, and a remote-first work environment.

Requirements

5+ years running production workloads on AWS (or GCP/Azure) with infrastructure-as-code (Terraform/CDK/CloudFormation)
Hands-on experience operating container orchestration (ECS, EKS, Kubernetes, Nomad, etc.) and designing blue/green or canary rollouts
Depth in at least two of our core datastores (Postgres, MongoDB, Kafka) including backup/restore, upgrades, and performance tuning
Fluency with CI/CD pipelines (we use Buildkite + GitHub Actions) and a knack for automating everything with shell, Python, or TypeScript
Proven track record setting up monitoring/alerting in Datadog, Prometheus, or similar, with clear SLO/SLA ownership
Strong grasp of Linux networking, load balancing (Cloudflare/ELB), and CDN/edge-security concepts
Excellent incident-management and root-cause analysis skills; able to write crisp RCAs and follow through on action items
Passion for customer-centric thinking, rapid iteration, and continuous learning

Responsibilities

Design for reliability: Set SLOs/SLIs, build self-healing architectures, and drive incident-prevention projects that keep p95 latency for our APIs and real-time ordering flows under 100 ms
Own observability: Level up dashboards, alerts, and distributed tracing so teams can detect issues before customers do
Automate deployments: Evolve our Buildkite pipelines and Terraform modules to give engineers <10-minute, one-click rollouts (and clean rollbacks)
Champion security & compliance: Harden infra with least-privilege IAM, threat-model topology changes, and guide SOC 2 / PCI efforts
Partition & scale datastores: Tune Postgres for multi-TB workloads, maintain MongoDB sharding, and shepherd Kafka topic management as event volume climbs
Lead incident response: Rotate with the on-call SREs, run blameless post-mortems, and convert findings into durable fixes
Mentor & collaborate: Pair with product engineers on capacity reviews, guide junior devs on Docker best practices, and evangelize “you build it, you run it.”

Preferred Qualifications

Experience with NestJS or other Node.js backends at scale
Prior work in PCI-DSS or SOC 2 environments
Familiarity with GitOps workflows (Argo CD, Flux)
Exposure to mobile CI (React Native pipelines), LaunchDarkly/feature flags, or chaos engineering

Benefits

The estimated base salary range for this role is $170K - $210K, plus a generous pre-IPO equity package
100% remote across the U.S. or Canada (option to drop into our SF office)
Comprehensive health, dental, and vision coverage
Home-office stipend, top-tier laptop, and any tools you need to excel
Twice-annual team off-sites

Senior Site Reliability Engineer

Owner.com

Remote

DevOps

Senior
