Platform Engineer II

Iterable

💵 $114k-$188k
📍 Remote - United States

Summary

Join Iterable's Observability team as a Platform Engineer II and make a significant impact on system reliability. You will own and scale the observability stack, instrument and automate monitoring, and collaborate with engineering teams to improve system visibility. This role requires hands-on experience with Kubernetes, cloud providers (AWS preferred), and observability platforms. You will contribute code, reduce MTTR and costs, and participate in on-call rotations. Iterable offers a collaborative and inclusive work environment with excellent benefits and perks.

Requirements

  • 2+ years of professional software, infrastructure, or SRE experience
  • Hands-on work with Kubernetes (and Docker) in production
  • Deep experience with at least one cloud provider (AWS preferred) and Infrastructure-as-Code (Terraform, Helm, GitOps)
  • Strong programming/scripting skills in Python, Go, or similar
  • Experience using or supporting observability platforms (Datadog, Prometheus, Elastic, OpenTelemetry, etc.) in a production environment
  • Familiarity with CI/CD pipelines and modern DevOps practices
  • A growth mindset, humility, and a desire to elevate those around you
  • Bachelor’s degree in CS/Engineering, or equivalent real-world experience

Responsibilities

  • Own the full observability stack (Datadog, Prometheus, Grafana, Elasticsearch, Quickwit, OpenTelemetry): design, deploy, and scale it to support petabyte-scale telemetry
  • Instrument and automate monitoring, logging, tracing, and metrics to ensure system visibility across 100+ services and multiple Kubernetes clusters
  • Ship platform features: contribute code that boosts reliability, performance, and developer experience across Iterable
  • Partner with engineering teams to improve instrumentation, refine dashboards/alerts, and embed observability into their SDLC
  • Reduce MTTR & cost: design cost-effective telemetry pipelines and create high-signal, low-noise alerting strategies
  • Participate in our on-call rotation that prioritizes recovery, postmortems, and continuous improvement

Preferred Qualifications

  • Built or run OpenTelemetry Collectors at scale
  • Operated large K8s clusters or written controllers/operators
  • Experience with GitOps
  • Designed and executed observability cost optimization initiatives
  • Experience in distributed tracing and high-cardinality metrics strategies

Benefits

  • Paid parental leave
  • Competitive salaries, meaningful equity, & 401(k) plan
  • Medical, dental, vision, & life insurance
  • Balance Days (additional paid holidays)
  • Fertility & Adoption Assistance
  • Paid Sabbatical
  • Flexible PTO
  • Monthly Employee Wellness allowance
  • Monthly Professional Development allowance
  • Pre-tax commuter benefits
  • Complete laptop workstation
