Summary
Join Mixpanel's DevInfra team as a highly experienced engineer to build and maintain the software development lifecycle. Partner with various engineering teams to optimize for speed, safety, and reliability, supporting systems ingesting over 1 trillion events monthly. Manage Kubernetes at scale, oversee cloud infrastructure (preferably GCP), and own observability pipelines. Maintain CI/CD pipelines, devbox systems, and SaaS tooling. Lead by example, mentor engineers, and thrive in a fast-paced, innovative environment. Mixpanel offers competitive compensation and benefits, including comprehensive medical, vision, and dental care, generous vacation, enhanced parental leave, and more.
Requirements
- 8+ years of coding experience
- Expert in at least one programming language ecosystem
- Production experience managing Kubernetes at scale
- Production experience managing infrastructure in a major cloud provider such as Google Cloud Platform, Amazon Web Services, or Azure
- Proficiency with observability tooling such as OpenTelemetry, Prometheus, or distributed tracing for application monitoring and performance analysis
Responsibilities
- Partner with a wide variety of teams to build out their software development lifecycle to be optimized for speed, safety, and reliability
- Support teams writing front-end UI code as well as teams maintaining our highly stateful storage systems deep in our stack
- Support systems that ingest more than 1 trillion user-generated events every month while ensuring end-to-end latencies of under a minute
- Mixpanel queries typically scan more than 1 quadrillion events over the span of a month
- Mixpanel runs entirely on Google Cloud Platform and Google Kubernetes Engine
- DevInfra is the overall admin for our cloud environment, so we wear many hats
- We are responsible for things like Terraform, cost management, networking, and security best practices
- We are the overall service owner for Kubernetes
- Individual teams are responsible for maintaining and monitoring their own clusters and workloads
- We are responsible for setting standards for deployment, observability, and developer experience
- We facilitate efforts like Kubernetes version upgrades and helping teams adopt new Google Kubernetes Engine features
- We are the service owner for our observability pipelines
- We have instrumented over 30 million Prometheus metric time series and use OpenTelemetry to ingest over 4 billion distributed tracing spans per month
- We maintain our GitHub Actions-based CI/CD pipelines
- We maintain a robust GitHub-native delivery process that empowers teams to self-serve their needs while ensuring safety and reliability
- We maintain our devbox system that provisions cloud development environments for individual developers
- We procure best-of-breed SaaS tooling for the rest of engineering
- We own our relationship with GitHub, Honeycomb, Chronosphere, Sentry, and more
- We run the new engineer onboarding process
- If you join us, your first meeting will probably be with us!
Preferred Qualifications
- Preferred languages: Go, Python, or JavaScript/TypeScript
- Preferred cloud provider: Google Cloud Platform
- You write excellent docs
- Proficient with GitHub Actions and the GitHub ecosystem
- Experience maintaining Continuous Integration jobs
- Experience building deployment pipelines
- We love GitOps!
- Bazel expertise
- Experience with Terraform
- Experience with cloud Identity and Access Management (IAM) systems and knowledgeable about security best practices
- Experience implementing Internal Developer Platforms like Backstage
- Working knowledge of site reliability engineering (SRE) principles such as implementing Service Level Objectives (SLOs)
- You ❤️ open source
Benefits
- Comprehensive Medical, Vision, and Dental Care
- Mental Wellness Benefit
- Generous Vacation Policy & Additional Company Holidays
- Enhanced Parental Leave
- Volunteer Time Off
- Additional US Benefits: Pre-Tax Benefits including 401(k), Wellness Benefit, Holiday Break