Senior Site Reliability Engineer - Data

Discogs

💵 $130k-$140k
📍 Remote - United States

Summary

Join the Discogs Platform team as a Senior Site Reliability Engineer - Data and help build and support performant, cost-effective, and reliable infrastructure. You will work closely with engineering squads to optimize relational database architectures, improve the stability of Kafka and change data capture (CDC), and contribute to platform operations. This remote position is open to candidates in OR, WA, CA, CO, TX, and IL and offers a salary of $130,000-$140,000. You will steward data stores, lead Kafka and Kafka Connect reliability efforts, establish data communication standards, and mentor engineering squads on best practices. The role also involves writing documentation, working in a containerized environment, participating in an on-call rotation, and troubleshooting operational issues.

Requirements

  • A Bachelor's Degree in Computer Science or similar area of focus, or equivalent relevant work experience
  • 5+ years of experience working with Kafka and relational database management systems (RDBMS)
  • 6+ years of experience in Ops, DevOps, Site Reliability, Platform, or other systems roles
  • Relational database schema design, query performance optimization, administration (MySQL, Percona Server, AWS RDS)
  • Kafka: Cluster administration (Strimzi), Kafka Connect (Debezium, JDBC)
  • CI/CD (GitHub Actions)
  • GitOps (ArgoCD)
  • Kubernetes (EKS, Kustomize, Karpenter, administration, application manifests)
  • AWS and cloud development (VPC, EKS, RDS, S3)
  • Observability (Datadog, Sentry)
  • Scripting (Shell, Python)
  • Track record of collaboration and mentorship
  • Excellent written communication and documentation skills
  • Continuous learning
  • Ownership and proactive approach to solving large problems

Responsibilities

  • Stewarding Discogs’ data stores as a key subject matter expert
  • Leading efforts on the reliability and design patterns of our Kafka and Kafka Connect implementations
  • Establishing data contracts and clear communication standards between CDC producers and consumers
  • Working closely with engineering squads to refactor and re-architect MySQL database schema and indexing for long-term scalability, performance, and cost effectiveness
  • Mentoring engineering squads on Platform best practices for MySQL, Kafka, and other software development lifecycle areas
  • Writing documentation and runbooks that contribute to the engineering organization’s knowledge base
  • Working in a containerized, orchestrated environment
  • Contributing to the Platform team’s disciplines of site reliability and operations, supporting both our squads and Platform’s central infrastructure
  • Participating in the on-call rotation, responding to incidents, and troubleshooting data and other operational issues

Preferred Qualifications

  • Infrastructure-as-code (Terraform)
  • Elasticsearch (ECK administration, scaling, performance)
  • Python (SQLAlchemy, FastAPI)
  • GraphQL (schema design, Apollo federation)
  • REST API
  • HashiCorp Vault
  • Redis
  • Memcached
  • NoSQL Database
  • Data Lake/Warehouse
  • Data Governance
  • Data Security

Benefits

  • Competitive compensation: salary, plus performance-related bonus program
  • 401(k) with employer match
  • 100% company-paid medical and dental insurance benefits for you and your dependents
  • 4 weeks paid vacation, increasing based on tenure
  • 18 weeks paid leave for birth moms
  • 8 weeks paid parental leave, including for adoption
  • Monthly wellness allowance
  • Annual professional and personal development allowance
  • Work from home office set-up and expense allowances
  • Flexible work location opportunities
  • Employer matching toward charitable contributions
