Senior Application Reliability Engineer

Natera

💵 $124k-$155k
📍 Remote - United States

Summary

Join Natera's Provisioning Team as a Senior Site Reliability Engineer (SRE) and play a key role in designing, building, and maintaining the infrastructure that powers our software delivery. You will drive automation, scalability, and operational excellence across development and production infrastructure, collaborating with cross-functional teams to improve reliability and optimize deployment strategies. You will lead architectural discussions, build highly available infrastructure in AWS using Kubernetes and Terraform, and develop automation for resource provisioning. You will also define and monitor SLIs, SLOs, and SLAs, conduct capacity planning, work with development teams to identify areas for improvement, promote SRE best practices, and mentor other engineers.

Requirements

  • BS in Computer Science, Engineering, or a related field, or equivalent practical experience
  • Minimum 5 years of experience in an SRE, Infrastructure, or DevOps role with increasing responsibility
  • Strong problem-solving and analytical skills; strong ability to troubleshoot complex issues ranging from system resourcing and network problems to application stack traces
  • Proven leadership capabilities, including mentoring peers, driving cross-team collaboration, and leading technical initiatives
  • Demonstrated strategic thinking and vision in building scalable, maintainable infrastructure and developer platforms
  • Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog)
  • Strong proficiency in programming or scripting languages (e.g., Java or Python)
  • Hands-on experience with Kubernetes, Docker, and infrastructure-as-code tools (e.g., Terraform)
  • Proven expertise in managing AWS Cloud Infrastructure
  • Experience in Linux/Unix administration
  • Ability to read and understand Java and Python code
  • Excellent communication and collaboration skills, including the ability to justify and advocate for the right solution
  • Ability to work effectively in a cross-functional, fast-paced environment

Responsibilities

  • Lead infrastructure-related architectural discussions and decision-making processes
  • Design, build, and maintain highly available, scalable, and secure infrastructure in AWS using technologies such as Kubernetes and Terraform
  • Develop and improve automation for resource provisioning, infrastructure lifecycle management, and environment consistency
  • Define and monitor SLIs, SLOs, and SLAs to ensure operational excellence
  • Conduct capacity planning, performance analysis, and load testing to ensure infrastructure scalability and efficiency
  • Work closely with development teams in all SDLC phases to investigate improvement areas and identify bottlenecks
  • Promote SRE best practices across the organization, guiding teams toward building reliable and maintainable systems

Preferred Qualifications

  • Knowledge of database operations and performance optimization
  • Experience with GitLab
  • Experience with Atlassian services
  • Experience programming in Java or other OOP languages

Benefits

  • Comprehensive medical, dental, vision, life and disability plans for eligible employees and their dependents
  • Free testing in addition to fertility care benefits
  • Pregnancy and baby bonding leave
  • 401k benefits
  • Commuter benefits
  • A generous employee referral program
