Staff Site Reliability Engineer

Invisible Technologies

💵 $162k-$198k
📍 Remote - United States

Summary

Join Invisible Technologies, a rapidly growing AI training and scaling partner, as a Staff Site Reliability Engineer. You will play a key role in ensuring the reliability and automation of our products, working closely with engineering and product teams. This position requires strong expertise in cloud architecture, Kubernetes, and infrastructure-as-code tools. You will be responsible for designing, deploying, and managing cloud-based infrastructure, optimizing performance, and building robust monitoring systems. Invisible offers competitive compensation, remote work flexibility, and equity, fostering a culture of ownership and innovation within a dynamic and fast-paced environment. We are committed to providing a transparent and equitable compensation structure.

Requirements

  • Strong understanding of cloud architecture including expertise with major cloud providers (GCP, AWS, Azure)
  • Proficiency in at least one programming language and the ability to write production code, not just scripts
  • Understanding of the networking and security considerations involved in architecting our deployment environments
  • Strong understanding of relational databases (PostgreSQL), and comfort optimizing them and advising the broader engineering team on optimization techniques so that the data layer of our deployed services runs smoothly
  • Strong understanding of authentication and authorization principles such as IAM, Security Groups, RBAC, etc.
  • Understanding of software engineering fundamentals, practices, and patterns with distributed cloud services
  • Strong experience with production systems troubleshooting and optimization
  • Experience with Kubernetes, including the ability to point to deployments you have architected or managed
  • Strong understanding of the Kubernetes operating model, including the ability to explain the requirements for designing deployments for new applications
  • Experience with infrastructure as code tools such as Terraform or CloudFormation

Responsibilities

  • Ensure the availability, performance, and scalability of production systems
  • Deploy, configure, automate, and manage cloud-based infrastructure using tools like Kubernetes, Terraform, and Argo
  • Identify and resolve system bottlenecks, optimizing for performance and cost efficiency across engineering teams
  • Design, support, and manage deployment pipelines to enable world-class delivery of applications
  • Design, develop, and maintain comprehensive monitoring and observability systems using Datadog and Sentry
  • Define Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure reliability and performance
  • Design and implement automated solutions to reduce manual operational tasks
  • Build tools for system provisioning, monitoring, deployment, and scaling
  • Collaborate closely with engineering teams to improve application reliability, resilience, and maturity

Benefits

  • Remote work around the world on a schedule that suits your lifestyle
  • Partner Pay Model is fully transparent and designed for co-ownership
  • Bonuses and equity are included in offers above entry level
  • Over 65% ownership is in the hands of our Partners, and we're committed to buying back Partner shares every year according to our formal liquidity plan. This ensures liquidity for those who choose to sell their stake in the company
