Senior Site Reliability Engineer at Invisible Technologies

Summary

Join Invisible Technologies, a leading AI training and scaling partner, as a Senior Site Reliability Engineer. You will play a key role in ensuring the availability, performance, and scalability of production systems for one of our core products. As an owner of these systems, you will deploy, configure, and manage cloud-based infrastructure, optimizing for performance and cost efficiency. You will design and maintain monitoring systems, define service level objectives, and build automated solutions that reduce manual operational work. Close collaboration with engineering teams to improve application reliability is central to the role. The role requires a strong understanding of cloud architecture, experience with Kubernetes and Terraform, and expertise in relational databases and security principles. The salary range is $68,000-$80,000 USD; offers above entry level also include bonuses and equity. Invisible is a remote-first organization.

Requirements

Strong understanding of cloud architecture, including expertise with major cloud providers (GCP, AWS, Azure)
Understanding of the networking and security considerations involved in designing the architecture of our deployment environments
Strong understanding of relational databases (PostgreSQL), with the ability to optimize them and advise the broader engineering team on optimization techniques so that the data layer of our deployed services runs smoothly
Strong understanding of authentication and authorization principles such as IAM, security groups, and RBAC
Understanding of software engineering fundamentals, practices, and patterns with distributed cloud services
Strong experience with production systems troubleshooting and optimization
Experience with Kubernetes, including deployments you have architected or managed and can point to
Strong understanding of the Kubernetes operating model, with the ability to explain the requirements for designing deployments for new applications
Experience with infrastructure as code tools such as Terraform or CloudFormation

Responsibilities

Ensure the availability, performance, and scalability of production systems
Deploy, configure, automate, and manage cloud-based infrastructure using tools like Kubernetes, Terraform, and Argo
Identify and resolve system bottlenecks, optimizing for performance and cost efficiency across engineering teams
Design, support, and manage deployment pipelines to enable world-class delivery of applications
Design, develop, and maintain comprehensive monitoring and observability systems using Datadog and Sentry
Define Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure reliability and performance
Design and implement automated solutions to reduce manual operational tasks
Build tools for system provisioning, monitoring, deployment, and scaling
Collaborate closely with engineering teams to improve application reliability, resilience, and maturity

Benefits

Bonuses and equity are included in offers above entry level
