Remote Senior Site Reliability Engineer
Oomnitza
Remote - United States
Job highlights
Summary
Join our dynamic team at Oomnitza as a highly motivated and experienced Site Reliability Engineer. You will operate and deliver working systems based on insights gathered from massive-scale data in real time, keeping Oomnitza's internal and external services reliable while maintaining an ever-watchful eye on our systems, capacity, and performance.
Requirements
- Kubernetes: Extensive experience with container orchestration and managing production clusters, focusing on deployment, scaling, and troubleshooting within Kubernetes environments
- Configuration Management: Proficiency in tools like Ansible, Helm, and Kustomize for automating infrastructure provisioning, configuration, and deployment
- Monitoring: Experience with Prometheus, Grafana, or similar to proactively track system health, detect anomalies, and optimize performance across the platform
- AWS Cloud Services: Deep knowledge of the AWS ecosystem, including EC2, S3, IAM, VPC, and other essential services for building and managing scalable infrastructure
- Infrastructure as Code (IaC): Hands-on experience with Terraform to provision and manage cloud resources, ensuring version control, repeatability, and efficiency in infrastructure deployment
- Queuing Systems: Familiarity with message queuing systems like RabbitMQ and Kafka, as well as managed queuing services such as Amazon MQ
- Database Management: Strong background in managing MySQL databases and leveraging Amazon RDS for high availability, performance tuning, and secure database management in cloud environments
- Networking and Security Best Practices: Understanding of network design and security protocols to protect systems, enforce compliance, and meet industry-standard audit requirements
- High-Uptime / Low-Downtime Environments: Experience meeting high-uptime agreements for critical systems, implementing strategies for fault tolerance, disaster recovery, and proactive monitoring to maintain service availability and minimize downtime
- Cross-functional Collaboration: Proven ability to work effectively with cross-functional teams from multiple departments to achieve project goals and execute project plans in an orderly and efficient manner
- Programming Skills: Ability to develop and maintain code in one or more high-level programming languages such as Python, Go, or JavaScript
Responsibilities
- Gather and analyze metrics from our platform and applications to continually improve our performance tuning and fault finding
- Partner with our world-class engineering teams to improve services through rigorous testing and release procedures
- Create sustainable systems and services through automation and uplifts while working closely with engineering professionals within the company to enable projects to be completed efficiently
- Develop, monitor, and manage the entire system landscape by balancing feature development speed and reliability with well-defined service level objectives, ensuring minimal downtime and maximum availability
- Participate in the development and implementation of practices, procedures, and technology to ensure our system landscapes are operating within our Security, Compliance, and Availability commitments
- Plan, prepare, and execute system upgrades
- Mentor and train other engineers throughout the company and seek to continually improve processes company-wide
Benefits
- Healthcare for dependents and spouse
- A progressive, healthy work culture with excellent opportunities for professional and personal development
- Top performers will have an opportunity to help shape the team, working directly with the founders to drive initiatives and create a structure that scales
- A once-in-a-lifetime career opportunity to get onboard a fast-growing business that is venture-backed by C5 Capital, Shasta Ventures, Riverside Acceleration Capital, and Hummer Winblad
- Dental & Vision Insurance
- Employee equity plan
- Health Insurance for your spouse and dependents
- Pension, life insurance, and income protection
- Remote working & flexible work schedules
- Working-from-home equipment allowance
- Choice of preferred equipment, Mac or PC
This job is filled or no longer available