Site Reliability Engineer, Observability Engineer at Rackspace Technology

Summary

Join Rackspace's Professional Services Center of Excellence and contribute to building next-generation applications for our customers. As a Site Reliability Engineer/Observability Engineer, you will work with customers to implement observability solutions, build and maintain scalable systems, and develop monitoring tools. You will collaborate with development teams and proactively identify and resolve performance issues. This remote, full-time position requires a Bachelor's degree in engineering/computer science or equivalent and senior-level experience in SRE, DevOps, and AWS. Rackspace offers a dynamic work environment and is committed to equal employment opportunity.

Requirements

Bachelor’s degree in engineering/computer science or equivalent
Senior-level experience with Site Reliability Engineering; DevOps; code-level application support and troubleshooting; AWS infrastructure design, implementation, and optimization; and automation for deployment, scaling, and reliability
Experience with observability tools such as Splunk, Datadog, and SignalFx
Experience deploying, maintaining and supporting software applications/services in the AWS ecosystem
Proactive approach to identifying problems and solutions
Experience writing code in one or more interpreted languages such as Python, PHP, Perl, Ruby, or Linux shell
Experience with Terraform or AWS CloudFormation scripting
Experience with configuration management tools like Ansible, Chef or Puppet
Experience with standard software development best practices and tools such as code repositories (Git preferred)
Experience executing in an agile software development environment
Good understanding of pricing/cost models across AWS services, especially compute, storage, and database offerings
A clear understanding of network and system management solutions
Excellent organizational and project management skills
Excellent communication, critical thinking, and analytical skills

Responsibilities

Work with customers to implement observability solutions
Build and maintain scalable systems and robust automation that support engineering goals
Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance
Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, capacity planning and fault isolation
Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability, security and performance standards
Collaborate with team members to document and share solutions
Maintain a deep understanding of the customer’s business as well as their technical environment
Identify performance bottlenecks and anomalous system behavior, and resolve the root cause of service issues

Benefits

Remote work

Remote

DevOps

Mid-level
