Summary

Join Rackspace's Professional Services Center of Excellence and contribute to building next-generation applications for our customers. You will work with customers to implement observability solutions using tools like Datadog, New Relic, or AppDynamics, build and maintain scalable systems, develop monitoring tools, and collaborate with development teams. This role requires a Bachelor's degree in engineering/computer science or equivalent, senior-level experience in SRE, DevOps, and AWS, and experience with observability solutions. You'll need experience with various technologies and tools, including scripting languages, configuration management, and agile development. Rackspace offers a dynamic work environment and is recognized as a best place to work.

Requirements

Bachelor’s degree in engineering/computer science or equivalent
Senior-level experience with Site Reliability Engineering; DevOps; code-level application support and troubleshooting; AWS infrastructure design, implementation, and optimization; and automation for deployment, scaling, and reliability
Experience with observability tools such as Splunk, Datadog, or SignalFx
Experience deploying, maintaining and supporting software applications/services in the AWS ecosystem
Proactive approach to identifying problems and solutions
Experience writing code in one or more interpreted languages such as Python, PHP, Perl, Ruby, or Linux shell
Experience with Terraform or CloudFormation scripting
Experience with configuration management tools like Ansible, Chef or Puppet
Experience with standard software development best practices and tools such as code repositories (Git preferred)
Experience executing in an agile software development environment
Good understanding of pricing/cost models across AWS services, especially compute, storage, and database offerings
Clear understanding of network and system management solutions
Excellent organizational and project management skills
Excellent communication, critical thinking, and analytical skills

Responsibilities

Work with customers to implement observability solutions
Build and maintain scalable systems and robust automation that support engineering goals
Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance
Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, capacity planning and fault isolation
Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability, security and performance standards
Collaborate with team members to document and share solutions
Maintain a deep understanding of the customer’s business as well as their technical environment
Identify performance bottlenecks and anomalous system behavior, and resolve the root cause of service issues

Site Reliability Engineer, Observability Engineer

Rackspace Technology

Remote

DevOps

Mid-level
