Site Reliability Consultant

Pythian

📍Remote - India

Summary

Join Pythian as a Site Reliability Consultant and become a technology leader and trusted advisor to our customers. The role focuses on infrastructure design, CI/CD pipeline automation, and building intelligent monitoring systems across various technologies, and your expertise in Git, artifact repositories, and Kubernetes will be crucial. You will operate and maintain platforms, implement CI/CD pipelines, design observability solutions, and participate in on-call rotations. You will also collaborate with clients, provide technical direction, create documentation, and mentor colleagues. This remote position offers a competitive salary and benefits package, including flexible work arrangements and professional development opportunities.

Requirements

  • Must have strong experience with container orchestration (Kubernetes, Docker) in cloud (AWS EKS) or on-prem distributions
  • Familiarity with related ecosystem tools (Helm, Operators, GitOps, etc.)
  • Hands-on experience using AWS (VPC, EC2, EKS, IAM, S3, etc.), including provisioning with IaC tools like Terraform (or AWS CloudFormation)
  • Experience setting up GitLab or similar platforms (GitHub, Bitbucket) for CI/CD pipelines, managing runners, and integrating code scanning
  • Familiarity with artifact repository solutions (e.g., JFrog Artifactory), including repository creation, access controls, and automation of artifact flows
  • Track record of infrastructure automation using Terraform, Ansible, Puppet, or Chef to reduce manual intervention and ensure repeatable deployments
  • Strong scripting skills (Bash, Python, Go, etc.) to automate system tasks and streamline operational workflows
  • Experience with modern monitoring stacks (Prometheus, Dynatrace, Grafana, ELK/EFK) for analyzing logs, metrics, and traces
  • Proven ability to design alerts, dashboards, and runbooks that enable rapid first-contact resolution
  • Solid understanding of Linux-based systems, performance tuning, and troubleshooting
  • Network fundamentals (TCP/IP, load balancers, DNS, NTP, etc.) and ability to diagnose connectivity or performance issues in complex distributed environments
  • Familiarity with container security best practices (RBAC, TLS, vulnerability scanning) and how to apply them at scale
  • Adept at communicating technical concepts to both engineering and non-technical stakeholders
  • Ability to mentor junior team members, champion DevOps culture, and contribute to an inclusive, knowledge-sharing environment
  • Bachelor’s Degree in Computer Science, Information Systems, or equivalent experience
  • Several years of progressive DevOps or SRE experience managing large-scale systems in a production environment

Responsibilities

  • Administer and optimize platforms such as GitLab (CI/CD pipelines, runners) and artifact repository solutions (e.g., JFrog Artifactory)
  • Maintain and troubleshoot Kubernetes clusters—either in the cloud (AWS EKS) or on-prem distributions—with a focus on availability, performance, and security
  • Champion “infrastructure as code” using tools like Terraform (or CloudFormation), building repeatable processes for provisioning and updating clusters, repos, and associated services
  • Implement or improve CI/CD pipelines to reduce manual toil and ensure quick, reliable deployments across multiple environments
  • Design and configure observability solutions (e.g., Prometheus, Dynatrace, Grafana) to proactively detect and address issues in container orchestration environments, code repositories, and artifact repositories
  • Participate in an on-call rotation, troubleshooting incidents at all tiers (from first-contact resolution to escalation) and driving continuous improvement based on Root Cause Analysis
  • Collaborate with clients to shape infrastructure strategies around container orchestration, secure CI/CD, and DevSecOps best practices
  • Provide leadership and technical direction on automating repetitive administrative tasks, enforcing security policies (RBAC, TLS, container scanning), and adopting GitOps workflows
  • Create and maintain design documents, runbooks, and operational playbooks for container platforms, CI/CD pipelines, and code management services
  • Mentor fellow consultants and client stakeholders on Kubernetes, infrastructure automation, and advanced CI/CD usage to enhance knowledge across the organization
  • Plan and coordinate maintenance activities, ensuring minimal downtime and clear communication with stakeholders
  • Provide ITIL-oriented support (Incident, Change, Problem Management), and champion continuous improvement of operational processes and service reliability

Preferred Qualifications

  • AWS certifications (Solutions Architect, DevOps Engineer) are a plus
  • Understanding of compliance frameworks (HIPAA, PCI, etc.) and data privacy constraints is a plus
  • Experience or strong interest in leveraging AI-based services or scripts for operational efficiency and faster issue resolution is highly desirable

Benefits

  • Competitive total rewards and salary package
  • Work flexibly and remotely from your home; there’s no daily travel requirement to an office!
  • Hone your skills or learn new ones with our substantial training allowance; participate in professional development days, attend training, become certified, whatever you like!
  • We give you all the equipment you need to work from home including a laptop with your choice of OS, and an annual budget to personalize your work environment!
  • You will have an annual wellness budget to make yourself a priority (use it on gym memberships, massages, fitness, and more)
  • You will receive a generous amount of paid vacation and sick days, as well as a day off to volunteer for your favorite charity
