Site Reliability Consultant

Pythian
📍Remote - Mexico, Costa Rica
Summary
Join Pythian as a Site Reliability Consultant and become a technology leader and trusted advisor to our customers. You will focus on infrastructure design and modernization, CI/CD pipeline automation, and building intelligent monitoring and observability systems. This remote position requires expertise in Kubernetes, AWS, CI/CD, and DevOps automation. You will mentor teammates, collaborate with clients, and participate in on-call rotations. Pythian offers a flexible work environment, opportunities for professional development, and various benefits to support your well-being.
Requirements
- Must have strong experience with container orchestration (Kubernetes, Docker) in cloud (AWS EKS) or on-prem distributions
- Familiarity with related ecosystem tools (Helm, Operators, GitOps, etc.)
- Hands-on experience using AWS (VPC, EC2, EKS, IAM, S3, etc.), including provisioning with IaC tools like Terraform (or AWS CloudFormation)
- Experience setting up GitLab or similar platforms (GitHub, Bitbucket) for CI/CD pipelines, managing runners, and integrating code scanning
- Familiarity with artifact repository solutions (e.g., JFrog Artifactory), including repository creation, access controls, and automation of artifact flows
- Track record of infrastructure automation using Terraform, Ansible, Puppet, or Chef to reduce manual intervention and ensure repeatable deployments
- Strong scripting skills (Bash, Python, Go, etc.) to automate system tasks and streamline operational workflows
- Experience with modern monitoring stacks (Prometheus, Dynatrace, Grafana, ELK/EFK) for analyzing logs, metrics, and traces
- Proven ability to design alerts, dashboards, and runbooks that enable rapid first-contact resolution
- Solid understanding of Linux-based systems, performance tuning, and troubleshooting
- Knowledge of network fundamentals (TCP/IP, load balancers, DNS, NTP, etc.) and the ability to diagnose connectivity or performance issues in complex distributed environments
- Familiarity with container security best practices (RBAC, TLS, vulnerability scanning) and how to apply them at scale
- Adept at communicating technical concepts to both engineering and non-technical stakeholders
- Ability to mentor junior team members, champion DevOps culture, and contribute to an inclusive, knowledge-sharing environment
- Bachelor’s Degree in Computer Science, Information Systems, or equivalent experience
- Several years of progressive DevOps or SRE experience managing large-scale systems in a production environment
Responsibilities
- Administer and optimize platforms such as GitLab (CI/CD pipelines, runners) and artifact repository solutions (e.g., JFrog Artifactory)
- Maintain and troubleshoot Kubernetes clusters—either in the cloud (AWS EKS) or on-prem distributions—with a focus on availability, performance, and security
- Champion “infrastructure as code” using tools like Terraform (or CloudFormation), building repeatable processes for provisioning and updating clusters, repos, and associated services
- Implement or improve CI/CD pipelines to reduce manual toil and ensure quick, reliable deployments across multiple environments
- Design and configure observability solutions (e.g., Prometheus, Dynatrace, Grafana) to proactively detect and address issues in container orchestration environments, code repositories, and artifact repositories
- Participate in an on-call rotation, troubleshooting incidents at all tiers (from first-contact resolution to escalation) and driving continuous improvement based on Root Cause Analysis
- Collaborate with clients to shape infrastructure strategies around container orchestration, secure CI/CD, and DevSecOps best practices
- Provide leadership and technical direction on automating repetitive administrative tasks, enforcing security policies (RBAC, TLS, container scanning), and adopting GitOps workflows
- Create and maintain design documents, runbooks, and operational playbooks for container platforms, CI/CD pipelines, and code management services
- Mentor fellow consultants and client stakeholders on Kubernetes, infrastructure automation, and advanced CI/CD usage to enhance knowledge across the organization
- Plan and coordinate maintenance activities, ensuring minimal downtime and clear communication with stakeholders
- Provide ITIL-oriented support (Incident, Change, Problem Management), and champion continuous improvement of operational processes and service reliability
Preferred Qualifications
- AWS certifications (Solutions Architect, DevOps Engineer) are a plus
- Understanding of compliance frameworks (HIPAA, PCI, etc.) and data privacy constraints is a plus
- Experience or strong interest in leveraging AI-based services or scripts for operational efficiency and faster issue resolution is highly desirable
Benefits
- Work remotely from your home, with no daily commute to an office! All you need is a stable internet connection
- Pythian cares about continuous learning and provides opportunities to earn certifications (AWS, Kubernetes, Terraform) and expand your skill set across multiple platforms, frameworks, and industries
- We give you all the equipment you need to work from home, including a laptop with your choice of OS and an annual budget to personalize your work environment!
- You will have an annual wellness budget to make yourself a priority (use it on gym memberships, massages, fitness and more)
- You will receive a generous amount of paid vacation and sick days, as well as a day off to volunteer for your favorite charity