
Senior Software Engineer

Astronomer
Summary
Join Astronomer's R&D team as a DevOps-focused Software Engineer and contribute to the infrastructure team of our flagship Enterprise product. Enhance the scalability, performance, and reliability of our platform while minimizing operational overhead. Leverage your expertise in container orchestration (Kubernetes) and cloud platforms (AWS, Azure, GCP, OpenShift). Collaborate with cross-functional teams to drive continuous improvement and implement robust security measures. Use monitoring tools (ELK, Prometheus) to optimize system performance and resource utilization. This Hyderabad-based role requires a strong software engineering foundation and a passion for building, operating, and optimizing infrastructure and deployment platforms.
Requirements
- 5 years of hands-on experience operating Kubernetes clusters in a production environment
- 5+ years of software development experience in Python/Golang
- Strong experience with at least one Continuous Integration system, such as CircleCI or Jenkins
- Automation/Scripting experience with Shell, Python, or similar
- Familiarity with Infrastructure as Code (IaC) tools (Terraform, CloudFormation, etc.)
- Experience managing and scaling distributed systems on at least one of the three major cloud providers (AWS, Azure, GCP)
- Understanding of the Linux Operating System, standard networking protocols, and components
- Experience with deploying, supporting, and monitoring new and existing services, platforms, and application stacks
- Strong troubleshooting and problem-solving skills
Responsibilities
- Serve as a primary point of contact responsible for the overall health, performance, and capacity of our platform
- Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and growth
- Develop tools to improve our ability to rapidly deploy and effectively monitor applications in a large-scale environment
- Work closely with development teams to ensure the platform is designed with operability in mind
- Identify and lead efforts to improve automation
- Perform root cause analysis and document results in the form of post-mortems
- Write and maintain documentation around key systems and processes
- Participate in an on-call rotation with some of our customers
- Function well in a fast-paced, rapidly changing environment
Preferred Qualifications
- Experience with scale testing, disaster recovery, and capacity planning
- Experience with service mesh technologies such as Istio or Envoy
- Familiarity with Apache Airflow
- Experience with OpenShift and the Red Hat Marketplace
- Experience with the Prometheus/Grafana and ELK stacks
Benefits
Remote work, flexible hours

