Senior Cloud Infrastructure Engineer

Taskrabbit
Summary
Join Taskrabbit, a remote-first company, as a Senior DevOps Engineer to lead the next phase of growth. You will build and maintain CI/CD pipelines, monitor and resolve issues, engage in capacity planning, design for zero downtime, ensure system security, automate tasks, and implement disaster recovery. This role requires 5+ years of experience in infrastructure and DevOps, strong AWS knowledge, experience with microservices, and excellent communication skills. Taskrabbit offers competitive compensation, including a base pay range of $115,000-$160,000, employer-paid health insurance, 401k matching, generous time off, and various other perks. The role is remote-first with limited travel, transitioning to a hybrid model in 2025. Taskrabbit values diversity and inclusion, creating a supportive and collaborative work environment.
Requirements
- 5+ years of experience in the infrastructure and DevOps space
- Experience with build automation and configuration management tools (e.g., Ansible, Puppet, Chef)
- Strong knowledge of the Amazon Web Services (AWS) ecosystem and core AWS technologies such as Elasticsearch Service, RDS, WAF, and CloudFront, as well as Kubernetes
- Hands-on experience with common infrastructure tools such as Docker, Terraform, Helm, GitHub Actions, and ArgoCD
- Experience with a microservice architecture running in containers (Docker or other containerisation technology)
- Experience supporting 24x7, high-availability internet application environments that include web, application, and database servers and load-balancing systems
- Experience working with a product that has end-users
- Bachelor's degree or higher in Computer Science, or equivalent experience
- Excellent written and verbal communication skills
- A strong ownership attitude and a track record of taking responsibility for problems and pushing through to resolution
Responsibilities
- Build and maintain CI/CD pipelines from scratch for testing and releasing configuration and software
- Monitor and resolve issues in all environments using tools such as Datadog, PagerDuty, and AWS logs
- Engage in capacity planning and demand forecasting, anticipating performance bottlenecks and scaling the environment as needed using Datadog and other tools
- Design and implement for zero downtime to deliver a highly available service (99.9% availability)
- Ensure systems are secure against cyberthreats and implement fixes for security vulnerabilities
- Automate tasks and develop tools to increase engineering efficiency and visibility
- Design and implement disaster recovery (DR) across regions in cloud providers such as AWS
- Manage web domains and certificates
- Troubleshoot production and testing environment issues, including performance and functional issues
- Provide support to the organization through on-call, resolving issues and driving infrastructure changes
- Identify, define, and document system requirements and recommend solutions to management
- Perform on-call duties as part of the on-call rotation
Preferred Qualifications
- AWS Certification
- Software development background
- Experience in a startup environment
Benefits
- Employer-paid health insurance
- 401k match with immediate vesting for our US-based employees
- Generous and flexible time off with 2 company-wide closure weeks
- Taskrabbit product stipends
- Wellness + productivity + education stipends
- IKEA discounts
- Reproductive health support
- Remote-first company