Staff Cloud Platform Engineer at Sift

Summary

Join Sift's Core Platform team and play a crucial role in maintaining and optimizing the data, infrastructure, messaging, and services platform powering our online systems. You will own the availability, performance, and scalability of our primary online storage systems and infrastructure. Responsibilities include designing and building immutable infrastructure, implementing multi-region deployments, solving complex problems related to data volume and request rate, optimizing development workflows, and developing monitoring tools. You will also provide design support to internal teams and participate in on-call support. This role requires extensive experience in large-scale computing, distributed systems, and cloud infrastructure. Sift offers a competitive compensation package and various benefits.

Requirements

8+ years of experience as a Software Engineer focused on infrastructure/platform services or in a Site Reliability Engineering (SRE) role
Strong programming skills in languages such as Java, Scala, or Python
Experience designing and implementing distributed systems
Experience building and managing cloud infrastructure on AWS or GCP
Expertise in building infrastructure as code and automating provisioning processes using tools like CloudFormation or Terraform
Proficiency in setting up and managing monitoring and alerting systems, both open-source and commercial
Familiarity with Docker and container orchestration technologies like Kubernetes, GKE, or AWS ECS
Strong experience troubleshooting and resolving production system issues, with a focus on building automated solutions to prevent future occurrences
Proven expertise in automation and a solid understanding of configuration management tools

Responsibilities

Own the availability, performance, and scalability of Sift’s primary online storage systems and infrastructure
Design and build immutable infrastructure and fault-tolerant, multi-AZ/multi-region systems that are resilient and self-healing
Design and implement multi-region deployments, such as Bigtable clusters spanning multiple regions, with strategies to ensure specific customers are routed to designated regions (e.g., sticky sessions at the regional level)
Solve complex problems that arise from our unique data volume and request rate, which may involve digging deep into data store and messaging internals
Optimize local development and testing workflows to be fast, efficient, and seamless
Design and implement services and libraries for components to interact with data stores, the messaging layer, and the services platform
Develop tools for monitoring, detecting faults, and automatically repairing distributed systems
Provide design support to internal engineering teams for optimal usage of data stores, data growth planning, production workload optimization, messaging, caching, and the services platform
Participate in on-call support and incident response activities, providing 12/7 coverage for one calendar week approximately once every 3-4 weeks

Benefits

Competitive total compensation package
401k plan
Medical, dental, and vision coverage
Wellness reimbursement
Education reimbursement
Flexible time off

Remote

DevOps

Mid-level
