Principal Cloud Platform Engineer - Core Infra

Sift
Summary
Join Sift's Core Platform team as a Principal Engineer and take a critical leadership role in shaping the technical strategy, architecture, and direction of our infrastructure and services platform. Drive initiatives to enhance the availability, reliability, scalability, and performance of our systems, ensuring they are resilient and secure. This hands-on role requires solving complex distributed systems challenges, mentoring others, and influencing technology and organizational strategy. You will provide technical leadership, drive the architecture and design of infrastructure and services, lead sophisticated multi-region deployments, solve high-scale problems, and guide improvements to developer workflows. You will also architect and build robust internal libraries, develop proactive monitoring systems, act as a strategic advisor, participate in and help evolve our on-call strategy, and mentor senior engineers. This position requires a strong background in distributed and cloud-native systems.
Requirements
- 10+ years of experience in Software Engineering, SRE, or Infrastructure roles, with a demonstrated focus on distributed systems and platform-level challenges
- Deep expertise in designing, scaling, and operating cloud-native systems on AWS or GCP
- Proven ability to architect infrastructure as code with tools like Terraform or CloudFormation
- Advanced programming skills in languages such as Java, Python, or Scala
- Extensive experience with messaging systems (e.g., Kafka) and distributed databases (e.g., BigTable, Snowflake)
- Strong knowledge of containerization and orchestration technologies such as Kubernetes
- A track record of reducing operational complexity through automation, observability, and self-healing systems
- Experience influencing architectural decisions across multiple engineering teams
- Excellent collaboration and communication skills, with a history of mentoring and cross-functional leadership
Responsibilities
- Provide technical leadership and vision for Sift’s online infrastructure, ensuring it is highly available, performant, and scalable
- Drive architecture and design of immutable, fault-tolerant, multi-region infrastructure and services
- Lead the implementation of sophisticated multi-region deployments (e.g., BigTable clusters with regional routing strategies) to meet global customer needs
- Solve high-scale, high-throughput problems requiring deep understanding of messaging systems, distributed data stores, and real-time computation
- Guide improvements to developer workflows, CI/CD pipelines, and local development environments to improve efficiency across teams
- Architect and build robust internal libraries and platforms for interacting with our core systems: data stores, messaging layers, and infrastructure services
- Develop proactive monitoring and self-healing systems to improve the resilience of critical services
- Act as a strategic advisor to engineering teams, providing deep technical guidance on data architecture, service optimization, caching strategies, and scalability planning
- Participate in and help evolve our on-call strategy, ensuring rapid and effective incident response while reducing long-term operational toil
- Mentor and coach senior engineers across teams, driving engineering excellence and knowledge sharing
Benefits
- Competitive total compensation package
- 401k plan
- Medical, dental, and vision coverage
- Wellness reimbursement
- Education reimbursement
- Flexible time off