Staff Software Engineer

Summary
Join Reddit's Compute Reliability and Efficiency team as a Staff Software Engineer and tackle large-scale infrastructure challenges. You will collaborate with a team to build and maintain Reddit's infrastructure platform, perform reliability analysis on the Kubernetes fleet, design and develop software in Golang to enhance platform efficiency, contribute to the platform's technical direction, automate development processes, and share on-call responsibilities. This role requires 7+ years of infrastructure experience with a focus on lower-level systems like Linux, proficiency in Go, Rust, or Python, and a strong understanding of kernel primitives. You'll design large systems, scope work, and build consensus with other engineers. The position offers competitive compensation and a comprehensive benefits package.
Requirements
- 7+ years of experience working in the infrastructure domain, with a focus on lower-level systems such as Linux
- Language proficiency in Go (preferred), Rust, or Python
- Understanding of kernel primitives (cgroups, namespaces), CPU scheduling, userspace concerns, and packet processing
- Experience developing on top of Kubernetes or similar distributed systems
- Strong troubleshooting competency ranging from higher-level orchestration concerns to lower-level runtime ones
- Experience designing large systems, scoping work, and building consensus with other engineers
- Excellent communication skills to collaborate with a service-oriented team and company
Responsibilities
- Work collaboratively with a team of software engineers to create and maintain the foundational platform for running Reddit's infrastructure
- Execute performance and reliability analysis on our Linux-based Kubernetes fleet
- Design, write (Golang), and deliver software to improve the availability, scalability, latency, and efficiency of Reddit's Compute Platform
- Contribute feedback to the technical and strategic direction of the compute platform
- Automate critical aspects of the development process such as service creation and management, as well as critical infrastructure operations
- Share on-call responsibilities with the Compute team
Benefits
- Comprehensive Healthcare Benefits
- 401k Matching
- Workspace benefits for your home office
- Personal & Professional development funds
- Family Planning Support
- Flexible Vacation (please use it!) & Reddit Global Wellness Days
- 4+ months paid Parental Leave
- Paid Volunteer time off