Staff Platform Engineer

Contentsquare
Summary
Join Contentsquare's Data Storage and Streaming team as a Staff Platform Engineer and play a strategic and hands-on role in maintaining and evolving large-scale infrastructure for industry-leading analytics and UX solutions. You will serve as a senior technical authority, partnering with engineering teams, mentoring others, and taking ownership of managed services. Responsibilities include improving system reliability, scalability, and developer experience, implementing best practices for monitoring and incident response, and building and maintaining Infrastructure as Code (IaC) frameworks. This role requires extensive experience in managing large-scale infrastructure, expertise in cloud ecosystems (AWS and/or Azure), and deep understanding of platform tools and technologies. Contentsquare offers competitive benefits, including flexible work arrangements, generous paid time off, parental leave, wellbeing allowances, stock options, and employee resource groups.
Requirements
- 10+ years of experience in managing large-scale infrastructure and platform services in cloud environments
- Expert knowledge of cloud ecosystems (AWS and/or Azure), including managed services (RDS, ElastiCache, etc.)
- Deep understanding of platform tools and technologies such as Kafka, Aerospike, Kubernetes, and Terraform
- Proven experience developing and implementing scalable automation solutions to reduce platform maintenance overhead
- Strong programming skills in languages such as Go, Python, or equivalent
- Demonstrated success in leading cross-functional projects and mentoring technical teams
Responsibilities
- Serve as a senior technical authority guiding infrastructure strategy, architecture, and best practices
- Partner closely with engineering teams to design and build scalable, reliable, and maintainable platform solutions
- Act as a mentor and technical leader, fostering a culture of continuous improvement and technical excellence
- Take ownership of managed services, ensuring uptime, performance, and efficiency across our internal customer environments
- Lead initiatives to continuously improve system reliability, scalability, and developer experience
- Drive improvement projects to develop tools and solutions that reduce the toil of maintaining a complex platform, enabling more autonomous teams
- Define and implement best practices for monitoring, alerting, and incident response across managed services
- Maintain a strong focus on system resilience, fault tolerance, and automation
- Lead root cause analysis and implement proactive solutions to recurring platform issues
- Build and maintain IaC frameworks (e.g., Crossplane, Terraform, Ansible) to standardize and automate infrastructure deployment
- Identify opportunities to automate manual processes, reducing toil and improving platform reliability
- Develop tooling and frameworks for seamless management of Kafka, Aerospike, and other managed services
- Ensure comprehensive logging, metrics, and monitoring are available for teams to build actionable dashboards
- Optimize the platform’s performance and capacity planning
Benefits
- Virtual onboarding, Hackathon, and various opportunities to interact with your team and global colleagues both onsite and offsite each year
- Work flexibility: hybrid and remote work policies
- Generous paid time-off policy (varies by location)
- Immediate eligibility for birthing and non-birthing parental leave
- Wellbeing and Home Office allowances
- A Culture Crew in every country we’re based in to coordinate regular activities for employees to get to know each other and bond outside of work
- Every full-time employee receives stock options, allowing them to share in the company’s success
- We have multiple Employee Resource Groups that offer a safe space for individuals who share common identities, life experiences, or allyship to connect, support one another, and passionately advocate for the issues close to their hearts
- And more benefits tailored to each country