Summary
Join Canva's Observability Traces & Exceptions Team and help redefine how the world experiences design. Based in Sydney with options to work in Melbourne, Brisbane, Perth, or Adelaide, you will build and improve the observability platform used by all Canva engineers. This role involves providing technical leadership, brainstorming solutions, optimizing tracing and exceptions platforms, improving user experience, and enhancing exception workflows. You will also participate in team ceremonies and champion observability best practices. Canva offers a flexible work environment and various benefits.
Requirements
- You are proficient and happy to code in Python, Java or Golang
- You have deep knowledge and understanding of Computer Engineering fundamentals and first principles
- You have a solid knowledge of AWS (EC2, EKS, Lambda, SQS, Kinesis, S3) or equivalent
- You have experience deploying and running containerized workloads on a platform like Kubernetes
- You have experience with observability tooling, including competency with tools like Elasticsearch, Grafana, Sentry, Jaeger or similar
- You have experience running highly available and reliable distributed systems, with highly scalable data stores
- You are proficient with infrastructure-as-code - we're a Terraform shop, but strong experience with other IaC tools will do the trick
Responsibilities
- Being responsible for building and improving our observability platform and tooling, which is used by all Canva engineers
- Providing technical leadership and expertise to drive pragmatic solutions and achieve impactful design decisions
- Brainstorming, researching and prototyping to optimize our tracing and exceptions platforms, improve our operational effectiveness and increase reliability
- Being proactive in improving the tracing user experience and advocating for best practices
- Finding ways to improve the use of traces and exceptions, providing better insights to our engineers
- Enhancing our exception workflow to help engineers seamlessly capture errors, gain actionable insights through clear visualizations, and set up high-signal, low-noise alerts
- Participating in team ceremonies, knowledge sharing and brainstorming sessions
- Becoming an observability champion, evangelising best practices and guiding other Canvanauts in the observability space
Preferred Qualifications
- You have experience with OpenTelemetry because it underpins a lot of the infrastructure and tooling that the team owns
- You have experience writing application code in Java or frontend code in TypeScript, since we also maintain the tracing libraries
- You have experience building and running monitoring infrastructure at scale. For example, Petabyte-scale Elasticsearch clusters or similar databases
- You have experience with data handling at scale
- You have experience with ClickHouse
- You have experience with data security, data obfuscation and PII detection
Benefits
- Equity packages - we want our success to be yours too
- Inclusive parental leave policy that supports all parents & carers
- An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more
- Flexible leave options that empower you to be a force for good, take time to recharge and support you personally