Software Engineer, Analytics Platform

OpenAI

📍 Remote - United States

Summary

Join OpenAI's Research Platform Analytics team as a pragmatic and passionate engineer focused on enhancing the data experience for engineers and scientists. You will build and maintain large-scale data processing pipelines, develop a general-purpose data platform for petabyte-scale datasets, and ensure the scalability and reliability of our infrastructure. This role involves hands-on infrastructure work, including deploying and troubleshooting core services. The position is based in San Francisco or remote within the US, utilizing a hybrid work model. You'll collaborate with various teams to deliver impactful data tooling and systems, contributing to OpenAI's mission of accelerating research towards AGI.

Requirements

  • Proficient in Python and backend development, with experience working in large codebases (monorepos)
  • Experience building and operating large-scale stream and batch processing pipelines (Kafka, Spark, Flink, Presto/Trino)
  • Hands-on experience with Kubernetes, Terraform, and deploying/troubleshooting production systems
  • Experience with access control, provenance, auditing, and large-scale data movement
  • Passion for building systems that provide key insights, especially in ML training workflows
  • Comfortable in a fast-moving environment, making trade-offs to deliver impact quickly

Responsibilities

  • Build and maintain large-scale stream and batch processing pipelines (Kafka, Spark, Flink, Trino/Presto)
  • Develop a general-purpose data processing platform for handling massive datasets
  • Scale applications for ML research, ensuring smooth operation as workloads grow
  • Ensure the security, integrity, and compliance of data according to industry and company standards
  • Ensure our analytics and data platforms can scale reliably to the next several orders of magnitude
  • Accelerate company productivity by empowering your fellow engineers, researchers, and teammates with excellent data tooling and systems, providing a best-in-class experience
  • Bring new features and capabilities to the world by partnering with product engineers, trust & safety and other teams to build the technical foundations
  • Like all other teams, we are responsible for the reliability of the systems we build. This includes an on-call rotation to respond to critical incidents as needed

Preferred Qualifications

  • Understanding of data transformations in ML training and inference workflows

Benefits

  • This role is based in San Francisco, CA or open to being remote within the US
  • We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees
