Machine Learning Researcher/Engineer

Pathway

๐Ÿ“Remote - United States

Job highlights

Summary

Join Pathway, a VC-funded AI startup, as an R&D Engineer specializing in attention-based models. This is a foundational project with a substantial GPU budget. You will perform distributed model training, improve model architectures, design experiments, and potentially oversee data preparation. The role requires a strong machine learning research background, experience with model training (PyTorch, JAX, or TensorFlow), and a good understanding of GPU architecture. Remote work is possible, with offices in Menlo Park, Paris, and Wroclaw. Compensation includes a six-figure salary and an Employee Stock Option Plan.

Requirements

  • You have published at least one paper at NeurIPS, ICLR, or ICML, where you were the lead author or made significant conceptual and code contributions
  • You have significantly contributed to an LLM training effort that became newsworthy (topped a Hugging Face benchmark, best-in-class model, etc.), preferably using multiple GPUs
  • You have spent at least 6 months working in a leading Machine Learning research center (e.g., Google Brain / DeepMind, Apple, Meta, Anthropic, Nvidia, MILA)
  • You were an ICPC World Finalist, or an IOI, IMO, or IPhO medalist in high school
  • Be a deep learning researcher, with a track record in Language Models and/or RL (candidates with a Vision or Robotics ML background are also welcome to apply)
  • Be interested in improving foundational architectures and creating new benchmarks
  • Be experienced at hands-on experiments and model training (PyTorch, JAX, or TensorFlow)
  • Have a good understanding of GPU architecture, memory design, and communication
  • Have a good understanding of graph algorithms
  • Have some familiarity with model monitoring, git, build systems, and CI/CD
  • Be respectful of others
  • Be fluent in English

Responsibilities

  • Perform (distributed) model training
  • Help improve/adapt model architectures based on experiment results
  • Design new tasks and experiments
  • Optionally: oversee activities of team members involved in data preparation

Preferred Qualifications

  • Knowledge of approaches used in distributed training
  • Familiarity with Triton
  • Successful track record in algorithms and data science contests
  • A code portfolio you can show

Benefits

  • Six-figure annual salary based on profile and location + Employee Stock Option Plan
  • Remote work. Possibility to work or meet with other team members in one of our offices: Menlo Park, CA, Paris, France, or Wroclaw, Poland
