Machine Learning Researcher / Engineer

Pathway
Summary
Join Pathway, a company building Live AI™ systems, as an R&D Engineer to contribute to an ambitious foundational project involving attention-based models. This role offers a flexible GPU budget and the opportunity to perform distributed model training, improve model architectures, design new tasks and experiments, and potentially oversee data preparation activities. The position requires a strong track record in machine learning model research and hands-on experience with model training using PyTorch, JAX, or TensorFlow. Successful candidates will have a good understanding of GPU architecture and graph algorithms, and some familiarity with model monitoring and CI/CD. Remote work is possible, with offices in Palo Alto, Paris, and Wroclaw. Compensation includes a six-digit annual salary and an Employee Stock Option Plan.
Requirements
- You have published at least one paper at NeurIPS, ICLR, or ICML, where you were the lead author or made significant conceptual and code contributions
- You have significantly contributed to an LLM training effort that became newsworthy (topped a Hugging Face benchmark, best-in-class model, etc.), preferably using multiple GPUs
- You have spent at least 6 months working in a leading machine learning research center (e.g. Google Brain / DeepMind, Apple, Meta, Anthropic, Nvidia, MILA)
- You were an ICPC World Finalist, or an IOI, IMO, or IPhO medalist in high school
- A deep learning researcher with a track record in Language Models and/or RL (candidates with a Vision or Robotics ML background are also welcome to apply)
- Interested in improving foundational architectures and creating new benchmarks
- Experienced in hands-on experiments and model training (PyTorch, JAX, or TensorFlow)
- Have a good understanding of GPU architecture, memory design, and communication
- Have a good understanding of graph algorithms
- Have some familiarity with model monitoring, git, build systems, and CI/CD
- Respectful of others
- Fluent in English
Responsibilities
- Perform (distributed) model training
- Help improve/adapt model architectures based on experiment results
- Design new tasks and experiments
- Optionally: oversee activities of team members involved in data preparation
Preferred Qualifications
- Knowledge of approaches used in distributed training
- Familiarity with Triton
- Successful track record in algorithms and data science contests
- A code portfolio you can show
Benefits
- Six-digit annual salary based on profile and location + Employee Stock Option Plan
- Remote work, with the possibility to work or meet with other team members in one of our offices: Palo Alto, CA; Paris, France; or Wroclaw, Poland
