Research Scientist, Large-Scale Learning


Together AI

📍Remote - United States, Netherlands

Summary

Join Together AI's Model Shaping team as a Research Scientist in Large-Scale Learning and contribute to increasing the efficiency of training foundation models. You will analyze state-of-the-art neural network training techniques, propose and implement new approaches, and present your findings at leading ML/ML Systems conferences. Collaborate with Machine Learning Engineers to integrate improvements into Together's platform. This role demands autonomous research design, implementation, and validation, along with strong communication skills. The ideal candidate will have a proven publication record and a passion for applying research to real-world impact.

Requirements

  • Can autonomously design, implement, and validate your research ideas
  • Write high-quality, efficient code in Python and PyTorch
  • Have first-author publications at leading ML or ML Systems conferences (ICLR, ICML, NeurIPS, MLSys)
  • Communicate clearly, both when discussing research plans with other scientists and when explaining them to a broader audience
  • Follow the latest advances in relevant subfields of AI
  • Are passionate about seeing your research create real-world impact through Together AI's services and willing to work hands-on with production systems to achieve it

Responsibilities

  • Define and drive the research agenda around efficiency and performance of foundation model training
  • Study recent results from the broader AI research community, analyzing their relevance to the team’s research directions and ongoing projects
  • Conduct experiments to empirically validate your hypotheses and compare the outcomes with relevant related work
  • Share your findings both internally and externally (e.g., at top-tier conferences on ML and ML Systems)
  • Partner with Machine Learning Engineers to integrate advanced methods into Together’s Model Shaping platform

Preferred Qualifications

  • Algorithmic modifications of large neural network training (e.g., novel optimization algorithms or model adaptation techniques)
  • Distributed optimization (including federated learning, communication-efficient optimization, and decentralized training)
  • ML systems optimizations for distributed training, memory efficiency, or compute efficiency
  • Writing optimized NVIDIA GPU kernels or communication collectives using NVIDIA’s networking stack (e.g., NCCL or NVSHMEM)
  • Running large-scale experiments on GPU clusters

Benefits

  • Health insurance
  • Startup equity
  • Flexible remote work arrangements

