Reinforcement Learning Engineer

poolside

📍 Remote - United States

Job highlights

Summary

Join poolside, a company pushing the boundaries of AI, and become part of our reinforcement learning team. We're focused on enhancing the reasoning and coding abilities of Large Language Models (LLMs) through reinforcement learning. This hands-on role involves the entire process, from researching new algorithms to implementing your ideas. You'll have access to thousands of GPUs and collaborate closely with a team of experts. Your mission is to advance the capabilities of foundational models. We offer a remote-first work environment with flexible hours and a comprehensive benefits package.

Requirements

  • Experience with Large Language Models (LLMs)
  • Deep knowledge of Transformers is a must
  • Strong deep learning fundamentals
  • Trained and fine-tuned LLMs from scratch
  • Extensively used and probed LLMs, with familiarity with their capabilities and limitations
  • Knowledge of, or experience with, distributed training
  • Strong machine learning and engineering background
  • Research experience
  • Experience in proposing and evaluating novel research ideas
  • Familiar with, or contributed to, the state of the art in at least one of these topics: LLMs, reinforcement learning, source code generation, continual learning
  • Is comfortable in a rapidly iterating environment
  • Is reasonably opinionated
  • Programming experience
  • Linux
  • Strong algorithmic skills
  • Python with PyTorch or Jax
  • Uses modern tools and is always looking to improve
  • Strong critical thinking and ability to question code quality policies when applicable

Responsibilities

  • Research and experiment on ways to improve reasoning and code generation for LLMs
  • Own the full experiment life cycle from idea to experimentation and integration
  • Keep up with the latest research, and be familiar with the state of the art in LLMs, RL, and code generation
  • Design, analyze, and iterate on training/fine-tuning/data generation experiments
  • Write high-quality, pragmatic code
  • Work in the team: plan future steps, discuss, and always stay in touch

Preferred Qualifications

  • Recent academic publications are nice to have
  • Prior experience in non-ML programming, especially in languages other than Python, is nice to have

Benefits

  • Fully remote work & flexible hours
  • 37 days/year of vacation & holidays
  • Health insurance allowance for you and dependents
  • Company-provided equipment
  • Wellbeing, always-be-learning and home office allowances
  • Frequent team get-togethers
  • Great diverse & inclusive people-first culture


