Summary

Join poolside, a company pushing the boundaries of AI, and become part of our reinforcement learning team. We're focused on enhancing the reasoning and coding abilities of Large Language Models (LLMs) through reinforcement learning. This hands-on role involves the entire process, from researching new algorithms to implementing your ideas. You'll have access to thousands of GPUs and collaborate closely with a team of experts. Your mission is to advance the capabilities of foundational models. We offer a remote-first work environment with flexible hours and a comprehensive benefits package.

Requirements

Experience with Large Language Models (LLMs)
Deep knowledge of Transformers is a must
Strong deep learning fundamentals
Has trained LLMs from scratch and fine-tuned them
Has extensively used and probed LLMs, and is familiar with their capabilities and limitations
Knowledge of, or experience with, distributed training
Strong machine learning and engineering background
Research experience
Experience in proposing and evaluating novel research ideas
Familiar with, or has contributed to, the state of the art in at least one of these topics: LLMs, reinforcement learning, source code generation, continual learning
Is comfortable in a rapidly iterating environment
Is reasonably opinionated
Programming experience
Linux
Strong algorithmic skills
Python with PyTorch or JAX
Uses modern tools and is always looking to improve
Strong critical thinking and ability to question code quality policies when applicable

Responsibilities

Research and experiment on ways to improve reasoning and code generation for LLMs
Own the full experiment life cycle from idea to experimentation and integration
Keep up with the latest research, and be familiar with the state of the art in LLMs, RL, and code generation
Design, analyze, and iterate on training/fine-tuning/data generation experiments
Write high-quality, pragmatic code
Work as part of the team: plan future steps, discuss, and always stay in touch

Preferred Qualifications

Recent academic publications are nice to have
Prior experience in non-ML programming, especially in languages other than Python, is nice to have

Benefits

Fully remote work & flexible hours
37 days/year of vacation & holidays
Health insurance allowance for you and dependents
Company-provided equipment
Wellbeing, always-be-learning and home office allowances
Frequent team get-togethers
Great diverse & inclusive people-first culture

Reinforcement Learning Engineer

poolside

Remote

Software Development

Mid-level
