[Remote] Student Researcher [LLM Post Training – Agent & Reinforcement Learning] - 2026 Start (PhD)

Remote, USA Full-time Posted 2025-11-24

Note: This is a remote position open to candidates in the USA. ByteDance is dedicated to pioneering advanced AI foundation models and is seeking a Student Researcher for its Seed LLM Post Training team. The role involves researching and developing advanced technologies in reinforcement learning and agent capabilities.


Responsibilities

  • Develop generalized agents capable of solving complex real-world tasks through long-horizon reasoning, memory, and multi-turn interaction
  • Tackle the challenges of large-scale reinforcement learning, building systems that can scale across compute, data, and environments to improve model intelligence and alignment with human preferences
  • Advance agent capabilities in long-horizon, multi-step reasoning across diverse domains, aiming to match or surpass expert-level performance
  • Explore planning, tool use, and feedback mechanisms to enhance agent robustness and adaptability across domains

Skills

  • Currently pursuing a PhD in Computer Science, AI, or a related field
  • Research experience in reinforcement learning, sequential decision-making, or agent behavior
  • First-author publications in top-tier ML/AI conferences (e.g., NeurIPS, ICLR, ICML)
  • Solid programming and experimentation skills, including with RL or LLM frameworks
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
  • Experience with LLM agents, tool use, or prompt-based control
  • Familiarity with environments such as WebArena, ALFWorld, or programmatic reasoning tasks
  • Understanding of RL techniques such as reward shaping, memory augmentation, or curriculum learning

Benefits

  • Interns have day one access to health insurance
  • Life insurance
  • Wellbeing benefits and more
  • Interns also receive 10 paid holidays per year
  • Paid sick time (56 hours if hired in the first half of the year, 40 hours if hired in the second half)
  • Interns who are not working 100% remote may also be eligible for housing allowance

Company Overview

  • ByteDance is a technology company that develops content creation platforms and services. It was founded in 2012 and is headquartered in Beijing, China, with more than 10,000 employees. Its website is http://bytedance.com.

Company H1B Sponsorship

  • ByteDance has a track record of offering H1B sponsorships, with 1350 in 2025, 1123 in 2024, 775 in 2023, 487 in 2022, 417 in 2021, 245 in 2020. Please note that this does not guarantee sponsorship for this specific role.
