[Remote] Research Intern (LLM)
Note: This is a remote position open to candidates in the USA. Abaka AI focuses on advancing artificial intelligence research and is seeking a Research Intern to contribute to the development of challenging QA datasets and the evaluation of large language models. The role involves collaboration with global researchers and requires strong analytical and execution skills.
Responsibilities
- Design and construct high-quality, sufficiently challenging QA datasets (graduate/PhD level) inspired by GPQA, HLE, and AI4Sci families, collaborating with a global network of talented researchers
- Evaluate large language models on reasoning, factuality, and problem-solving benchmarks
- Develop review pipelines and quality-control criteria for expert-level question generation
- Analyze model outputs, conduct error taxonomy studies, and summarize insights for internal reports and research papers
- Collaborate with the 2077AI Foundation’s open-source benchmark teams on public dataset releases
Skills
- Strong background in computer science, data engineering, artificial intelligence, or related fields, with hands-on experience in large-scale data systems
- 1+ years of experience with LLMs, prompt engineering, and evaluation frameworks (e.g., LM Eval Harness, OpenCompass)
- Excellent written and verbal English skills and strong analytical reasoning
- Strong execution and team management skills: able to translate high-level objectives into actionable plans and drive team outcomes
- Experience with formal methods, chain-of-thought evaluation, or curriculum generation
- Relevant publications in top conferences