We are looking for a Research Engineer (Post-Training Evals) who can lead the design and implementation of evaluation systems that rigorously test, stress, and measure AI model performance after training. This role is pivotal to ensuring that models not only perform well, but also align ethically, functionally, and contextually.
What You’ll Do
- Develop and maintain automated evaluation pipelines for large language models across multiple benchmarks and custom metrics.
- Design novel evaluation frameworks for robustness, alignment, reasoning, factuality, safety, and bias.
- Analyze model outputs qualitatively and quantitatively to surface strengths, weaknesses, and edge cases.
- Build tooling to simulate real-world use cases and generate synthetic test datasets.
- Collaborate with research, alignment, and deployment teams to shape evaluation criteria and improve model reliability.
- Explore and implement scalable methods for both static and interactive model assessment.
- Publish findings, dashboards, and documentation to inform internal stakeholders and model developers.
Skills We’re Looking For
- Strong background in NLP, ML evaluation, or applied machine learning.
- Proficiency in Python, with experience using ML/NLP libraries such as Hugging Face Transformers, PyTorch, or JAX.
- Familiarity with model evaluation frameworks, from traditional benchmarks to custom prompt-based tests.
- Experience working with LLMs, language tasks (e.g., QA, summarization, reasoning), and prompt engineering.
- Analytical rigor with the ability to communicate technical findings clearly and concisely.
- Comfortable navigating experimental environments with evolving research priorities.
We’d Love It If You Have
- Experience contributing to LLM eval datasets, leaderboards, or internal benchmarks.
- Knowledge of human preference modeling, RLHF, or safety alignment methods.