We are looking for a Research Engineer (Post-Training Evals) who can lead the design and implementation of evaluation systems that rigorously test, stress, and measure AI model performance after training. This role is pivotal to ensuring that models not only perform well, but also align ethically, functionally, and contextually.
What You’ll Do
- Develop and maintain automated evaluation pipelines for large language models across multiple benchmarks and custom metrics.
- Design novel evaluation frameworks for robustness, alignment, reasoning, factuality, safety, and bias.
- Analyze model outputs qualitatively and quantitatively to surface strengths, weaknesses, and edge cases.
- Build tooling to simulate real-world use cases and generate synthetic test datasets.
- Collaborate with research, alignment, and deployment teams to shape evaluation criteria and improve model reliability.
- Explore and implement scalable methods for both static and interactive model assessment.
- Publish findings, dashboards, and documentation to inform internal stakeholders and model developers.
Skills We’re Looking For
- Strong background in NLP, ML evaluation, or applied machine learning.
- Proficiency in Python, with experience using ML/NLP libraries such as Hugging Face Transformers, PyTorch, or JAX.
- Familiarity with model evaluation frameworks, from traditional benchmarks to custom prompt-based tests.
- Experience working with LLMs, language tasks (e.g., QA, summarization, reasoning), and prompt engineering.
- Analytical rigor with the ability to communicate technical findings clearly and concisely.
- Comfortable navigating experimental environments with evolving research priorities.
We’d Love It If You Have
- Experience contributing to LLM eval datasets, leaderboards, or internal benchmarks.
- Knowledge of human preference modeling, RLHF, or safety alignment methods.