Paper Page - OLMoTrace: Tracing Language Model Outputs Back To Trillions Of Training Tokens

We release OLMoTrace, a tool that lets you trace the outputs of language models back to their full, multi-trillion-token training data in real time. We developed OLMoTrace to raise transparency & trust in LLMs.

On top of a standard chatbot experience, OLMoTrace highlights long pieces of LLM outputs that appear verbatim in the model’s training data, and shows the matching training documents. With OLMoTrace, you can see how LLMs may have learned to generate certain sequences of tokens. OLMoTrace is useful for fact checking ✅, understanding hallucinations 🎃, tracing LLM-generated “creative” expressions 🧑‍🎨, tracing reasoning capabilities 🧮, or just generally helping you understand why LLMs say certain things.

OLMoTrace is now available for the OLMo 2 and OLMoE family of models on Ai2 Playground. We also open-source our code so that anyone can enable OLMoTrace with their model’s training data.

Paper: https://allenai.org/papers/olmotrace
Blog: https://allenai.org/blog/olmotrace
Try OLMoTrace on Ai2 Playground: https://playground.allenai.org

Source link

What's Hot

Baker’s Late Goal Lifts Johns Hopkins Over MIT, 10-9

OpenAI’s hardware push may include a Humane-like pin and smart wearables | Technology News

My Best Advice for Young People (as a millionaire)

Paper page – OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Research Paper – Takara TLDR

2D Gaussian Splatting with Semantic Alignment for Image Inpainting – Takara TLDR

The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward – Takara TLDR

Acquavella Signs Harumi Klossowska de Rola, Daughter of Balthus

Heirs of Jewish Collector Urge Court to Reconsider Claim to Sunflowers

Art World Figures Remember Agnes Gund: ‘a Legend and Icon’

Bizarre Trump Bitcoin Statue Appears in Washington, D.C.

Baker’s Late Goal Lifts Johns Hopkins Over MIT, 10-9

OpenAI’s hardware push may include a Humane-like pin and smart wearables | Technology News

My Best Advice for Young People (as a millionaire)

What's Hot

Paper page – OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Related Posts

Subscribe to Updates