OpenAI co-founder and Eureka Labs founder Andrej Karpathy has released nanochat, an open-source project that provides a full-stack training and inference pipeline for a simple ChatGPT-style model. The repository follows his earlier project, nanoGPT, which focused only on pretraining.
In a post on X, Karpathy said, “You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.”
The repo consists of about 8,000 lines of code and covers the entire pipeline. It includes tokeniser training in Rust and pretraining a Transformer LLM on FineWeb. The pipeline also handles mid-training on user-assistant conversations and multiple-choice questions, supervised fine-tuning (SFT), and optional reinforcement learning (RL) with GRPO. Finally, it supports efficient inference with KV caching.
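KV caching, mentioned as the final stage, is a standard trick for fast autoregressive inference: each token's attention keys and values are computed once, stored, and reused at every later decoding step instead of being recomputed over the whole prefix. The toy PyTorch snippet below sketches the general idea only; it is not nanochat's implementation, and the single attention head, random weights and dimensions are stand-ins.

```python
# Minimal sketch of KV-cached decoding (illustrative only, not nanochat's code).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d_model = 16                       # toy embedding size
wq = torch.randn(d_model, d_model)
wk = torch.randn(d_model, d_model)
wv = torch.randn(d_model, d_model)

k_cache, v_cache = [], []          # grows by one entry per generated token

def decode_step(x_t):
    """Attend the newest token over all previously cached keys/values."""
    q = x_t @ wq                   # (1, d_model)
    k_cache.append(x_t @ wk)       # cache this position's key
    v_cache.append(x_t @ wv)       # cache this position's value
    K = torch.cat(k_cache, dim=0)  # (t, d_model)
    V = torch.cat(v_cache, dim=0)  # (t, d_model)
    attn = F.softmax(q @ K.T / d_model**0.5, dim=-1)
    return attn @ V                # (1, d_model)

# Simulate generating 5 tokens: each step reuses the cache instead of
# recomputing keys and values for the entire prefix.
for t in range(5):
    x_t = torch.randn(1, d_model)  # stand-in for the current token's embedding
    out = decode_step(x_t)
    print(f"step {t}: cache length = {len(k_cache)}, output shape = {tuple(out.shape)}")
```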
Users can interact with the model through a command-line interface or a web UI, and the system generates a markdown report summarising performance.
Karpathy explained that the model can be trained at different scales depending on time and budget. A small ChatGPT clone can be trained for around $100 in roughly 4 hours on an 8×H100 GPU node, which is enough for basic conversation.
Training for about 12 hours lets the model surpass GPT-2 on the CORE benchmark. Scaling up to approximately $1,000, or around 42 hours of training, produces a model that is noticeably more coherent and can solve simple math and coding problems, as well as answer multiple-choice questions.
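For context, those price points are consistent with simple arithmetic if one assumes the 8×H100 node rents for roughly $24 per hour; the rate is an assumption of this sketch, not a figure from the article, and actual cloud pricing varies.

```python
# Back-of-the-envelope check of the quoted tiers, assuming the 8xH100 node
# rents for about $24/hour (assumed rate; real cloud prices vary).
HOURLY_RATE = 24.0  # USD per hour for the whole node

for hours in (4, 12, 42):
    cost = hours * HOURLY_RATE
    print(f"~{hours:>2} h of training -> roughly ${cost:,.0f}")

# ~ 4 h of training -> roughly $96     (the ~$100 tier)
# ~12 h of training -> roughly $288
# ~42 h of training -> roughly $1,008  (the ~$1,000 tier)
```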
“My goal is to get the full ‘strong baseline’ stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed),” Karpathy said. LLM101n is an undergraduate-level class at Eureka Labs that will guide students through the process of building their own AI model. Karpathy added that the project could grow into a research harness or benchmark, similar to nanoGPT.