Yannic Kilcher
Meta’s Llama 3 is out. New model, new license, new opportunities.
Google researchers achieve supposedly infinite context attention via compressive memory. Paper: Abstract: This work introduces an efficient method to scale…
OUTLINE: 0:00 – Intro 0:21 – Debunking Devin: “First AI Software Engineer” Upwork lie exposed! 07:24 – NeurIPS 2024 will…
Paper: Abstract: While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs.…
OUTLINE: 0:00 – Intro 0:19 – Our next-generation Meta Training and Inference Accelerator 01:39 – ALOHA Unleashed 03:10 – Apple…
Paper: Abstract: While recent preference alignment algorithms for language models have demonstrated promising results, supervised fine-tuning (SFT) remains imperative for…
#gpt4o #sky #scarlettjohansson After the release of their flagship model GPT-4o, OpenAI finds itself in multiple controversies and an exodus…
xLSTM is an architecture that combines the recurrence and constant memory requirement of LSTMs with the large-scale training of transformers…
#rag #hallucinations #legaltech An in-depth look at a recent Stanford paper examining the degree of hallucinations in various LegalTech tools…