Browsing: Yannic Kilcher
#ai #research #blockchain Big Tech is currently dominating the pursuit of ever more capable AI. This happens behind closed doors…
#ai #science #transformers Autoregressive Transformers have taken over the world of Language Modeling (e.g., GPT-3). However, in order to train them,…
#deeplearning #kernels #neuralnetworks Full Title: "Every Model Learned by Gradient Descent Is Approximately a Kernel Machine". Deep Neural Networks are…
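For reference, the paper's headline claim has a compact form; the following is a hedged transcription of Domingos' main result (the path kernel K_p, coefficients a_i, and offset b are the paper's notation):

```latex
% Every model trained by gradient descent is approximately a kernel
% machine over its training examples x_i (Domingos, 2020):
\[
  f(x) \;\approx\; \sum_i a_i \, K_p(x, x_i) + b,
  \qquad
  K_p(x, x') \;=\; \int_{c(t)} \nabla_w f_w(x) \cdot \nabla_w f_w(x') \, dt,
\]
% where c(t) is the path the weights w take during gradient descent, and
% a_i, b depend on the loss gradients accumulated along that path.
```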
#transformer #nystromer #nystromformer The Nyströmformer (or Nystromformer, Nyströmer, Nystromer) is a new drop-in replacement for approximating the Self-Attention matrix in…
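To make "approximating the Self-Attention matrix" concrete, here is a minimal NumPy sketch of Nyström-style attention with segment-mean landmarks, in the spirit of the paper; the exact pseudoinverse stands in for the paper's iterative scheme, and all names and the landmark count are illustrative, not the authors' code:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def nystrom_attention(Q, K, V, num_landmarks=8):
    """Nystrom approximation of softmax attention (illustrative sketch).

    Landmarks are segment means of Q and K; the full n x n attention
    matrix is never materialized, giving O(n*m) cost for m landmarks.
    """
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    # Segment-mean landmarks: average consecutive chunks of rows.
    Q_l = Q.reshape(num_landmarks, n // num_landmarks, d).mean(axis=1)
    K_l = K.reshape(num_landmarks, n // num_landmarks, d).mean(axis=1)
    F = softmax(Q @ K_l.T * scale)      # (n, m) queries vs. landmark keys
    A = softmax(Q_l @ K_l.T * scale)    # (m, m) landmark core matrix
    B = softmax(Q_l @ K.T * scale)      # (m, n) landmark queries vs. keys
    return F @ np.linalg.pinv(A) @ (B @ V)

# Usage: sequence length must divide evenly by num_landmarks in this sketch.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 64, 16))
print(nystrom_attention(Q, K, V).shape)  # (64, 16)
```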
#nfnets #deepmind #machinelearning Batch Normalization is a core component of modern deep learning. It enables training at higher batch sizes,…
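The NFNets paper drops Batch Normalization and compensates with, among other things, Adaptive Gradient Clipping (AGC). A minimal row-wise NumPy sketch of the AGC rule follows; the paper clips unit-wise over each unit's fan-in, and the clip value 0.01 here is illustrative:

```python
import numpy as np

def adaptive_gradient_clip(W, G, clip=0.01, eps=1e-3):
    """Adaptive Gradient Clipping (AGC), simplified to per-row clipping.

    Rescales each row's gradient so its norm never exceeds `clip` times
    the norm of the corresponding weight row; this ratio-based clipping
    is a key ingredient that lets normalizer-free networks train stably
    at large batch sizes.
    """
    w_norm = np.maximum(np.linalg.norm(W, axis=-1, keepdims=True), eps)
    g_norm = np.maximum(np.linalg.norm(G, axis=-1, keepdims=True), 1e-6)
    clipped = G * (clip * w_norm / g_norm)
    return np.where(g_norm > clip * w_norm, clipped, G)
```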
#transformer #gan #machinelearning Generative Adversarial Networks (GANs) hold the state of the art in image generation. However, while the rest…
#dreamer #deeprl #reinforcementlearning Model-Based Reinforcement Learning has been lagging behind Model-Free RL on Atari, especially among single-GPU algorithms. This collaboration…
#deberta #bert #huggingface DeBERTa by Microsoft is the next iteration of BERT-style Self-Attention Transformer models, surpassing RoBERTa and setting a new State-of-the-art in…
#fastweights #deeplearning #transformers Transformers are dominating Deep Learning, but their quadratic memory and compute requirements make them expensive to train…
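The Fast Weights connection: linearized attention can be read as writing each key-value pair into a fast weight matrix and reading it back with the query, giving O(n) cost in sequence length instead of the O(n²) softmax attention matrix. A minimal NumPy sketch under that reading; the feature map and normalization choices are illustrative, not the paper's exact formulation:

```python
import numpy as np

def elu_plus_one(x):
    # Simple positive feature map: ELU(x) + 1.
    return np.where(x > 0, x + 1.0, np.exp(x))

def fast_weight_attention(Q, K, V):
    """Linear attention as a fast-weight programmer (sketch).

    Processes the sequence left to right: each step writes a rank-1
    update outer(v_t, phi(k_t)) into the fast weight matrix W, then
    reads the output with the (featurized) query.
    """
    n, d = Q.shape
    W = np.zeros((V.shape[1], d))   # fast weights
    z = np.zeros(d)                 # running normalizer
    out = np.zeros_like(V)
    for t in range(n):
        k, q = elu_plus_one(K[t]), elu_plus_one(Q[t])
        W += np.outer(V[t], k)      # write
        z += k
        out[t] = (W @ q) / (z @ q + 1e-6)  # read
    return out
```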
#glom #hinton #capsules Geoffrey Hinton describes GLOM, a Computer Vision model that combines transformers, neural fields, contrastive learning, capsule networks,…