| Name | Last modified | Size | Description |
| --- | --- | --- | --- |
| Parent Directory | | - | |
| also/ | 2024-09-05 00:59 | - | |
| knowledge/ | 2022-10-10 22:13 | - | |
| Bahdanau-2015.pdf | 2015-07-01 00:00 | 434K | Attention |
| Bengio-2003.pdf | 2003-07-01 00:00 | 137K | Language Models |
| Bottou-1990.pdf | 1990-07-01 00:00 | 1.4M | Modular Models |
| Bottou-1991.pdf | 1991-07-01 00:00 | 128K | Stochastic Gradient Descent (SGD) |
| Collobert-2011.pdf | 2011-07-01 00:00 | 415K | Language Models |
| Cybenko-1989.pdf | 1989-07-01 00:00 | 598K | Universal Approximation |
| Devlin-2019.pdf | 2019-07-01 00:00 | 757K | Bidirectional Encoder Representations from Transformers (BERT) |
| Dinh-2015.pdf | 2015-07-01 00:00 | 1.6M | Normalizing Flows, Invertible Neural Networks (INN) |
| Elman-1990.pdf | 1990-07-01 00:00 | 2.8M | Recurrent Networks (RNN) |
| Girshick-2014.pdf | 2014-07-01 00:00 | 6.2M | Object Localization, R-CNN |
| Glorot-2010.pdf | 2010-07-01 00:00 | 1.6M | Rectified Linear Units (ReLU), Xavier Initialization |
| Goodfellow-2014.pdf | 2014-07-01 00:00 | 518K | Generative Adversarial Networks (GAN) |
| He-2015.pdf | 2015-07-01 00:00 | 800K | Residual Networks (ResNet), Kaiming Initialization |
| Hinton-1981.pdf | 1981-07-01 00:00 | 2.0M | Parallel Distributed Processing (PDP) |
| Hinton-2005.pdf | 2005-07-01 00:00 | 827K | Deep Belief Networks |
| Hinton-2006.pdf | 2006-07-01 00:00 | 361K | Restricted Boltzmann Machines (RBM) |
| Hochreiter-1997.pdf | 1997-07-01 00:00 | 388K | Long Short-Term Memory (LSTM) |
| HubelWiesel-1959.pdf | 1959-07-01 00:00 | 1.8M | Biological Vision |
| Ioffe-2015.pdf | 2015-07-01 00:00 | 169K | Batch Normalization (Batchnorm) |
| Isola-2017.pdf | 2017-07-01 00:00 | 1.9M | Image-to-Image translation (Pix2Pix) |
| Kaplan-2020.pdf | 2020-07-01 00:00 | 2.4M | |
| Kingma-2013.pdf | 2013-07-01 00:00 | 3.7M | Variational Autoencoders (VAE) |
| Kingma-2015.pdf | 2015-07-01 00:00 | 571K | The Adam Optimizer |
| Kipf-2017.pdf | 2017-07-01 00:00 | 853K | Graph Neural Networks (GNN) |
| Kohonen-1972.pdf | 1972-07-01 00:00 | 1.1M | Associative Memory |
| Krizhevsky-2012.pdf | 2012-07-01 00:00 | 1.4M | AlexNet, GPU, ImageNet, Dropout |
| Krogh-1991.pdf | 1991-07-01 00:00 | 1.5M | Weight Decay |
| LeCun-1989.pdf | 1989-07-01 00:00 | 1.9M | Convolutional Networks (CNN) |
| Lettvin-1959.pdf | 1959-07-01 00:00 | 2.5M | Frog Neuron Codes |
| McCullochPitts-1943.pdf | 1943-07-01 00:00 | 1.2M | Logical Neurons |
| Meng-2022.pdf | 2023-09-07 11:26 | 9.9M | |
| Mikolov-2013.pdf | 2013-07-01 00:00 | 109K | Word2Vec, Semantic Vector Composition |
| Mildenhall-2020.pdf | 2020-07-01 00:00 | 7.9M | Neural Radiance Fields (NeRF) |
| Minh-2013.pdf | 2013-07-01 00:00 | 472K | Atari, Deep Reinforcement Learning |
| Minsky-1969.pdf | 1969-07-01 00:00 | 1.0M | Limitations of Perceptrons |
| Quiroga-2005.pdf | 2005-07-01 00:00 | 403K | Single Neurons in Humans |
| Radford-2018.pdf | 2018-07-01 00:00 | 569K | Autoregressive Generative Pretrained Transformer (GPT) |
| Radford-2021.pdf | 2021-07-01 00:00 | 6.5M | Contrastive Language-Image Pre-training (CLIP) |
| Ramesh-2022.pdf | 2022-07-01 00:00 | 41M | Diffusion model Text-to-Image Generation (DALL-E 2) |
| Rosenblatt-1962.pdf | 1962-07-01 00:00 | 17M | Perceptron Learning |
| Rumelhart-1986.pdf | 1986-07-01 00:00 | 1.3M | Backpropagation |
| Scarselli-2009.pdf | 2009-07-01 00:00 | 1.4M | Graph Neural Networks (GNN) |
| Senior-2020.pdf | 2020-07-01 00:00 | 7.6M | AlphaFold, Deep Learning Computational Chemistry |
| Silver-2016.pdf | 2016-07-01 00:00 | 1.6M | AlphaGo, Neural Tree Search |
| Sohl-Dickstein-2015.pdf | 2015-07-01 00:00 | 4.2M | Diffusion Models |
| Solla-1988.pdf | 1988-07-01 00:00 | 2.4M | Cross Entropy Loss |
| Sutskever-2014.pdf | 2014-07-01 00:00 | 140K | Neural Machine Translation, Seq2Seq |
| Szegedy-2013.pdf | 2013-07-01 00:00 | 6.3M | Adversarial Examples |
| Vaswani-2017.pdf | 2017-07-01 00:00 | 2.1M | Attention is All You Need: Transformers |
| Vincent-2010.pdf | 2010-07-01 00:00 | 1.3M | Denoising Autoencoders (DAE) |
| Vinyals-2015.pdf | 2015-07-01 00:00 | 752K | Show and Tell, Image Captioning |
| Yosinksi-2014.pdf | 2014-07-01 00:00 | 481K | Transfer Learning |
| Zeiler-2014.pdf | 2014-07-01 00:00 | 35M | Visualization, Salience Maps |
| Zhang-2016.pdf | 2016-07-01 00:00 | 394K | Rethinking Generalization, Memorization |