recurrent_neural_network

Differences

This shows you the differences between two versions of the page.

recurrent_neural_network [2018/11/02 18:03]
38.84.140.72 created
recurrent_neural_network [2023/12/24 23:47] (current)
135.23.195.80 [Recommended Reading]
Line 1: Line 1:
-====== Recurrent Neural Network ======
 +{{keywords>Recurrent Neural Network}}
 +
 +<title classes #id>
 +Recurrent Neural Network
 +</title>
  
 ===== Recommended Reading =====
  
 +  * [[https://www.dropbox.com/s/ouj8ddydc77tewo/ExtendedChapter6.pdf?dl=0|An extended version of Chapter 6 with RNN unfolding and bidirectional RNN]]
   * [[http://karpathy.github.io/2015/05/21/rnn-effectiveness/|The Unreasonable Effectiveness of Recurrent Neural Networks]] by Andrej Karpathy (2015)
   * [[https://towardsdatascience.com/recurrent-neural-networks-and-lstm-4b601dd822a5|Recurrent Neural Networks and LSTM]] by Niklas Donges (2018)
Line 10: Line 15:
   * [[http://www.wildml.com/2015/10/recurrent-neural-networks-tutorial-part-3-backpropagation-through-time-and-vanishing-gradients/|Backpropagation Through Time and Vanishing Gradients]] by Denny Britz (2015)
   * [[http://www.wildml.com/2015/10/recurrent-neural-network-tutorial-part-4-implementing-a-grulstm-rnn-with-python-and-theano/|Implementing a GRU/LSTM RNN with Python and Theano]] by Denny Britz (2015)
 +  * [[https://arxiv.org/abs/1701.03452|Simplified Minimal Gated Unit Variations for Recurrent Neural Networks]] by Joel Heck and Fathi Salem (2017)
 +  * [[https://arxiv.org/abs/1706.03762|Attention Is All You Need]] by Vaswani et al. (2017), which introduces the Transformer sequence-to-sequence model; see also an [[http://jalammar.github.io/illustrated-transformer/|illustrated guide]] and an [[http://nlp.seas.harvard.edu/annotated-transformer/|annotated paper with code]].
 +  * [[https://arxiv.org/abs/2203.15556|Training Compute-Optimal Large Language Models]] by Hoffmann et al. (2022), the Chinchilla paper.
 +  * [[https://sebastianraschka.com/blog/2023/llm-reading-list.html|Understanding Large Language Models]] by Sebastian Raschka (2023)