  * [[https://arxiv.org/pdf/1411.2738v3.pdf|word2vec Parameter Learning Explained]] by Xin Rong (2016) (a minimal skip-gram training sketch follows after this list)
  * [[http://rohanvarma.me/Word2Vec/|Language Models, Word2Vec, and Efficient Softmax Approximations]] by Rohan Varma (2017)
  * [[https://arxiv.org/abs/1706.03762|Attention Is All You Need]] by Vaswani et al. (2017), which introduces the Transformer sequence-to-sequence architecture; see also an [[http://jalammar.github.io/illustrated-transformer/|illustrated guide]] and an [[http://nlp.seas.harvard.edu/annotated-transformer/|annotated paper with code]]
  * [[https://arxiv.org/abs/1810.04805|BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]] by Devlin et al. (2018)
  * [[https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf|Improving Language Understanding by Generative Pre-Training]] by Radford et al. (2018) (the GPT paper)
  * [[https://arxiv.org/abs/2005.14165|Language Models are Few-Shot Learners]] by Brown et al. (2020) (the GPT-3 paper)
  * [[https://arxiv.org/abs/2205.11916|Large Language Models are Zero-Shot Reasoners]] by Kojima et al. (2022) (the chain-of-thought (CoT) prompting paper)
  * [[https://arxiv.org/abs/2203.15556|Training Compute-Optimal Large Language Models]] by Hoffmann et al. (2022) (the Chinchilla paper)
  * [[https://sebastianraschka.com/blog/2023/llm-reading-list.html|Understanding Large Language Models]] by Sebastian Raschka (2023)
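The first two readings above cover the word2vec family of models. As a rough illustration of the idea they describe, the sketch below trains skip-gram embeddings with negative sampling on a toy corpus. The corpus, vector size, window, learning rate, and number of negatives are arbitrary assumptions chosen for brevity, not values taken from the cited papers.

<code python>
# Minimal skip-gram with negative sampling sketch (toy corpus, illustrative
# hyperparameters). Not an implementation from any of the papers above.
import numpy as np

rng = np.random.default_rng(0)

corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
word2id = {w: i for i, w in enumerate(vocab)}

dim, window, neg_k, lr, epochs = 16, 2, 5, 0.05, 200
W_in = rng.normal(scale=0.1, size=(len(vocab), dim))   # target (input) vectors
W_out = rng.normal(scale=0.1, size=(len(vocab), dim))  # context (output) vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(epochs):
    for pos, word in enumerate(corpus):
        t = word2id[word]
        lo_i, hi_i = max(0, pos - window), min(len(corpus), pos + window + 1)
        for ctx_pos in range(lo_i, hi_i):
            if ctx_pos == pos:
                continue
            c = word2id[corpus[ctx_pos]]
            # One positive pair (label 1) plus neg_k uniformly sampled
            # negatives (label 0); collisions with the true context are
            # ignored for brevity.
            samples = [(c, 1.0)] + [(int(rng.integers(len(vocab))), 0.0)
                                    for _ in range(neg_k)]
            grad_t = np.zeros(dim)
            for j, label in samples:
                score = sigmoid(W_in[t] @ W_out[j])
                g = (score - label) * lr   # gradient of the logistic loss
                grad_t += g * W_out[j]
                W_out[j] -= g * W_in[t]
            W_in[t] -= grad_t

# After training, rows of W_in are the learned word embeddings.
print({w: np.round(W_in[word2id[w]][:3], 2) for w in ("fox", "dog")})
</code>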