  * [[https://arxiv.org/abs/1810.04805|BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]] by Devlin et al. (2018).
  * [[https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf|Improving Language Understanding by Generative Pre-Training]] by Radford et al. (2018) (the GPT paper).
  * [[https://arxiv.org/abs/2005.14165|Language Models are Few-Shot Learners]] by Brown et al. (2020) (the GPT-3 paper).
  * [[https://arxiv.org/abs/2205.11916|Large Language Models are Zero-Shot Reasoners]] by Kojima et al. (2022) (the chain-of-thought (CoT) prompting paper).
  * [[https://arxiv.org/abs/2203.15556|Training Compute-Optimal Large Language Models]] by Hoffmann et al. (2022) (the Chinchilla paper).
  * [[https://sebastianraschka.com/blog/2023/llm-reading-list.html|Understanding Large Language Models]] by Sebastian Raschka.