This shows you the differences between two versions of the page.
| topics_not_covered [2024/01/27 23:31] burkov [Neural Network Best Practices and Tricks] | topics_not_covered [2025/06/27 15:57] (current) 157.48.252.200 | | |
|---|---|---|---|
| Line 17: | Line 17: | ||
| * [[https://arxiv.org/abs/2203.02155|Training language models to follow instructions with human feedback]] by Ouyang et al. (2022) (The InstructGPT paper) | * [[https://arxiv.org/abs/2203.02155|Training language models to follow instructions with human feedback]] by Ouyang et al. (2022) (The InstructGPT paper) | ||
| + | * [[https://blog.matdmiller.com/posts/2023-06-10_transformers/notebook.html|Transformers From Scratch]] by Mat Miller (2024) | ||
| ===== Neural Network Best Practices and Tricks ===== | ===== Neural Network Best Practices and Tricks ===== | ||
| Line 22: | Line 23: | ||
| * [[https://github.com/huggingface/peft|PEFT: State-of-the-art Parameter-Efficient Fine-Tuning methods]] by Hugging Face | * [[https://github.com/huggingface/peft|PEFT: State-of-the-art Parameter-Efficient Fine-Tuning methods]] by Hugging Face | ||
| * [[https://www.philschmid.de/fine-tune-llms-in-2024-with-trl|How to Fine-Tune LLMs in 2024 with Hugging Face]] by Philipp Schmid (2024) | * [[https://www.philschmid.de/fine-tune-llms-in-2024-with-trl|How to Fine-Tune LLMs in 2024 with Hugging Face]] by Philipp Schmid (2024) | ||
| + | * [[https://towardsdatascience.com/merge-large-language-models-with-mergekit-2118fb392b54|Merge Large Language Models with mergekit]] by Maxime Labonne (2024) | ||