Differences

This shows you the differences between two versions of the page.

--- topics_not_covered [2024/01/27 23:31]
burkov [Neural Network Best Practices and Tricks]
+++ topics_not_covered [2025/06/27 15:57] (current)
157.48.252.200
@@ Line 17: / Line 17: @@
   * [[https://arxiv.org/abs/2203.02155|Training language models to follow instructions with human feedback]] by Ouyang et al. (2022) (The InstructGPT paper)
+  * [[https://blog.matdmiller.com/posts/2023-06-10_transformers/notebook.html|Transformers From Scratch]] by Mat Miller (2023)
 ===== Neural Network Best Practices and Tricks =====
@@ Line 22: / Line 23: @@
   * [[https://github.com/huggingface/peft|PEFT: State-of-the-art Parameter-Efficient Fine-Tuning methods]] by Hugging Face
   * [[https://www.philschmid.de/fine-tune-llms-in-2024-with-trl|How to Fine-Tune LLMs in 2024 with Hugging Face]] by Philipp Schmid (2024)
+  * [[https://towardsdatascience.com/merge-large-language-models-with-mergekit-2118fb392b54|Merge Large Language Models with mergekit]] by Maxime Labonne (2024)

The Hundred-Page Machine Learning Book