2025-10-10 18:36:50 +00:00 · 2025-06-08 19:15:04 +02:00 · 2025-06-08 19:15:04 +02:00 · fe60da06cf
commit fe60da06cf
parent 21219807a8
10 changed files with 3 additions and 7 deletions
--- a/src/AI/AI-llm-architecture/1.-tokenizing.md
+++ b/src/AI/AI-llm-architecture/1.-tokenizing.md
@@ -96,3 +96,4 @@ print(token_ids[:50])

 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)

+
--- a/src/AI/AI-llm-architecture/2.-data-sampling.md
+++ b/src/AI/AI-llm-architecture/2.-data-sampling.md
@@ -238,3 +238,4 @@ tensor([[  367,  2885,  1464,  1807],

 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)

+
--- a/src/AI/AI-llm-architecture/3.-token-embeddings.md
+++ b/src/AI/AI-llm-architecture/3.-token-embeddings.md
@@ -216,3 +216,4 @@ print(input_embeddings.shape) # torch.Size([8, 4, 256])

 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)

+
--- a/src/AI/AI-llm-architecture/4.-attention-mechanisms.md
+++ b/src/AI/AI-llm-architecture/4.-attention-mechanisms.md
@@ -428,4 +428,3 @@ For another compact and efficient implementation you could use the [`torch.nn.Mu
 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)


-
--- a/src/AI/AI-llm-architecture/5.-llm-architecture.md
+++ b/src/AI/AI-llm-architecture/5.-llm-architecture.md
@@ -698,4 +698,3 @@ print("Output length:", len(out[0]))
 ## References

 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
-
--- a/src/AI/AI-llm-architecture/6.-pre-training-and-loading-models.md
+++ b/src/AI/AI-llm-architecture/6.-pre-training-and-loading-models.md
@@ -969,4 +969,3 @@ There 2 quick scripts to load the GPT2 weights locally. For both you can clone t
 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)


-
--- a/src/AI/AI-llm-architecture/7.0.-lora-improvements-in-fine-tuning.md
+++ b/src/AI/AI-llm-architecture/7.0.-lora-improvements-in-fine-tuning.md
@@ -61,4 +61,3 @@ def replace_linear_with_lora(model, rank, alpha):
 ## References

 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
-
--- a/src/AI/AI-llm-architecture/7.1.-fine-tuning-for-classification.md
+++ b/src/AI/AI-llm-architecture/7.1.-fine-tuning-for-classification.md
@@ -114,4 +114,3 @@ You can find all the code to fine-tune GPT2 to be a spam classifier in [https://
 ## References

 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
-
--- a/src/AI/AI-llm-architecture/7.2.-fine-tuning-to-follow-instructions.md
+++ b/src/AI/AI-llm-architecture/7.2.-fine-tuning-to-follow-instructions.md
@@ -104,4 +104,3 @@ You can find an example of the code to perform this fine tuning in [https://gith
 ## References

 - [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
-
--- a/src/AI/AI-llm-architecture/README.md
+++ b/src/AI/AI-llm-architecture/README.md
@@ -97,4 +97,3 @@ You should start by reading this post for some basic concepts you should know ab
 7.2.-fine-tuning-to-follow-instructions.md
 {{#endref}}

-