mirror of https://github.com/HackTricks-wiki/hacktricks.git (synced 2025-10-10 18:36:50 +00:00)
f
parent fe60da06cf
commit 18e9ee8566
@ -297,4 +297,3 @@ During the backward pass:
- **Efficiency:** Avoids redundant calculations by reusing intermediate results.
- **Accuracy:** Provides exact derivatives up to machine precision.
- **Ease of Use:** Eliminates manual computation of derivatives.
@ -96,4 +96,3 @@ print(token_ids[:50])
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -238,4 +238,3 @@ tensor([[ 367, 2885, 1464, 1807],
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -216,4 +216,3 @@ print(input_embeddings.shape) # torch.Size([8, 4, 256])
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -427,4 +427,3 @@ For another compact and efficient implementation you could use the [`torch.nn.Mu
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -697,4 +697,4 @@ print("Output length:", len(out[0]))
## References

- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -968,4 +968,3 @@ There 2 quick scripts to load the GPT2 weights locally. For both you can clone t
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -60,4 +60,4 @@ def replace_linear_with_lora(model, rank, alpha):
## References

- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -113,4 +113,4 @@ You can find all the code to fine-tune GPT2 to be a spam classifier in [https://
## References

- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -103,4 +103,4 @@ You can find an example of the code to perform this fine tuning in [https://gith
## References

- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
- [https://www.manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
@ -96,4 +96,3 @@ You should start by reading this post for some basic concepts you should know ab
{{#ref}}
7.2.-fine-tuning-to-follow-instructions.md
{{#endref}}