Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

* feat(doc): organize docs, add to menu bar, fix broken formatting

* feat: add link to custom integrations

* feat: update readme for integrations to include citations and repo link

* chore: update lm_eval info

* chore: use fullname

* Update docs/cli.qmd per suggestion

Co-authored-by: Dan Saunders <danjsaund@gmail.com>

* feat: add sweep doc

* feat: add kd doc

* fix: remove toc

* fix: update deprecation

* feat: add more info about chat_template issues

* fix: heading level

* fix: shell->bash code block

* fix: ray link

* fix(doc): heading level, header links, formatting

* feat: add grpo docs

* feat: add style changes

* fix: wrong cli arg for lm-eval

* fix: remove old run method

* feat: load custom integration doc dynamically

* fix: remove old cli way

* fix: toc

* fix: minor formatting

---------

Co-authored-by: Dan Saunders <danjsaund@gmail.com>
This commit is contained in:
NanoCode012
2025-02-25 16:09:37 +07:00
committed by GitHub
parent 1110a37e21
commit 2efe1b4c09
32 changed files with 940 additions and 443 deletions

View File

@@ -1,6 +1,10 @@
# Cut Cross Entropy
### Usage
Cut Cross Entropy reduces VRAM usage through optimization on the cross-entropy operation during loss calculation.
See https://github.com/apple/ml-cross-entropy
## Usage
```yaml
plugins:
@@ -8,3 +12,19 @@ plugins:
cut_cross_entropy: true
```
## Citation
```bib
@article{wijmans2024cut,
author = {Erik Wijmans and
Brody Huval and
Alexander Hertzberg and
Vladlen Koltun and
Philipp Kr\"ahenb\"uhl},
title = {Cut Your Losses in Large-Vocabulary Language Models},
journal = {arXiv},
year = {2024},
url = {https://arxiv.org/abs/2411.09009},
}
```