Dan Saunders
d187f1f8e2
using field validator instead of model validator
2025-03-21 16:36:17 +00:00
Dan Saunders
1cced52719
rename file, delete another
2025-03-21 16:36:17 +00:00
Dan Saunders
11321b17e7
removing flash-attn from requirements.txt (in setup.py extras already)
2025-03-21 16:36:17 +00:00
Wing Lian
7a1a211c99
move ring flash attn to extras with flash-attn (#2414)
2025-03-21 16:36:17 +00:00
Dan Saunders
e1a02a32b5
fix
2025-03-21 16:36:17 +00:00
Dan Saunders
a6ef6c7764
fix
2025-03-21 16:36:17 +00:00
Dan Saunders
cb3a9e99a3
gracefully handle no ring-flash-attn
2025-03-21 16:36:17 +00:00
Dan Saunders
3ae47ec7de
actually isolate CLI tests
2025-03-21 16:36:17 +00:00
Dan Saunders
e36dc763ab
isolate cli tests
2025-03-21 16:36:17 +00:00
Dan Saunders
03027cf6bf
pernicious Fire CLI bugfix
2025-03-21 16:36:16 +00:00
Dan Saunders
0ade60d455
another import scoping change
2025-03-21 16:35:56 +00:00
Dan Saunders
02e1a42f04
scoping down problematic import
2025-03-21 16:35:56 +00:00
Dan Saunders
919b88f11b
update config.qmd and rename option
2025-03-21 16:35:55 +00:00
Dan Saunders
345a9dd831
removing some obvious comments
2025-03-21 16:35:38 +00:00
Dan Saunders
4ff97bc9d4
eval dataloader and sampler changes
2025-03-21 16:35:38 +00:00
Dan Saunders
d0e178d52f
remove debug logs and simplify
2025-03-21 16:35:38 +00:00
Dan Saunders
5731cdc0cf
fixing sample packing
2025-03-21 16:35:38 +00:00
Dan Saunders
b7738d57c4
working multi-group SP
2025-03-21 16:35:38 +00:00
Dan Saunders
698e599bf7
precommit fixes
2025-03-21 16:35:38 +00:00
Dan Saunders
1d339e4007
fixes
2025-03-21 16:35:38 +00:00
Dan Saunders
4190ad0647
updates
2025-03-21 16:35:36 +00:00
Dan Saunders
b44a207248
update
2025-03-21 16:35:10 +00:00
Dan Saunders
51c326150b
pytest
2025-03-21 16:35:10 +00:00
Dan Saunders
14baaf6e0a
updates
2025-03-21 16:35:10 +00:00
Dan Saunders
f487910444
removing unused code
2025-03-21 16:35:08 +00:00
Dan Saunders
c5071dfd8a
fix req
2025-03-21 16:34:12 +00:00
Dan Saunders
e323145ba9
remove errant file
2025-03-21 16:34:12 +00:00
Dan Saunders
7efc787ac8
cleanup
2025-03-21 16:34:12 +00:00
Dan Saunders
dce61cdab1
progress on ring attn impl
2025-03-21 16:34:12 +00:00
Dan Saunders
bd952de9d2
progress on ring attn impl
2025-03-21 16:34:10 +00:00
Dan Saunders
3f8a43cab6
adding easy_context as integration for now
2025-03-21 16:33:46 +00:00
Dan Saunders
113e9cd193
Autodoc generation with quartodoc (#2419)
* quartodoc integration
* quartodoc progress
* deletions
* Update docs/.gitignore to exclude auto-generated API documentation files
* Fix
* more autodoc progress
* moving reference up near the top of the sidebar
* fix broken link
* update to reflect recent changes
* pydantic models refactor + add to autodoc + fixes
* fix
* shrinking header sizes
* fix accidental change
* include quartodoc build step
* update pre-commit version
* update pylint
* pre-commit
---------
Co-authored-by: Dan Saunders <dan@axolotl.ai>
2025-03-21 12:26:47 -04:00
NanoCode012
61825a464a
chore(doc): add explanation on fsdp_transformer_layer_cls_to_wrap (#2429) [skip ci]
2025-03-21 11:59:22 -04:00
Dan Saunders
c907ac173e
adding pre-commit auto-update GH action and bumping plugin versions (#2428)
* adding pre-commit auto-update GH action and bumping plugin versions
* running updated pre-commit plugins
* sorry to revert, but pylint complained
* Update .pre-commit-config.yaml
Co-authored-by: Wing Lian <wing.lian@gmail.com>
---------
Co-authored-by: Dan Saunders <dan@axolotl.ai>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2025-03-21 11:02:43 -04:00
salman
187227d837
Fixing KTO+QLoRA+multi-GPU (#2420)
* WIP
* removing artifacts
* adding error
* adding adapter check
* linting
* simplifying check
* linting v2
* config fix -___-
2025-03-21 10:18:28 -04:00
NanoCode012
f8de8bb4f2
chore(doc): add instructions on adding custom integrations (#2422) [skip ci]
* chore(doc): add instructions on adding custom integrations
* chore: add warning help
* feat: add note about integration path
* fix: adjust text per suggestion
2025-03-21 10:18:01 -04:00
hugo
8e604848a4
add run on novita ai (#2421) [skip ci]
* add run on novita ai
* Revert "add run on novita ai"
This reverts commit 4d5df1ac6b.
* add run axolotl on novita ai
2025-03-21 10:17:47 -04:00
Wing Lian
aae4337f40
add 12.8.1 cuda to the base matrix (#2426)
* add 12.8.1 cuda to the base matrix
* use nightly
* bump deepspeed and set no binary
* deepspeed binary fixes hopefully
* install deepspeed by itself
* multiline fix
* make sure ninja is installed
* try with reversion of packaging/setuptools/wheel install
* use license instead of license-file
* try rolling back packaging and setuptools versions
* comment out license for validation for now
* make sure packaging version is consistent
* more parity across tests and docker images for packaging/setuptools
2025-03-21 10:17:25 -04:00
Wing Lian
38df5a36ea
bump HF versions except for trl (#2427)
2025-03-20 10:22:05 -04:00
Wing Lian
4d92a68a96
use default torch fused adamw optimizer as default as adamw_hf is deprecated (#2425)
* use default torch fused adamw optimizer as default as adamw_hf is deprecated
* make sure to have latest packaging installed
* bump packaging in requirements.txt too
2025-03-19 23:58:33 -04:00
SicariusSicariiStuff
85147ec430
Update README.md (#2360)
* Update README.md
wheel is needed
* feat: add ninja, setuptools, packing to installation steps
* fix: add missing instruction
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-03-17 08:39:17 -04:00
NanoCode012
51cd409488
Feat: minor docs improvements for RLHF and faq on embeddings (#2401) [skip ci]
* feat: add doc on shrink_embeddings and custom calling
* chore: rename inference doc
* fix: clarify same config is used for all cli
* chore: rearrange order inference qmd
* feat: add simpo to doc
* fix: update defaults
* feat: add rl configs to doc
* fix: ensure beta consistent with trl.beta
* fix: clarify about lora/fft
* chore: rename title
* chore: fix language
* feat: move config reference higher
* Update docs/getting-started.qmd
Co-authored-by: salman <salman.mohammadi@outlook.com>
* Update docs/rlhf.qmd
Co-authored-by: salman <salman.mohammadi@outlook.com>
---------
Co-authored-by: salman <salman.mohammadi@outlook.com>
2025-03-17 08:39:04 -04:00
NanoCode012
7235123d44
chore(docs): add cookbook/blog link to docs (#2410) [skip ci]
2025-03-17 08:38:19 -04:00
Wing Lian
4f5eb42a73
remove reference to deprecated import (#2407)
2025-03-15 08:49:41 -04:00
Wing Lian
fbe54be6b8
only validate hf user token on rank 0 (#2408)
2025-03-13 23:29:06 -04:00
Wing Lian
04f6324833
build cloud images with torch 2.6.0 (#2413)
* build cloud images with torch 2.6.0
* nightlies too
2025-03-13 23:28:51 -04:00
Wing Lian
f0072f3b9d
use max of 32 dataset processes if not explicit (#2403)
* use max of 32 dataset processes if not explicit
* change alternate min val for consistency
2025-03-11 12:02:58 -04:00
Wing Lian
59899b9817
pass additional info for fix untrained tokens when using distributed + offloading (#2388)
* pass additional info for fix untrained tokens when using distributed + offloading
* use latest version of vendored lib
* use v0.0.5 of contribs lgpl
* fix for no bad tokens and add tests
* use release
* add multigpu test too
* make sure the multigpu zero3 test actually uses zero3
2025-03-11 12:02:43 -04:00
NanoCode012
4a736986fa
fix(modal): add git pull when getting branch files (#2399)
2025-03-10 15:14:41 -04:00
Wing Lian
5d0f110a3b
include iproute2 and nvtop in cloud image (#2393)
2025-03-10 15:13:38 -04:00