Byron Hsu
e3a38450de
Add liger kernel to features ( #1881 ) [skip ci]
2024-08-29 08:19:18 -04:00
Wing Lian
810ecd4e81
add liger to readme ( #1865 )
...
* add liger to readme
* updates from PR feedback
2024-08-23 14:34:03 -04:00
Wing Lian
b33dc07a77
rename nightly test and add badge ( #1853 )
2024-08-22 13:13:33 -04:00
Wing Lian
dcbff16983
run nightly ci builds against upstream main ( #1851 )
...
* run nightly ci builds against upstream main
* add test badges
* run the multigpu tests against nightly main builds too
2024-08-22 13:10:54 -04:00
Gal Cohen (galco)
957c956f89
rename jamba example ( #1846 ) [skip ci]
...
* rename jamba example
* feat: change readme
---------
Co-authored-by: Gal Cohen <galc@ai21.com>
2024-08-22 09:22:55 -04:00
mhenrichsen
3bc8e64557
Update README.md ( #1792 )
2024-07-30 07:59:53 +02:00
Wing Lian
7830fe04b5
Unsloth rope ( #1767 )
...
* Add unsloth rope embeddings support
* support for models weights in 4bit and do some memory gc
* use accelerate logger
* add unsloth llama rms norm optims
* update docs for unsloth
* more docs info
2024-07-18 14:54:41 -04:00
David Meikle
634f384e06
Changed URL for dataset docs ( #1744 )
2024-07-13 14:34:28 -04:00
mhenrichsen
1194c2e0b1
github urls ( #1734 )
...
Co-authored-by: Henrichsen, Mads (ext) <mads.henrichsen.ext@siemens-energy.com>
2024-07-11 09:19:29 -04:00
Saeed Esmaili
5cde06587a
Fix the broken link in README ( #1678 ) [skip ci]
2024-06-03 09:38:44 -04:00
Abe Voelker
49b967b62f
Fix README quick start example usage model dirs ( #1668 )
2024-05-28 18:10:40 -04:00
Wing Lian
3319780300
update torch 2.2.1 -> 2.2.2 ( #1622 )
2024-05-15 09:45:27 -04:00
Chansung Park
5d97e65f95
add dstack section ( #1612 ) [skip ci]
...
* add dstack section
* chore: lint
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-05-14 08:13:45 -04:00
Atlas
bcaa92325d
Update Readme to include support for Mixtral8X22B ( #1518 ) [skip ci]
2024-04-17 01:15:30 -04:00
YTING
7d9bafcb88
Update README.md ( #1521 ) [skip ci]
2024-04-17 01:15:05 -04:00
Wing Lian
e07dcb288c
add docs around pre-processing ( #1529 )
2024-04-16 19:45:46 -04:00
NanoCode012
c2b64e4dcf
Feat: update doc ( #1475 ) [skip ci]
...
* feat: update doc contents
* chore: move batch vs ga docs
* feat: update lambdalabs instructions
* fix: refactor dev instructions
2024-04-04 13:43:40 +09:00
James Melvin Ebenezer
cae608f587
Added pip install ninja to accelerate installation of flash-attn ( #1461 )
...
* Added pip install ninja to accelerate installation of flash-attn
* doc: cleanup
2024-04-02 17:36:41 +09:00
Hamel Husain
86b7d22f35
Reorganize Docs ( #1468 )
2024-04-01 08:00:52 -07:00
Phuc Van Phan
324d59ea0d
docs: update link to docs of advance topic in README.md ( #1437 )
2024-03-24 21:49:27 -07:00
NanoCode012
f1ebaa07c6
chore(config): refactor old mistral config ( #1435 )
...
* chore(config): refactor old mistral config
* chore: add link to colab on readme
2024-03-25 12:00:44 +09:00
Hamel Husain
629450cecd
Bootstrap Hosted Axolotl Docs w/Quarto ( #1429 )
...
* precommit
* mv styles.css
* fix links
2024-03-21 22:28:36 -07:00
Wing Lian
dd449c5cd8
support galore once upstreamed into transformers ( #1409 )
...
* support galore once upstreamed into transformers
* update module name for llama in readme and fix typing for all linear
* bump trl for deprecation fixes from newer transformers
* include galore as an extra and install in docker image
* fix optim_args type
* fix optim_args
* update dependencies for galore
* add galore to cicd dockerfile
2024-03-19 09:26:35 -04:00
NanoCode012
40a88e8c4a
Feat: Add sharegpt multirole ( #1137 )
...
* feat(prompt): support multiple roles for sharegpt
* fix: add handling of empty role back
* feat: rebased and allowed more dynamic roles via config
* fix: variable
* chore: update message
* feat: add vicuna format
* fix: JSON serializable error
* fix: typing
* fix: don't remap for unknown keys
* fix: add roles to pydantic
* feat: add test
* chore: remove leftover print
* chore: remove leftover comment
* chore: remove print
* fix: update test to use chatml
2024-03-19 20:51:49 +09:00
Seungduk Kim
43bdc5d3de
Add a config not to shuffle merged dataset ( #1394 ) [skip ci]
...
* Add a config not to shuffle merged dataset
* Update README.md
* Update src/axolotl/utils/config/models/input/v0_4_1/__init__.py
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* invert the condition name
* update README
* info -> debug
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-03-19 20:51:00 +09:00
NanoCode012
b1e3e1b25f
fix(config): passing gradient_checkpoint_kwargs ( #1412 )
...
* fix(config): change default use_reentrant to true
* Update trainer_builder.py
* fix: make sure to pass kwargs to enable checkpoint
* chore: lint
2024-03-19 12:57:43 +09:00
jbl
e8c8ea64b3
Update README.md ( #1418 )
...
Add Phorm AI Badge
2024-03-17 23:47:46 -04:00
NanoCode012
f083aed2c7
Fix(readme): Improve README QuickStart info ( #1408 )
...
* Fix(readme): Improve README QuickStart info
* chore: add to toc
2024-03-16 21:10:22 +09:00
NanoCode012
868c33954d
Feat(readme): Add instructions for Google GPU VM instances ( #1410 )
2024-03-16 21:10:05 +09:00
Hamel Husain
8b12468230
Add QLoRA + FSDP Docs ( #1403 )
...
* pre commit
* Update fsdp_qlora.md
2024-03-14 11:04:51 -04:00
Wing Lian
638c2dafb5
JarvisLabs ( #1372 )
...
* add Jarvis cloud gpu and sponsorship
* whitespace
2024-03-07 10:47:32 -05:00
Hamel Husain
ed70a08348
add docs for input_output format ( #1367 ) [skip ci]
...
* add docs
* add docs
* run linter
2024-03-06 09:09:49 -05:00
Nicolas Rojas
37657473c8
Remove unsupported python version 3.9 from README ( #1364 ) [skip ci]
2024-03-05 21:19:36 -05:00
Wing Lian
6b3b271925
fix for protected model_ namespace w pydantic ( #1345 )
2024-02-28 15:07:49 -05:00
Maxime
0f6af36d50
Mps mistral lora ( #1292 ) [skip ci]
...
* Lora example for Mistral on MPS backend
* Add some MPS documentation
* Update examples/mistral/lora-mps.yml
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update examples/mistral/lora-mps.yml
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update README.md
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-02-26 22:39:57 -05:00
JohanWork
d75653407c
ADD: push checkpoints to mlflow artifact registry ( #1295 ) [skip ci]
...
* Add checkpoint logging to mlflow artifact registry
* clean up
* Update README.md
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* update pydantic config from rebase
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-02-26 13:32:39 -05:00
NanoCode012
c6b01e0f4a
chore: update readme to be more clear ( #1326 ) [skip ci]
2024-02-26 13:32:13 -05:00
Wing Lian
cc3cebfa70
Pydantic 2.x cfg ( #1239 )
...
* WIP conversion to use pydantic for config validation
* wip, more fields, add capabilities
* wip
* update pydantic validation to match existing tests
* tweak requirements
* setup deprecated params pydantic model
* more validations
* wrap up rest of the validations
* flesh out the rest of the options from the readme into pydantic
* fix model validators as class methods
remember to return in validator
missing return
add missing relora attributes
fix test for DictDefault change
fix sys template for mistral from fastchat change in PR 2872
fix test for batch size warning
* more missing attributes for cfg
* updates from PR feedback
* fix validation for datasets and pretrain datasets
* fix test for lora check
2024-02-26 12:24:14 -05:00
NanoCode012
2ed52bd568
fix(readme): Clarify doc for tokenizer_config ( #1323 ) [skip ci]
2024-02-24 21:55:04 +09:00
NanoCode012
3d2cd804ae
fix(readme): update inference md link ( #1311 ) [skip ci]
2024-02-22 02:48:06 +09:00
Leonardo Emili
5a5d47458d
Add seq2seq eval benchmark callback ( #1274 )
...
* Add CausalLMBenchEvalCallback for measuring seq2seq performance
* Fix code for pre-commit
* Fix typing and improve logging
* eval_sample_packing must be false with CausalLMBenchEvalCallback
2024-02-13 08:24:30 -08:00
김진원
8430db22e2
Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? ( #1273 )
2024-02-12 21:23:28 -08:00
Wing Lian
4b997c3e1a
allow the optimizer prune ratio for ReLoRA to be configurable ( #1287 )
...
* allow the optimizer prune ratio for relora to be configurable
* update docs for relora
* prevent circular imports
2024-02-12 11:39:51 -08:00
Hamel Husain
b2a4cb4396
Update README.md ( #1281 )
2024-02-09 07:38:08 -08:00
Hamel Husain
9bca7db133
add support for https remote yamls ( #1277 )
2024-02-08 20:02:17 -08:00
Hamel Husain
91cf4ee72c
allow remote data paths ( #1278 )
...
* allow remote data paths
* add docs about public url
* only allow https
* better docs
* better docs
2024-02-08 15:02:35 -08:00
Wing Lian
1daecd161e
copy edits ( #1276 )
2024-02-08 09:00:04 -05:00
Wing Lian
4a654b331e
Add link to axolotl cloud image on latitude ( #1275 )
2024-02-08 08:50:11 -05:00
Wing Lian
411293bdca
contributor avatars ( #1269 )
2024-02-07 07:09:01 -08:00
Wing Lian
dfd188502a
add contact info for dedicated support for axolotl [skip ci] ( #1243 )
2024-02-01 12:59:07 -05:00