Commit Graph

313 Commits

Author SHA1 Message Date
Wing Lian
810ecd4e81 add liger to readme (#1865)
* add liger to readme

* updates from PR feedback
2024-08-23 14:34:03 -04:00
Wing Lian
b33dc07a77 rename nightly test and add badge (#1853) 2024-08-22 13:13:33 -04:00
Wing Lian
dcbff16983 run nightly ci builds against upstream main (#1851)
* run nightly ci builds against upstream main

* add test badges

* run the multigpu tests against nightly main builds too
2024-08-22 13:10:54 -04:00
Gal Cohen (galco)
957c956f89 rename jamba example (#1846) [skip ci]
* rename jamba example

* feat: change readme

---------

Co-authored-by: Gal Cohen <galc@ai21.com>
2024-08-22 09:22:55 -04:00
mhenrichsen
3bc8e64557 Update README.md (#1792) 2024-07-30 07:59:53 +02:00
Wing Lian
7830fe04b5 Unsloth rope (#1767)
* Add unsloth rope embeddings support

* support for models weights in 4bit and do some memory gc

* use accelerate logger

* add unsloth llama rms norm optims

* update docs for unsloth

* more docs info
2024-07-18 14:54:41 -04:00
David Meikle
634f384e06 Changed URL for dataset docs (#1744) 2024-07-13 14:34:28 -04:00
mhenrichsen
1194c2e0b1 github urls (#1734)
Co-authored-by: Henrichsen, Mads (ext) <mads.henrichsen.ext@siemens-energy.com>
2024-07-11 09:19:29 -04:00
Saeed Esmaili
5cde06587a Fix the broken link in README (#1678) [skip ci] 2024-06-03 09:38:44 -04:00
Abe Voelker
49b967b62f Fix README quick start example usage model dirs (#1668) 2024-05-28 18:10:40 -04:00
Wing Lian
3319780300 update torch 2.2.1 -> 2.2.2 (#1622) 2024-05-15 09:45:27 -04:00
Chansung Park
5d97e65f95 add dstack section (#1612) [skip ci]
* add dstack section

* chore: lint

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-05-14 08:13:45 -04:00
Atlas
bcaa92325d Update Readme to include support for Mixtral8X22B (#1518) [skip ci] 2024-04-17 01:15:30 -04:00
YTING
7d9bafcb88 Update README.md (#1521) [skip ci] 2024-04-17 01:15:05 -04:00
Wing Lian
e07dcb288c add docs around pre-processing (#1529) 2024-04-16 19:45:46 -04:00
NanoCode012
c2b64e4dcf Feat: update doc (#1475) [skip ci]
* feat: update doc contents

* chore: move batch vs ga docs

* feat: update lambdalabs instructions

* fix: refactor dev instructions
2024-04-04 13:43:40 +09:00
James Melvin Ebenezer
cae608f587 Added pip install ninja to accelerate installation of flash-attn (#1461)
* Added pip install ninja to accelerate installation of flash-attn

* doc: cleanup
2024-04-02 17:36:41 +09:00
Hamel Husain
86b7d22f35 Reorganize Docs (#1468) 2024-04-01 08:00:52 -07:00
Phuc Van Phan
324d59ea0d docs: update link to docs of advance topic in README.md (#1437) 2024-03-24 21:49:27 -07:00
NanoCode012
f1ebaa07c6 chore(config): refactor old mistral config (#1435)
* chore(config): refactor old mistral config

* chore: add link to colab on readme
2024-03-25 12:00:44 +09:00
Hamel Husain
629450cecd Bootstrap Hosted Axolotl Docs w/Quarto (#1429)
* precommit

* mv styes.css

* fix links
2024-03-21 22:28:36 -07:00
Wing Lian
dd449c5cd8 support galore once upstreamed into transformers (#1409)
* support galore once upstreamed into transformers

* update module name for llama in readme and fix typing for all linear

* bump trl for deprecation fixes from newer transformers

* include galore as an extra and install in docker image

* fix optim_args type

* fix optim_args

* update dependencies for galore

* add galore to cicd dockerfile
2024-03-19 09:26:35 -04:00
NanoCode012
40a88e8c4a Feat: Add sharegpt multirole (#1137)
* feat(prompt): support multiple roles for sharegpt

* fix: add handling of empty role back

* feat: rebased and allowed more dynamic roles via config

* fix: variable

* chore: update message

* feat: add vicuna format

* fix: JSON serializable error

* fix: typing

* fix: don't remap for unknown keys

* fix: add roles to pydantic

* feat: add test

* chore: remove leftover print

* chore: remove leftover comment

* chore: remove print

* fix: update test to use chatml
2024-03-19 20:51:49 +09:00
Seungduk Kim
43bdc5d3de Add a config not to shuffle merged dataset (#1394) [skip ci]
* Add a config not to shuffle merged dataset

* Update README.md

* Update src/axolotl/utils/config/models/input/v0_4_1/__init__.py

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* invert the condition name

* update README

* info -> debug

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-03-19 20:51:00 +09:00
NanoCode012
b1e3e1b25f fix(config): passing gradient_checkpoint_kwargs (#1412)
* fix(config): change default use_reentrant to true

* Update trainer_builder.py

* fix: make sure to pass kwargs to enable checkpoint

* chore: lint
2024-03-19 12:57:43 +09:00
jbl
e8c8ea64b3 Update README.md (#1418)
Add Phorm AI Badge
2024-03-17 23:47:46 -04:00
NanoCode012
f083aed2c7 Fix(readme): Improve README QuickStart info (#1408)
* Fix(readme): Improve README QuickStart info

* chore: add to toc
2024-03-16 21:10:22 +09:00
NanoCode012
868c33954d Feat(readme): Add instructions for Google GPU VM instances (#1410) 2024-03-16 21:10:05 +09:00
Hamel Husain
8b12468230 Add QLoRA + FSDP Docs (#1403)
* pre commit

* Update fsdp_qlora.md
2024-03-14 11:04:51 -04:00
Wing Lian
638c2dafb5 JarvisLabs (#1372)
* add Jarvis cloud gpu and sponsorship

* whitespace
2024-03-07 10:47:32 -05:00
Hamel Husain
ed70a08348 add docs for input_output format (#1367) [skip ci]
* add docs

* add docs

* run linter
2024-03-06 09:09:49 -05:00
Nicolas Rojas
37657473c8 Remove unsupported python version 3.9 from README (#1364) [skip ci] 2024-03-05 21:19:36 -05:00
Wing Lian
6b3b271925 fix for protected model_ namespace w pydantic (#1345) 2024-02-28 15:07:49 -05:00
Maxime
0f6af36d50 Mps mistral lora (#1292) [skip ci]
* Lora example for Mistral on MPS backend

* Add some MPS documentation

* Update examples/mistral/lora-mps.yml

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* Update examples/mistral/lora-mps.yml

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* Update README.md

---------

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-02-26 22:39:57 -05:00
JohanWork
d75653407c ADD: push checkpoints to mlflow artifact registry (#1295) [skip ci]
* Add checkpoint logging to mlflow artifact registry

* clean up

* Update README.md

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* update pydantic config from rebase

---------

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-02-26 13:32:39 -05:00
NanoCode012
c6b01e0f4a chore: update readme to be more clear (#1326) [skip ci] 2024-02-26 13:32:13 -05:00
Wing Lian
cc3cebfa70 Pydantic 2.x cfg (#1239)
* WIP conversion to use pydantic for config validation

* wip, more fields, add capabilities

* wip

* update pydantic validation to match existing tests

* tweak requirements

* setup deprecated paams pydantic model

* more validations

* wrap up rest of the validations

* flesh out the rest of the options from the readme into pydantic

* fix model validators as class methods

remember to return in validator
missing return
add missing relora attributes
fix test for DictDefault change
fix sys template for mistral from fastchat change in PR 2872
fix test for batch size warning

* more missing attributes for cfg

* updates from PR feedback

* fix validation for datasets and pretrain datasets

* fix test for lora check
2024-02-26 12:24:14 -05:00
NanoCode012
2ed52bd568 fix(readme): Clarify doc for tokenizer_config (#1323) [skip ci] 2024-02-24 21:55:04 +09:00
NanoCode012
3d2cd804ae fix(readme): update inference md link (#1311) [skip ci] 2024-02-22 02:48:06 +09:00
Leonardo Emili
5a5d47458d Add seq2seq eval benchmark callback (#1274)
* Add CausalLMBenchEvalCallback for measuring seq2seq performance

* Fix code for pre-commit

* Fix typing and improve logging

* eval_sample_packing must be false with CausalLMBenchEvalCallback
2024-02-13 08:24:30 -08:00
김진원
8430db22e2 Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? (#1273) 2024-02-12 21:23:28 -08:00
Wing Lian
4b997c3e1a allow the optimizer prune ratio for ReLoRA to be configurable (#1287)
* allow the optimizer prune ration for relora to be configurable

* update docs for relora

* prevent circular imports
2024-02-12 11:39:51 -08:00
Hamel Husain
b2a4cb4396 Update README.md (#1281) 2024-02-09 07:38:08 -08:00
Hamel Husain
9bca7db133 add support for https remote yamls (#1277) 2024-02-08 20:02:17 -08:00
Hamel Husain
91cf4ee72c allow remote data paths (#1278)
* allow remote data paths

* add docs about public url

* only allow https

* better docs

* better docs
2024-02-08 15:02:35 -08:00
Wing Lian
1daecd161e copy edits (#1276) 2024-02-08 09:00:04 -05:00
Wing Lian
4a654b331e Add link to axolotl cloud image on latitude (#1275) 2024-02-08 08:50:11 -05:00
Wing Lian
411293bdca contributor avatars (#1269) 2024-02-07 07:09:01 -08:00
Wing Lian
dfd188502a add contact info for dedicated support for axolotl [skip ci] (#1243) 2024-02-01 12:59:07 -05:00
Wing Lian
00568c1539 support for true batches with multipack (#1230)
* support for true batches with multipack

* patch the map dataset fetcher to handle batches with packed indexes

* patch 4d mask creation for sdp attention

* better handling for BetterTransformer

* patch general case for 4d mask

* setup forward patch. WIP

* fix patch file

* support for multipack w/o flash attention for llama

* cleanup

* add warning about bf16 vs fp16 for multipack with sdpa

* bugfixes

* add 4d multipack tests, refactor patches

* update tests and add warnings

* fix e2e file check

* skip sdpa test if not at least torch 2.1.1, update docs
2024-02-01 10:18:42 -05:00