Sunny Liu
152e988d3c
llama sdpa patching WIP - static class function import
2025-01-22 21:02:26 -05:00
Sunny Liu
27532825a9
llama sdpa patching WIP - static class function import
2025-01-22 21:00:34 -05:00
Sunny Liu
06f83a54a5
llama sdpa patching WIP - static class function import
2025-01-22 20:45:44 -05:00
Sunny Liu
d7b133dc1f
llama sdpa patching WIP - static class function import
2025-01-22 20:33:13 -05:00
Sunny Liu
f3bec17917
llama sdpa patching WIP - static class function import
2025-01-22 20:25:26 -05:00
Sunny Liu
b7deb5241c
llama sdpa patching WIP
2025-01-22 20:16:27 -05:00
Sunny Liu
cee310dcfa
llama sdpa patching WIP
2025-01-22 20:15:23 -05:00
Sunny Liu
d1be6e228d
llama sdpa patching WIP
2025-01-22 20:14:20 -05:00
Sunny Liu
5f9f77f384
llama patch
2025-01-22 11:29:28 -05:00
Wing Lian
8fb72cbc0b
use the extracted field_messages to parse the role fields (#2265)
2025-01-21 15:39:30 -05:00
Adithya Kamath
bb9d4102c4
Add 5000 line history limit to tmux for docker cloud (#2268)
2025-01-21 15:39:17 -05:00
bursteratom
b2a34380b3
sample packing doc mask creation WIP
2025-01-21 09:18:38 -05:00
Wing Lian
af727eedf7
option to not concatenate during pretraining (#2263)
* option to not concatenate during pretraining
* simplify conditional and add doc to config.qmd
2025-01-20 14:07:34 -05:00
Sunny Liu
80bfc50d1f
get seqlens from position ids for doc masking
2025-01-17 17:22:04 -05:00
Sunny Liu
a5360c172c
llama hijacking
2025-01-17 15:54:03 -05:00
Sunny Liu
013a9b73fc
fix transformers version for testing
2025-01-16 15:32:57 -05:00
Sunny
aad62428e0
not sure if this is necessary actually
2025-01-16 15:08:34 -05:00
Sunny
a6f2c5d583
flex sample packing WIP
2025-01-15 21:12:33 -05:00
jwongTensora
8606093921
fix for indexing error from token/embeddings mismatch (#2257)
Co-authored-by: jwong <jwongTensora@gmail.com>
2025-01-14 22:09:29 -05:00
NanoCode012
cba5a457d9
fix: use text_column even when not packing for pretraining (#2254)
* fix: use text_column even when not packing for pretraining
* feat: update test to check when not packing
* chore: lint
* Update src/axolotl/utils/data/pretraining.py
Co-authored-by: Wing Lian <wing.lian@gmail.com>
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2025-01-14 22:08:56 -05:00
Wing Lian
19cd83d408
rename references to dpo dataset prep to pref data (#2258)
2025-01-14 22:07:55 -05:00
Sunny
dbcd11e533
revert seq len in multipack sampler
2025-01-14 11:45:35 -05:00
Sunny
c06a6be915
flex_attn sample packing WIP
2025-01-14 00:22:05 -05:00
Dan Saunders
1ed4de73b6
CLI cleanup and documentation (#2244)
* CLI init refactor
* fix
* cleanup and (partial) docs
* Adding documentation and continuing cleanup (in progress)
* remove finetune.py script
* continued cleanup and documentation
* pytest fixes
* review comments
* fix
* Fix
* typing fixes
* make sure the batch dataset patcher for multipack is always loaded when handling datasets
* review comments
* fix
---------
Co-authored-by: Dan Saunders <dan@axolotl.ai>
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-01-13 17:55:29 +00:00
Wing Lian
f89e962119
skip over rows in pretraining dataset (#2223)
* skip over rows in pretraining dataset
* update docs
2025-01-13 10:44:45 -05:00
Wing Lian
bc1c9c20e3
assume empty lora dropout means 0.0 and add tests (#2243)
* assume empty lora dropout means 0.0 and add tests
* remove un-necessary arg
* refactor based on pr feedback:
* chore: lint
2025-01-13 10:44:11 -05:00
Wing Lian
dd26cc3c0f
add helper to verify the correct model output file exists (#2245)
* add helper to verify the correct model output file exists
* more checks using helper
* chore: lint
* fix import and relora model check
* workaround for trl trainer saves
* remove stray print
2025-01-13 10:43:29 -05:00
bursteratom
d3a0cb5edb
transformers version
2025-01-13 10:33:00 -05:00
bursteratom
8b47e456b0
revert to transformers 4.47.1
2025-01-13 10:29:27 -05:00
Sunny Liu
2319ac729c
Merge branch 'main' into flx_attn_support
2025-01-13 09:42:58 -05:00
Sunny
f99cae0e7b
llama test
2025-01-12 17:30:19 -05:00
Wing Lian
888cd9407f
use 2.5.1 docker images as latest tag as it seems stable (#2198)
2025-01-12 13:34:17 -05:00
Wing Lian
bd62d6e10a
rename liger test so it properly runs in ci (#2246)
2025-01-12 13:34:17 -05:00
NanoCode012
5eae134110
feat: add support for data_files in pretraining (#2238)
2025-01-12 13:34:17 -05:00
Wing Lian
b7d27bdfa4
update upstream HF deps (#2239)
* bump axolotl contribs for upstream main conflicts:
* bump datasets, tokenizer, trl
* remove log workarounds in trl
* bump lm-eval
* remove unsloth_ import from critical path
* remove llama fa2 from conftest
* unsloth breaks with latest upstream
2025-01-12 13:34:17 -05:00
Vincenzo di Cicco
da97a21bdc
Use SequentialSampler if curriculum_sampling is enabled with sample_packing (#2235)
2025-01-12 13:34:17 -05:00
Wing Lian
e0d4b88598
update modal version for ci (#2242)
2025-01-12 13:34:17 -05:00
NanoCode012
fac059a209
fix: mistral nemo does not recognize token_type_ids in forward (#2233)
2025-01-12 13:34:17 -05:00
Wing Lian
9c9ac1cf0b
add hf cache caching for GHA (#2247)
* add hf cache caching for GHA
* use modal volume to cache hf data
* make sure to update the cache as we add new fixtures in conftest
2025-01-12 13:34:17 -05:00
Wing Lian
2346f21b2b
Merge group queue (#2248)
* add support for merge groups
* also lint merge groups
2025-01-12 13:34:17 -05:00
salman
0b47281f51
Fixing OSX installation (#2231)
* bumping version, removing non-osx compatible deps
* updating pylintrc
* fixing linters
* reverting changes
2025-01-12 13:34:17 -05:00
Wing Lian
d8b4027200
use 2.5.1 docker images as latest tag as it seems stable (#2198)
2025-01-10 08:35:25 -05:00
Wing Lian
fb3352e21c
rename liger test so it properly runs in ci (#2246)
2025-01-09 17:31:43 -05:00
Sunny
543daaf46f
llama test
2025-01-09 16:08:24 -05:00
NanoCode012
ed77e7001e
feat: add support for data_files in pretraining (#2238)
2025-01-09 21:04:13 +00:00
Wing Lian
7669a03fb4
update upstream HF deps (#2239)
* bump axolotl contribs for upstream main conflicts:
* bump datasets, tokenizer, trl
* remove log workarounds in trl
* bump lm-eval
* remove unsloth_ import from critical path
* remove llama fa2 from conftest
* unsloth breaks with latest upstream
2025-01-09 21:01:59 +00:00
Vincenzo di Cicco
6553683170
Use SequentialSampler if curriculum_sampling is enabled with sample_packing (#2235)
2025-01-09 21:01:22 +00:00
Wing Lian
5e0124e2ab
update modal version for ci (#2242)
2025-01-09 21:01:02 +00:00
NanoCode012
2e8d7c1adb
fix: mistral nemo does not recognize token_type_ids in forward (#2233)
2025-01-09 21:00:36 +00:00
Wing Lian
3c1921e400
add hf cache caching for GHA (#2247)
* add hf cache caching for GHA
* use modal volume to cache hf data
* make sure to update the cache as we add new fixtures in conftest
2025-01-09 20:59:54 +00:00