Commit Graph

1850 Commits

Author SHA1 Message Date
Sunny Liu
5ca57cb55a undo bool conversion 2025-01-23 17:56:13 -05:00
Sunny Liu
0149de7fb0 mask to bool 2025-01-23 15:30:08 -05:00
Sunny Liu
8c34c65181 dummy 2025-01-23 14:56:26 -05:00
Sunny Liu
555aa5772a skip mask conversion if already 4d 2025-01-23 14:01:53 -05:00
Sunny Liu
e8b2789086 revert mask expand 2025-01-23 11:20:38 -05:00
Sunny Liu
85752cdfc9 mask expansion 2025-01-22 21:33:38 -05:00
Sunny Liu
f2f23c8041 mask expansion 2025-01-22 21:31:42 -05:00
Sunny Liu
8b3eec7f6e mask expansion 2025-01-22 21:29:52 -05:00
Sunny Liu
bb9bea3110 mask expansion 2025-01-22 21:27:25 -05:00
Sunny Liu
0dd18a3681 llama sdpa patching WIP - static class function import 2025-01-22 21:10:05 -05:00
Sunny Liu
152e988d3c llama sdpa patching WIP - static class function import 2025-01-22 21:02:26 -05:00
Sunny Liu
27532825a9 llama sdpa patching WIP - static class function import 2025-01-22 21:00:34 -05:00
Sunny Liu
06f83a54a5 llama sdpa patching WIP - static class function import 2025-01-22 20:45:44 -05:00
Sunny Liu
d7b133dc1f llama sdpa patching WIP - static class function import 2025-01-22 20:33:13 -05:00
Sunny Liu
f3bec17917 llama sdpa patching WIP - static class function import 2025-01-22 20:25:26 -05:00
Sunny Liu
b7deb5241c llama sdpa patching WIP 2025-01-22 20:16:27 -05:00
Sunny Liu
cee310dcfa llama sdpa patching WIP 2025-01-22 20:15:23 -05:00
Sunny Liu
d1be6e228d llama sdpa patching WIP 2025-01-22 20:14:20 -05:00
Sunny Liu
5f9f77f384 llama patch 2025-01-22 11:29:28 -05:00
bursteratom
b2a34380b3 sample packing doc mask creation WIP 2025-01-21 09:18:38 -05:00
Sunny Liu
80bfc50d1f get seqlens from position ids for foc masking 2025-01-17 17:22:04 -05:00
Sunny Liu
a5360c172c llama hijacking 2025-01-17 15:54:03 -05:00
Sunny Liu
013a9b73fc fix transformers version for testing 2025-01-16 15:32:57 -05:00
Sunny
aad62428e0 not sure if this is necessary actually 2025-01-16 15:08:34 -05:00
Sunny
a6f2c5d583 flex sample packing WIP 2025-01-15 21:12:33 -05:00
Sunny
dbcd11e533 revert seq len in multipack sampler 2025-01-14 11:45:35 -05:00
Sunny
c06a6be915 flex_attn sample packing WIP 2025-01-14 00:22:05 -05:00
bursteratom
d3a0cb5edb transformers version 2025-01-13 10:33:00 -05:00
bursteratom
8b47e456b0 revert to transformers 4.47.1 2025-01-13 10:29:27 -05:00
Sunny Liu
2319ac729c Merge branch 'main' into flx_attn_support 2025-01-13 09:42:58 -05:00
Sunny
f99cae0e7b llama test 2025-01-12 17:30:19 -05:00
Wing Lian
888cd9407f use 2.5.1 docker images as latest tag as it seems stable (#2198) 2025-01-12 13:34:17 -05:00
Wing Lian
bd62d6e10a rename liger test so it properly runs in ci (#2246) 2025-01-12 13:34:17 -05:00
NanoCode012
5eae134110 feat: add support for data_files in pretraining (#2238) 2025-01-12 13:34:17 -05:00
Wing Lian
b7d27bdfa4 update upstream HF deps (#2239)
* bump axolotl contribs for upstream main conflicts:

* bump datasets, tokenizer, trl

* remove log workarounds in trl

* bump lm-eval

* remove unsloth_ import from critical path

* remove llama fa2 from conftest

* unsloth breaks with latest upstream
2025-01-12 13:34:17 -05:00
Vincenzo di Cicco
da97a21bdc Use SequentialSampler if curriculum_sampling is enabled with sample_packing (#2235) 2025-01-12 13:34:17 -05:00
Wing Lian
e0d4b88598 update modal version for ci (#2242) 2025-01-12 13:34:17 -05:00
NanoCode012
fac059a209 fix: mistral nemo does not recognize token_type_ids in forward (#2233) 2025-01-12 13:34:17 -05:00
Wing Lian
9c9ac1cf0b add hf cache caching for GHA (#2247)
* add hf cache caching for GHA

* use modal volume to cache hf data

* make sure to update the cache as we add new fixtures in conftest
2025-01-12 13:34:17 -05:00
Wing Lian
2346f21b2b Merge group queue (#2248)
* add support for merge groups

* also lint merge groups
2025-01-12 13:34:17 -05:00
salman
0b47281f51 Fixing OSX installation (#2231)
* bumping version, removing non-osx compatible deps

* updating pylintrc

* fixing linters

* reverting changes
2025-01-12 13:34:17 -05:00
Wing Lian
d8b4027200 use 2.5.1 docker images as latest tag as it seems stable (#2198) 2025-01-10 08:35:25 -05:00
Wing Lian
fb3352e21c rename liger test so it properly runs in ci (#2246) 2025-01-09 17:31:43 -05:00
Sunny
543daaf46f llama test 2025-01-09 16:08:24 -05:00
NanoCode012
ed77e7001e feat: add support for data_files in pretraining (#2238) 2025-01-09 21:04:13 +00:00
Wing Lian
7669a03fb4 update upstream HF deps (#2239)
* bump axolotl contribs for upstream main conflicts:

* bump datasets, tokenizer, trl

* remove log workarounds in trl

* bump lm-eval

* remove unsloth_ import from critical path

* remove llama fa2 from conftest

* unsloth breaks with latest upstream
2025-01-09 21:01:59 +00:00
Vincenzo di Cicco
6553683170 Use SequentialSampler if curriculum_sampling is enabled with sample_packing (#2235) 2025-01-09 21:01:22 +00:00
Wing Lian
5e0124e2ab update modal version for ci (#2242) 2025-01-09 21:01:02 +00:00
NanoCode012
2e8d7c1adb fix: mistral nemo does not recognize token_type_ids in forward (#2233) 2025-01-09 21:00:36 +00:00
Wing Lian
3c1921e400 add hf cache caching for GHA (#2247)
* add hf cache caching for GHA

* use modal volume to cache hf data

* make sure to update the cache as we add new fixtures in conftest
2025-01-09 20:59:54 +00:00