Sunny Liu
|
5ca57cb55a
|
undo bool conversion
|
2025-01-23 17:56:13 -05:00 |
|
Sunny Liu
|
0149de7fb0
|
mask to bool
|
2025-01-23 15:30:08 -05:00 |
|
Sunny Liu
|
8c34c65181
|
dummy
|
2025-01-23 14:56:26 -05:00 |
|
Sunny Liu
|
555aa5772a
|
skip mask conversion if already 4d
|
2025-01-23 14:01:53 -05:00 |
|
Sunny Liu
|
e8b2789086
|
revert mask expand
|
2025-01-23 11:20:38 -05:00 |
|
Sunny Liu
|
85752cdfc9
|
mask expansion
|
2025-01-22 21:33:38 -05:00 |
|
Sunny Liu
|
f2f23c8041
|
mask expansion
|
2025-01-22 21:31:42 -05:00 |
|
Sunny Liu
|
8b3eec7f6e
|
mask expansion
|
2025-01-22 21:29:52 -05:00 |
|
Sunny Liu
|
bb9bea3110
|
mask expansion
|
2025-01-22 21:27:25 -05:00 |
|
Sunny Liu
|
0dd18a3681
|
llama sdpa patching WIP - static class function import
|
2025-01-22 21:10:05 -05:00 |
|
Sunny Liu
|
152e988d3c
|
llama sdpa patching WIP - static class function import
|
2025-01-22 21:02:26 -05:00 |
|
Sunny Liu
|
27532825a9
|
llama sdpa patching WIP - static class function import
|
2025-01-22 21:00:34 -05:00 |
|
Sunny Liu
|
06f83a54a5
|
llama sdpa patching WIP - static class function import
|
2025-01-22 20:45:44 -05:00 |
|
Sunny Liu
|
d7b133dc1f
|
llama sdpa patching WIP - static class function import
|
2025-01-22 20:33:13 -05:00 |
|
Sunny Liu
|
f3bec17917
|
llama sdpa patching WIP - static class function import
|
2025-01-22 20:25:26 -05:00 |
|
Sunny Liu
|
b7deb5241c
|
llama sdpa patching WIP
|
2025-01-22 20:16:27 -05:00 |
|
Sunny Liu
|
cee310dcfa
|
llama sdpa patching WIP
|
2025-01-22 20:15:23 -05:00 |
|
Sunny Liu
|
d1be6e228d
|
llama sdpa patching WIP
|
2025-01-22 20:14:20 -05:00 |
|
Sunny Liu
|
5f9f77f384
|
llama patch
|
2025-01-22 11:29:28 -05:00 |
|
bursteratom
|
b2a34380b3
|
sample packing doc mask creation WIP
|
2025-01-21 09:18:38 -05:00 |
|
Sunny Liu
|
80bfc50d1f
|
get seqlens from position ids for doc masking
|
2025-01-17 17:22:04 -05:00 |
|
Sunny Liu
|
a5360c172c
|
llama hijacking
|
2025-01-17 15:54:03 -05:00 |
|
Sunny Liu
|
013a9b73fc
|
fix transformers version for testing
|
2025-01-16 15:32:57 -05:00 |
|
Sunny
|
aad62428e0
|
not sure if this is necessary actually
|
2025-01-16 15:08:34 -05:00 |
|
Sunny
|
a6f2c5d583
|
flex sample packing WIP
|
2025-01-15 21:12:33 -05:00 |
|
Sunny
|
dbcd11e533
|
revert seq len in multipack sampler
|
2025-01-14 11:45:35 -05:00 |
|
Sunny
|
c06a6be915
|
flex_attn sample packing WIP
|
2025-01-14 00:22:05 -05:00 |
|
bursteratom
|
d3a0cb5edb
|
transformers version
|
2025-01-13 10:33:00 -05:00 |
|
bursteratom
|
8b47e456b0
|
revert to transformers 4.47.1
|
2025-01-13 10:29:27 -05:00 |
|
Sunny Liu
|
2319ac729c
|
Merge branch 'main' into flx_attn_support
|
2025-01-13 09:42:58 -05:00 |
|
Sunny
|
f99cae0e7b
|
llama test
|
2025-01-12 17:30:19 -05:00 |
|
Wing Lian
|
888cd9407f
|
use 2.5.1 docker images as latest tag as it seems stable (#2198)
|
2025-01-12 13:34:17 -05:00 |
|
Wing Lian
|
bd62d6e10a
|
rename liger test so it properly runs in ci (#2246)
|
2025-01-12 13:34:17 -05:00 |
|
NanoCode012
|
5eae134110
|
feat: add support for data_files in pretraining (#2238)
|
2025-01-12 13:34:17 -05:00 |
|
Wing Lian
|
b7d27bdfa4
|
update upstream HF deps (#2239)
* bump axolotl contribs for upstream main conflicts:
* bump datasets, tokenizer, trl
* remove log workarounds in trl
* bump lm-eval
* remove unsloth_ import from critical path
* remove llama fa2 from conftest
* unsloth breaks with latest upstream
|
2025-01-12 13:34:17 -05:00 |
|
Vincenzo di Cicco
|
da97a21bdc
|
Use SequentialSampler if curriculum_sampling is enabled with sample_packing (#2235)
|
2025-01-12 13:34:17 -05:00 |
|
Wing Lian
|
e0d4b88598
|
update modal version for ci (#2242)
|
2025-01-12 13:34:17 -05:00 |
|
NanoCode012
|
fac059a209
|
fix: mistral nemo does not recognize token_type_ids in forward (#2233)
|
2025-01-12 13:34:17 -05:00 |
|
Wing Lian
|
9c9ac1cf0b
|
add hf cache caching for GHA (#2247)
* add hf cache caching for GHA
* use modal volume to cache hf data
* make sure to update the cache as we add new fixtures in conftest
|
2025-01-12 13:34:17 -05:00 |
|
Wing Lian
|
2346f21b2b
|
Merge group queue (#2248)
* add support for merge groups
* also lint merge groups
|
2025-01-12 13:34:17 -05:00 |
|
salman
|
0b47281f51
|
Fixing OSX installation (#2231)
* bumping version, removing non-osx compatible deps
* updating pylintrc
* fixing linters
* reverting changes
|
2025-01-12 13:34:17 -05:00 |
|
Wing Lian
|
d8b4027200
|
use 2.5.1 docker images as latest tag as it seems stable (#2198)
|
2025-01-10 08:35:25 -05:00 |
|
Wing Lian
|
fb3352e21c
|
rename liger test so it properly runs in ci (#2246)
|
2025-01-09 17:31:43 -05:00 |
|
Sunny
|
543daaf46f
|
llama test
|
2025-01-09 16:08:24 -05:00 |
|
NanoCode012
|
ed77e7001e
|
feat: add support for data_files in pretraining (#2238)
|
2025-01-09 21:04:13 +00:00 |
|
Wing Lian
|
7669a03fb4
|
update upstream HF deps (#2239)
* bump axolotl contribs for upstream main conflicts:
* bump datasets, tokenizer, trl
* remove log workarounds in trl
* bump lm-eval
* remove unsloth_ import from critical path
* remove llama fa2 from conftest
* unsloth breaks with latest upstream
|
2025-01-09 21:01:59 +00:00 |
|
Vincenzo di Cicco
|
6553683170
|
Use SequentialSampler if curriculum_sampling is enabled with sample_packing (#2235)
|
2025-01-09 21:01:22 +00:00 |
|
Wing Lian
|
5e0124e2ab
|
update modal version for ci (#2242)
|
2025-01-09 21:01:02 +00:00 |
|
NanoCode012
|
2e8d7c1adb
|
fix: mistral nemo does not recognize token_type_ids in forward (#2233)
|
2025-01-09 21:00:36 +00:00 |
|
Wing Lian
|
3c1921e400
|
add hf cache caching for GHA (#2247)
* add hf cache caching for GHA
* use modal volume to cache hf data
* make sure to update the cache as we add new fixtures in conftest
|
2025-01-09 20:59:54 +00:00 |
|