NanoCode012
ef942b6efc
fix: rename var after merge
2024-10-11 12:30:43 +07:00
NanoCode012
3c6a6c61be
Merge branch 'main' into cj_tokenizer_default_prompt_template
2024-10-11 12:29:34 +07:00
NanoCode012
7b4b665e99
chore: skip duplicate
2024-10-11 11:42:36 +07:00
NanoCode012
21326e4ef3
chore: lint
2024-10-11 11:40:42 +07:00
NanoCode012
de23dab4fc
fix: config being dropped and unittest to catch that
2024-10-11 11:40:32 +07:00
NanoCode012
e3efa29cf5
fix: test
2024-10-11 11:11:19 +07:00
Wing Lian
2fbc6b0c64
Axo logo new (#1956)
...
* update axolotl ascii art
* spacing for logo
* cleanup dithering
* cleanup ascii logo a bit
2024-10-10 15:57:37 -04:00
Wing Lian
8159cbd1ab
lm_eval harness post train (#1926)
...
* wip, lm_eval harness post train
* include latex parser
* add dtype and doc
* add validation when doing bench evals
* automatically add test dataset when doing benches
2024-10-10 15:04:17 -04:00
NanoCode012
2038255052
Merge branch 'main' into cj_tokenizer_default_prompt_template
2024-10-10 20:25:37 +07:00
pandora
979534c851
add mistral templates (#1927)
...
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-10-10 09:22:53 -04:00
NanoCode012
dab2590e4d
chore: refactor
2024-10-10 18:07:00 +07:00
NanoCode012
e5162b7a41
chore: added example for non-default template
2024-10-10 18:04:33 +07:00
NanoCode012
b6321d2220
chore: clarify doc
2024-10-10 18:01:33 +07:00
NanoCode012
6b3cdfdb8e
feat(doc): updated config with chat template options and clarified examples
2024-10-10 17:57:11 +07:00
NanoCode012
203ae28704
fix: refactor artifact left from main merge
2024-10-10 17:16:41 +07:00
NanoCode012
ed3a33c9fb
fix: re-arrange enum declaration position
2024-10-10 16:18:15 +07:00
NanoCode012
f61e2fc7dc
chore: remove redundant function
2024-10-10 16:15:15 +07:00
NanoCode012
b8056d04d9
Merge branch 'main' into cj_tokenizer_default_prompt_template
2024-10-10 16:11:07 +07:00
NanoCode012
88658c0570
fix: set default to tokenizer template
2024-10-10 15:38:19 +07:00
Boris Feld
6d3caadf90
Comet integration (#1939)
...
* Add first version of a Comet integration
* Remove debug prints
* Add test for Comet Configuration transformation to env variables
* Fix last lint warning
* Update Readme for Comet logging documentation
* Update Comet integration to be optional, update code and tests
* Add documentation for Comet configuration
* Add missing check
2024-10-09 16:03:37 -04:00
aarush gupta
dee77232fe
fix type annotations (#1941) [skip ci]
2024-10-09 16:03:16 -04:00
NanoCode012
a560593b1d
fix(log): update perplexity log to clarify from eval split (#1952) [skip ci]
2024-10-09 16:02:32 -04:00
Wing Lian
e8d3da0081
upgrade pytorch from 2.4.0 => 2.4.1 (#1950)
...
* upgrade pytorch from 2.4.0 => 2.4.1
* update xformers for updated pytorch version
* handle xformers version case for torch==2.3.1
2024-10-09 11:53:56 -04:00
Wing Lian
4ca0a47cfb
add 2.4.1 to base models (#1953)
2024-10-09 08:43:11 -04:00
Wing Lian
e1915f5625
Multimodal Vision Llama - rudimentary support (#1940)
...
---------
Co-authored-by: Sunny <sunny@Sunnys-MacBook-Air.local>
Co-authored-by: sunny <sunnyliu19981005@gmail.com>
2024-10-02 21:02:48 -04:00
Wing Lian
844331005c
bump transformers to 4.45.1 (#1936)
2024-09-30 13:56:12 -04:00
Wing Lian
61aa291119
fix for empty lora+ lr embedding (#1932)
2024-09-27 15:58:35 -04:00
Wing Lian
b98d7d7098
update upstream deps versions and replace lora+ (#1928)
...
* update upstream deps versions and replace lora+
* typo transformers version
2024-09-26 11:33:41 -04:00
Wing Lian
d7eea2ff34
validation fixes 20240923 (#1925)
...
* validation fixes 20240923
* fix run name for wandb and defaults for chat template fields
* fix gradio inference with llama chat template
2024-09-24 14:05:58 -04:00
Keith Stevens
7b9f669a3a
Trigger the original tokenization behavior when no advanced turn settings are provided (#1915)
2024-09-14 08:22:54 -04:00
Wing Lian
5c42f11411
remove dynamic module loader monkeypatch as this was fixed upstream (#1914)
2024-09-13 22:19:54 -04:00
Chirag Jain
260ca97f2c
Merge branch 'main' into cj_tokenizer_default_prompt_template
2024-09-13 00:33:49 +05:30
Wing Lian
3853ab7ae9
bump accelerate to 0.34.2 (#1901)
...
* bump accelerate
* add fixture to predownload the test model
* change fixture
2024-09-07 14:39:31 -04:00
Wing Lian
6e354682e3
fix zero3 integration (#1897)
...
* fix zero3 integration
* bump transformers and accelerate too
2024-09-05 10:58:50 -04:00
Alpay Ariyak
ab461d83c4
Fix documentation for pre-tokenized dataset (#1894)
...
It's currently asking to not add BOS and EOS, stating that Axolotl adds them, but this is not true
2024-09-05 23:11:31 +09:00
Wing Lian
93b769a979
lint fix and update gha regex (#1899)
2024-09-05 09:58:21 -04:00
Tijmen de Haan
f18f4268b5
Docs for AMD-based HPC systems (#1891)
...
* Add documentation for installing on AMD-based HPC systems.
* Accept suggestion to add note about deepspeed
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update _quarto.yml with amd_hpc doc
---------
Co-authored-by: Tijmen de Haan <tijmen.dehaan@gmail.com>
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2024-09-05 18:33:19 +09:00
Wing Lian
dca1fe47d4
fix optimizer + fsdp combination in example (#1893)
2024-09-04 11:28:47 -04:00
Wing Lian
4e5400c732
support for auto_find_batch_size when packing (#1885)
...
* support for auto_find_batch_size when packing
* make sure to return data from validation
* make sure to return data from validation
* actually expose multipack_real_batches in the config
* calculate gathered efficiency in sampler
* tweak to fix auto find and use actual sampler len for multipack
* uncomment
* use args for bsz when not available from auto find
2024-09-03 20:02:44 -04:00
Wing Lian
0aeb277456
add e2e smoke tests for llama liger integration (#1884)
...
* add e2e smoke tests for llama liger integration
* fix import
* don't use __main__ for test
* consolidate line
2024-09-01 19:29:37 -04:00
Chiwan Park
bdab3ec587
Fix RMSNorm monkey patch for Gemma models (#1886)
2024-09-01 18:34:24 -04:00
Wing Lian
3c6b9eda2e
run pytests with varied pytorch versions too (#1883)
2024-08-31 22:49:35 -04:00
DocShotgun
15408d0f09
Update supported models for Liger Kernel (#1875)
...
* Update supported models for Liger Kernel
Add Mistral LCE, Gemma LCE, Gemma 2 without LCE (softcapping is not yet implemented for Gemma in Liger Kernel LCE forward), Phi3 without LCE
* move import to their appropriate conditions
* Integrate Phi3 LCE support
https://github.com/linkedin/Liger-Kernel/pull/103/
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-08-31 21:59:48 -04:00
Wing Lian
ce33e1ed83
pin liger-kernel to latest 0.2.1 (#1882) [skip ci]
2024-08-30 17:51:18 -04:00
Byron Hsu
e3a38450de
Add liger kernel to features (#1881) [skip ci]
2024-08-29 08:19:18 -04:00
Chirag Jain
b1bb2accb9
Merge branch 'main' into cj_tokenizer_default_prompt_template
2024-08-28 13:34:20 +05:30
Aman Gupta Karmani
7037e3c836
deepseekv2 liger support (#1878)
...
* deepseekv2 liger support
* add comment
* add missing impl
2024-08-27 23:52:40 -04:00
Aman Gupta Karmani
c1a61ae23c
fix liger plugin load issues (#1876)
2024-08-27 23:08:26 -04:00
Aman Gupta Karmani
159b8b9a74
monkey-patch transformers to simplify monkey-patching modeling code (#1877)
...
* monkey-patch transformers so that monkey-patched modeling code doesn't get overwritten
* unnecessary now
* add comment
2024-08-27 17:22:26 -07:00
Wing Lian
1e43660701
Sample pack trust remote code v2 (#1873)
...
* fix the multipack patch for remote code models
* add deepseek v2 lite example w fsdp
2024-08-27 13:39:24 -04:00