Aman Gupta Karmani
11ddccb80f
Merge pull request #356 from tmm1/load_model-args
...
simplify `load_model` signature
2023-08-09 18:24:34 -07:00
Aman Karmani
718102271f
simplify load_model signature
2023-08-09 22:36:02 +00:00
Aman Karmani
e303d64728
log GPU memory usage
2023-08-09 18:26:28 +00:00
Wing Lian
176b888a63
ensure enable_input_require_grads is called on model before getting the peft model (#345)
2023-08-06 18:13:10 -04:00
Jan Philipp Harries
3392270544
experimental llama 2 chat support (#296)
...
* experimental llama 2 chat support
* few small fixes
* llama2_chat
* small fix to follow original implementation
* small fixes and added fixtures/tests
* fix - mixed up inference and finetuning conversations
* args - small fix
* small fix
* small adjustment and warning
* fix with pre-commit
---------
Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
2023-08-06 17:40:52 -04:00
ssmi153
10405b9995
Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) (#339)
...
* Fix XFormers attention for Llama-2 70B (GQA)
Updated XFormers MonkeyPatch to handle GQA as used in Llama-2 70B. All the updated code is taken directly from the Transformers library: 07360b6c9c (diff-06392bad3b9e97be9ade60d4ac46f73b6809388f4d507c2ba1384ab872711c51) from their llama_modeling.py file.
* Catch configs without pretraining_tp
* Whitespace bug fix
Command had accidentally been moved out of if-else block.
* pre-commit formatting fixes
Thanks to @winglian
2023-08-06 11:09:04 -04:00
Jan Philipp Harries
c93655c0a3
Added Orca Mini prompt strategy (#263)
...
* added Orca Mini prompt strategy
* maybe this fixed precommit errors?
* pre-commits passing
---------
Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
2023-08-06 03:16:41 +09:00
Wing Lian
fe285430bc
optimize the iteration when tokenizing large datasets (#332)
2023-08-04 12:12:05 -04:00
Aman Karmani
2eda9e02a9
fix typo
2023-08-03 21:04:12 +00:00
Aman Karmani
78b9efb7f4
scope flash-attn+qlora fix correctly, scope to llama, add comment
2023-08-03 19:19:39 +00:00
Aman Karmani
312a9fad07
move flash-attn monkey patch alongside the others
2023-08-03 17:20:49 +00:00
Aman Karmani
248bf90f89
ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype
2023-08-02 20:15:03 +00:00
Wing Lian
77085ea24e
qlora w flash attention fixes (#333)
2023-08-01 23:26:16 -04:00
Wing Lian
db2a3586f3
add peft install back since it doesn't get installed by setup.py (#331)
2023-07-31 16:31:53 -04:00
Wing Lian
3d4984b9a5
update prompts for open orca to match the paper (#317)
...
fix the test for the updated system tokenizer
2023-07-22 13:49:11 -04:00
Wing Lian
40a53ff181
Merge pull request #307 from OpenAccess-AI-Collective/xgen-user-sharegpt-tokens
...
better handling since xgen tokenizer breaks with convert_tokens_to_ids
2023-07-22 04:10:38 -04:00
Wing Lian
3ffb018a4c
Merge pull request #313 from OpenAccess-AI-Collective/tokenizer-llama2-embeddings
...
don't resize embeddings to multiples of 32x by default
2023-07-22 04:09:59 -04:00
Wing Lian
1066751358
don't resize embeddings to multiples of 32x by default
2023-07-22 01:52:38 -04:00
Wing Lian
2a428e8014
better handling since xgen tokenizer breaks with convert_tokens_to_ids
2023-07-21 09:24:11 -04:00
Wing Lian
9b790d359b
flash attention 2
2023-07-21 08:17:46 -04:00
Wing Lian
a032c9f452
fix sdp attention to use the flash/mem-efficient context manager
2023-07-20 01:05:48 -04:00
NanoCode012
45ac7c4f88
feat: use multi-core
2023-07-19 10:16:54 +09:00
Wing Lian
ebaec3c406
fix axolotl training args dataclass annotation
2023-07-17 04:57:02 -04:00
Wing Lian
d75adb9835
misc fixes
2023-07-17 03:00:27 -04:00
Wing Lian
6f16c4569d
Merge pull request #276 from theobjectivedad/logging_enhancement
...
Logging update: added PID and formatting
2023-07-16 17:04:52 -04:00
theobjectivedad
b1f4f7a34d
Fixed pre-commit problems, fixed small bug in logging_config to handle LOG_LEVEL env var
2023-07-15 12:29:35 +00:00
The Objective Dad
83237b8445
Merge branch 'OpenAccess-AI-Collective:main' into logging_enhancement
2023-07-15 06:16:04 -05:00
Charles Goddard
88089e8b32
Add ability to pass 'name' argument to load_dataset
2023-07-14 16:46:39 -07:00
NanoCode012
168a7a09cc
Merge pull request #274 from OpenAccess-AI-Collective/NanoCode012-patch-2
...
Feat: Set push to hub as private by default
2023-07-14 23:15:47 +09:00
theobjectivedad
9234b75cb4
Update log message format, IMO this is easier to read.
2023-07-14 07:36:21 -05:00
theobjectivedad
553a86b52c
Adding logging enhancement
2023-07-14 07:26:19 -05:00
NanoCode012
5491278a79
Feat: Add save_safetensors
2023-07-14 13:21:47 +09:00
NanoCode012
1514739f0f
Set push to hub as private by default
2023-07-14 13:17:49 +09:00
Wing Lian
69a235061b
support for loading a model by git revision
2023-07-13 22:58:25 -04:00
Wing Lian
c4cf567b55
Merge branch 'main' into quadratic-warmup
2023-07-10 12:42:12 -04:00
Wing Lian
c49729d2bc
better configuration for quadratic warmup
2023-07-10 11:52:59 -04:00
Wing Lian
19cf0bda99
params are adam_*, not adamw_*
2023-07-08 12:13:39 -04:00
Wing Lian
d69da99c2c
skip explicit model type too if using trust_remote_code
2023-07-07 21:33:11 -04:00
Wing Lian
66afb76a15
don't use llama if trust_remote_code is set since that needs to use AutoModel path
2023-07-07 21:31:02 -04:00
Wing Lian
b9b7d4ce92
Merge pull request #221 from utensil/local_dataset
...
[WIP] Support loading data files from a local directory
2023-07-03 09:10:13 -04:00
NanoCode012
e79c8e617e
Fix future deprecation push_to_hub_model_id
2023-07-03 12:44:29 +09:00
Wing Lian
1e5014acec
Merge pull request #255 from OpenAccess-AI-Collective/open-orca-prompts
...
open orca support
2023-07-01 01:11:23 -04:00
Wing Lian
4066c78631
Merge pull request #246 from OpenAccess-AI-Collective/sys-prompts-instruct
...
add option for instruct w sys prompts
2023-07-01 00:27:29 -04:00
Wing Lian
78a1e1fa12
open orca support
2023-07-01 00:19:41 -04:00
NanoCode012
77bdb7d144
Fix typing list
2023-06-29 14:29:55 +09:00
Wing Lian
924bbfddec
add option for instruct w sys prompts
2023-06-28 22:27:17 -04:00
Wing Lian
f150c027e3
Merge pull request #224 from OpenAccess-AI-Collective/system-prompt-data
...
System prompt data
2023-06-27 17:57:43 -04:00
Wing Lian
612aabd8c4
push intermediate model checkpoints to hub
2023-06-27 15:40:25 -04:00
Wing Lian
05ab9092e3
skip the system prompt
2023-06-25 22:40:50 -04:00
Wing Lian
7b57ed7618
pylint for duplicated code for system prompts
2023-06-25 22:28:07 -04:00