bump transformers and set roundup_power2_divisions for more VRAM improvements, low bit ao optimizers (#1769)
* bump transformers and set roundup_power2_divisions for more VRAM improvements
* support for low bit optimizers from torch ao
* fix check for alternate optimizers and use nous models on hf for llama3
* add missing check for ao_adamw_fp8
* fix check when using custom optimizers w adamw
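roundup_power2_divisions is a PyTorch CUDA caching-allocator option passed through the PYTORCH_CUDA_ALLOC_CONF environment variable: rounding allocation requests up to finer power-of-two buckets reduces fragmentation, which can lower peak VRAM. A minimal sketch of applying it; the division count of 16 is an illustrative assumption, not necessarily the value this commit sets:

    import os

    # Must be set before the first CUDA allocation so the caching
    # allocator picks it up. The value 16 is an assumed example.
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "roundup_power2_divisions:16"

    import torch

    # Allocation sizes are now rounded up to finer power-of-two buckets,
    # so nearby request sizes reuse the same cached blocks.
    x = torch.empty(1 << 20, device="cuda")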
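torchao ships low-bit AdamW variants under torchao.prototype.low_bit_optim (AdamW8bit, AdamW4bit, AdamWFp8); the ao_adamw_fp8 name the commit checks for suggests a config value along the lines of optimizer: ao_adamw_fp8 mapping onto one of these. A hedged sketch of the drop-in usage, with a placeholder model and learning rate:

    import torch
    from torchao.prototype.low_bit_optim import AdamWFp8

    model = torch.nn.Linear(1024, 1024).cuda()

    # Optimizer state is stored in fp8 instead of fp32, cutting the
    # per-parameter VRAM cost of AdamW's two moment buffers.
    optim = AdamWFp8(model.parameters(), lr=2e-5)

    loss = model(torch.randn(8, 1024, device="cuda")).sum()
    loss.backward()
    optim.step()
    optim.zero_grad()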
@@ -1,4 +1,4 @@
-base_model: meta-llama/Meta-Llama-3-8B
+base_model: NousResearch/Meta-Llama-3-8B
 model_type: LlamaForCausalLM
 tokenizer_type: AutoTokenizer
@@ -1,4 +1,4 @@
-base_model: meta-llama/Meta-Llama-3-8B-Instruct
+base_model: NousResearch/Meta-Llama-3-8B-Instruct
 model_type: LlamaForCausalLM
 tokenizer_type: AutoTokenizer
@@ -1,4 +1,4 @@
-base_model: meta-llama/Meta-Llama-3-8B
+base_model: NousResearch/Meta-Llama-3-8B
 model_type: LlamaForCausalLM
 tokenizer_type: AutoTokenizer
@@ -1,4 +1,4 @@
-base_model: meta-llama/Meta-Llama-3-8B
+base_model: NousResearch/Meta-Llama-3-8B
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
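For the "fix check for alternate optimizers" and "add missing check for ao_adamw_fp8" bullets, a hypothetical sketch of the kind of gating involved; the helper name and the accepted option strings are assumptions, not the commit's actual code:

    # Hypothetical names: the real option strings come from the
    # project's config schema, not from this sketch.
    AO_LOW_BIT_OPTIMIZERS = {"ao_adamw_4bit", "ao_adamw_8bit", "ao_adamw_fp8"}

    def uses_torchao_optimizer(optimizer: str) -> bool:
        # True when the optimizer should be constructed from torchao's
        # low_bit_optim module instead of the HF Trainer's built-ins.
        return optimizer in AO_LOW_BIT_OPTIMIZERS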