axolotl/examples at 132eb740f036eff0fa8b239ddaf0b7a359ed1732 - axolotl - Gitea

tocmo0nlord/axolotl

Wing Lian 132eb740f0 DBRX Model Support (#1462)

* wip for dbrx finetuning

* add fastcore for parallel loading of sharded weights

* fix dtype for load, use PartialState instead of accelerator to init process group, remove redundant wandb callback

* update to use v2 of the converted model

* more fixes for dbrx loras

* make sure to enable fsdp activation checkpointing

* fix support for 8bit loras too for dbrx

* apply z3 leaf moe fix for DBRX with deepspeed

* don't raise value error since child module searches could fail and be ok

* revert a previous change to fix fsdp

* update mistral/mistral qlora+fsdp yamls

* fix qlora+fsdp quant storage type

* more edge cases for qlora-fsdp

* fixes for fsdp+qlora w optimizer in 8bit

* add bigstral z3 config and make sure to use full_state_dict for fsdp

2024-04-12 09:02:36 -04:00

Update qlora.yml - remove max_packed_sequence_len (#1210) [skip ci]

2024-01-26 07:43:05 -05:00

fix(examples): remove is_*_derived as it's parsed automatically (#1297)

2024-02-22 00:52:46 +09:00

colab-notebooks

Add instructions for playing with qlora model to colab example (#1290)

2024-02-22 02:46:27 +09:00

DBRX Model Support (#1462)

2024-04-12 09:02:36 -04:00

fix(examples): remove is_*_derived as it's parsed automatically (#1297)

2024-02-22 00:52:46 +09:00

turn sample_packing on for training (#1438) [skip ci]

2024-03-26 15:19:04 -04:00

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

2024-01-22 18:44:01 -05:00

fix some of the edge cases for Jamba (#1452)

2024-03-29 02:38:02 -04:00

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

2024-01-22 18:44:01 -05:00

DBRX Model Support (#1462)

2024-04-12 09:02:36 -04:00

Add seq2seq eval benchmark callback (#1274)

2024-02-13 08:24:30 -08:00

DBRX Model Support (#1462)

2024-04-12 09:02:36 -04:00

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

2024-01-22 18:44:01 -05:00

Add shifted sparse attention (#973) [skip-ci]

2024-01-18 10:16:07 -05:00

Mixtral fixes 20240124 (#1192) [skip ci]

2024-01-24 14:59:57 -05:00

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

2024-01-22 18:44:01 -05:00

Feat(wandb): Refactor to be more flexible (#767)

2023-12-04 22:17:25 +09:00

Fix the wrong adapter in qwen2-moe-qlora example (#1501) [skip ci]

2024-04-09 10:57:24 -04:00

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

2024-01-22 18:44:01 -05:00

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

2024-01-22 18:44:01 -05:00

Add StableLM 2 Example Scripts (#1327) [skip ci]

2024-02-26 18:44:25 -05:00

add starcoder2 (#1349)

2024-03-05 19:49:17 -05:00

Update tinyllama lora.yml to fix eval packing issue (#1362)

2024-03-05 14:36:29 -05:00

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

2024-01-22 18:44:01 -05:00

fix(examples): remove is_*_derived as it's parsed automatically (#1297)

2024-02-22 00:52:46 +09:00