* add lisa support
* fix default and fix attribute traversal for layers
* improve lisa callback logging
* fix LISA by ensuring params are not frozen during __init__
* example config for lisa
---------
Co-authored-by: Aman Karmani <aman@tmm1.net>
* support galore once upstreamed into transformers
* update module name for llama in readme and fix typing for all linear
* bump trl for deprecation fixes from newer transformers
* include galore as an extra and install in docker image
* fix optim_args type
* fix optim_args
* update dependencies for galore
* add galore to cicd dockerfile
* Add a config not to shuffle merged dataset
* Update README.md
* Update src/axolotl/utils/config/models/input/v0_4_1/__init__.py
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* invert the condition name
* update README
* info -> debug
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* orpo trainer
* rl handling for orpo
* support for remove_unused_columns
* orpo fixes
* fix loader for orpo
* chore: lint
* fix default for remove_unused_columns
* roll ORPO into the main AxolotlTrainer so it can be compatible with some of the other techniques like relora
* better handling of system message for orpo
* revert system prompt changes for chat templtes
* no need for else condition
* split dataset parsing into it's own component
* Add Glaive conversation format support
* fix black formatting errors
* Fix black and pylint formatting errors
* only set role_key_tool if provided in the dataset constructor
* Update src/axolotl/prompt_strategies/sharegpt.py
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* sharegpt test
* tokenizer test
* fix formatting
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* wip qlora + fsdp fixes
* more fixes
* make sure to load the lora 🤦
* only setup quantized meta on non-zero rank:
* only run setup_quantized_peft_meta_for_training for qlora+fsdp
* more fixes for qlora+fsdp
* chore: lint
* add example yml
* support mistral too
* fix for model_type and add mixtral support too
* set cpu_offload: false to reduce vram, constrain new accleerator logic to qlora + fsdp
* refactor for duplicate code
* plain input/output prompt strategy w/o chat templates
* disable duplicate code check
* make sure to add an eos/eot token to the end of the output so it will stop
* multi turn segement support and test
* add missing evals_per_epoch setting
* more pydantic fixes
* more fixes
* move test from normalization to validation
* increase eval size for sample packing tests
* support user-defined prompt processing strategies for dpo
* interpret dict dataset types as user-defined
* fix lint errors
* setup pydantic config for validation of User defined DPO
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>