* point to upstream autoawq for transformers fix
* use autoawq 0.2.7 release
* test wheel for awq
* try different format for wheel def
* autoawq re-release
* Add intel_extension_for_pytorch dep
* ipex gte version
* forcefully remove intel-extension-for-pytorch
* add -y option to pip uninstall for ipex
* use post2 release for autoawq and remove uninstall of ipex
* upgrade liger to 0.3.1
* update docs and example
* skip duplicate code check
* Update src/axolotl/integrations/liger/args.py
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update README.md
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* add logging
* chore: lint
* add test case
* upgrade liger and transformers
* also upgrade accelerate
* use kwargs to support patch release
* make sure prepared path is empty for test
* use transfromers 4.46.1 since 4.46.2 breaks fsdp
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* feat: support new arg num_items_in_batch
* use kwargs to manage extra unknown kwargs for now
* upgrade against upstream transformers main
* make sure trl is on latest too
* fix for upgraded trl
* fix: handle trl and transformer signature change
* feat: update trl to handle transformer signature
* RewardDataCollatorWithPadding no longer has max_length
* handle updated signature for tokenizer vs processor class
* invert logic for tokenizer vs processor class
* processing_class, not processor class
* also handle processing class in dpo
* handle model name w model card creation
* upgrade transformers and add a loss check test
* fix install of tbparse requirements
* make sure to add tbparse to req
* feat: revert kwarg to positional kwarg to be explicit
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
use a constraint file
use min version of xformers
don't install autoawq with pytorch 2.5.0
debugging for errors
upgrade pip first
fix action yml
add back try/except
retry w/o constraint
use --no-build-isolation
show torch version
install setuptools and wheel
add back try/except
* add ds zero3 to multigpu biweekly tests
* fix for upstream api change
* use updated accelerate and fix deepspeed tests
* stringify the Path, and run multigpu tests if the multigpu tests change for a PR
* use correct json rather than yaml
* revert accelerate for deepspeed
* wip, lm_eval harness post train
* include latex parser
* add dtype and doc
* add validation when doing bench evals
* automatically add test dataset when doing benches
* update sklearn versrion, torch compile env vars, don't worry about failure on preprocess load model
* There is already a condition check within the function. This outer one is not necessary
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* fix 405b with lower cpu ram requirements
* make sure to use doouble quant and only skip output embeddings
* set model attributes
* more fixes for sharded fsdp loading
* update the base model in example to use pre-quantized nf4-bf16 weights
* upstream fixes for qlora+fsdp
* various batch of fixes
* more tweaks
* fix autoawq requirement for torch flexibility
* simplify conditionals
* multi-node fixes wip
* bump transformers and include 405b qlora+fsdp yaml
* swaps to use newer sample packing for mistral
* fix multipack patch test
* patch the common fa utils
* update for refactor of flash attn unpad
* remove un-needed drop attn mask for mistral
* bump transformers to main to pick up latest mistral fix for 12b and refactor of fa2
* update test
* bump transformers and set roundup_power2_divisions for more VRAM improvements
* support for low bit optimizers from torch ao
* fix check for alternate optimizers and use nous models on hf for llama3
* add missing check for ao_adamw_fp8
* fix check when using custom optimizers w adamw
* bump flash attention 2.5.8 -> 2.6.1
* use triton implementation of cross entropy from flash attn
* add smoke test for flash attn cross entropy patch
* fix args to xentropy.apply
* handle tuple from triton loss fn
* ensure the patch tests run independently
* use the wrapper already built into flash attn for cross entropy
* mark pytest as forked for patches
* use pytest xdist instead of forked, since cuda doesn't like forking
* limit to 1 process and use dist loadfile for pytest
* change up pytest for fixture to reload transformers w monkeypathc
* Update requirements.txt
Preserve compatibility with torch 2.3.1. [Reference](https://github.com/facebookresearch/xformers/issues/1052)
* fix setup.py to extract the current xformers dep from requirements for replacement
* xformers 0.0.27 wheels not built for torch 2.3.0
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* adding llama3 fastchat conversation monkeypatch
* Updated conversation turns to work with PR3259 of FastChat
* fixed bos token
* bump fastchat version
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* Add support for Gemma chat template
* Update fschat version to include its newest support for Gemma chat style
* pin fastchat to current HEAD
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* WIP use trl ORPOTrainer
* fixes to make orpo work with trl
* fix the chat template laoding
* make sure to handle the special tokens and add_generation for assistant turn too