* upgrade liger to 0.3.1
* update docs and example
* skip duplicate code check
* Update src/axolotl/integrations/liger/args.py
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update README.md
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* add logging
* chore: lint
* add test case
* upgrade liger and transformers
* also upgrade accelerate
* use kwargs to support patch release
* make sure prepared path is empty for test
* use transfromers 4.46.1 since 4.46.2 breaks fsdp
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* remove skipped test
* use mean_resizing_embeddings with qlora and added tokens
* use </s> as pad_token to prevent resize of embeddings
* make sure local hub test saves to a tmp dir
* use Path so concatenation works
* make sure to use tmp_ds_path for data files
* Allow using tokenizer's default chat template with fallbacks
Summary of changes:
1. Adds `tokenizer_default` as option for `chat_template` in
`chat_template` prompt strategy that allows using the chat template
from tokenizer's config.json
2. Allows falling back to chat templates available in axolotl if
tokenizer does not have a chat template
3. Adds a mistral chat template which supports system message - taken
from https://github.com/chujiezheng/chat_templates/blob/main/chat_templates/mistral-instruct.jinja
---
Why?
Many popular models are not trained with chatml format. As a result for
the model to correctly learn chatml we have to turn on train_on_inputs
which requires more compute and time. If we can use the model's already
learned chat template we can just learn the output tokens
---
Todo:
- Write tests
* Add tests
* Fix lint and bug post merge from main
* Add option `chat_template_jinja` to provide a jinja template
* remove custom mistral template
* Address review comments and add docs
* Update docs/dataset-formats/conversation.qmd
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* fix: set default to tokenizer template
* Merge branch 'main' into cj_tokenizer_default_prompt_template
* chore: remove redundant function
* fix: re-arrange enum declaration position
* fix: refactor artifact left from main merge
* feat(doc): updated config with chat template options and clarified examples
* chore: clarify doc
* chore: added example for non-default template
* chore: refactor
* fix: test
* fix: config being dropped and unittest to catch that
* chore: lint
* chore: skip duplicate
* fix: rename var after merge
* feat: add test for levy's dpo case
* fix: remove default setting on edge case where chat template overriden in dataset section
* feat: handle sharegpt deprecation better in docs
* feat: add example using fallback
* feat: handles chat_template requiring specific user/assistant order
* fix: update test based on new defaults
* fix: imported name incorrectly updated on merge
* chore: lint
* fix: update dummy message to prevent potential overlap with real content
* fix(doc): formatting
* fix: update bradleyterry to use new chat_template
---------
Co-authored-by: Chirag Jain <jain.chirag925@gmail.com>
* feat: support new arg num_items_in_batch
* use kwargs to manage extra unknown kwargs for now
* upgrade against upstream transformers main
* make sure trl is on latest too
* fix for upgraded trl
* fix: handle trl and transformer signature change
* feat: update trl to handle transformer signature
* RewardDataCollatorWithPadding no longer has max_length
* handle updated signature for tokenizer vs processor class
* invert logic for tokenizer vs processor class
* processing_class, not processor class
* also handle processing class in dpo
* handle model name w model card creation
* upgrade transformers and add a loss check test
* fix install of tbparse requirements
* make sure to add tbparse to req
* feat: revert kwarg to positional kwarg to be explicit
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* Ensure hf_mlflow_log_artifact config var is set in env
* Add transformer MLflowCallback to callbacks list when mlflow enabled
* Test hf_mlflow_log_artifacts is set correctly
* Test mlflow not being used by default
use a constraint file
use min version of xformers
don't install autoawq with pytorch 2.5.0
debugging for errors
upgrade pip first
fix action yml
add back try/except
retry w/o constraint
use --no-build-isolation
show torch version
install setuptools and wheel
add back try/except
* add ds zero3 to multigpu biweekly tests
* fix for upstream api change
* use updated accelerate and fix deepspeed tests
* stringify the Path, and run multigpu tests if the multigpu tests change for a PR
* use correct json rather than yaml
* revert accelerate for deepspeed
* wip add new proposed message structure
* tokenization
* wip
* wip transform builder
* wip make the chat dataset loadable
* wip chatml + llama 3 new chat objects
* chore: lint
* chore: lint
* fix tokenization
* remove dacite dependency since we're using pydantic now
* fix handling when already correctly split in messages
* make sure to remove chat features from tokenized ds
* move chat to be a input transform for messages
* make sure llama3 has the bos token
* remove non-working special token code
* fix messages strat loader
* Add support for `revision` dataset parameter
* only use revision on hf hub backed datasets
* use revision tied to head
* set download to use revision
* feat: add config to model validator class
* feat: add revision config to RL and tests for it
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* wip, lm_eval harness post train
* include latex parser
* add dtype and doc
* add validation when doing bench evals
* automatically add test dataset when doing benches
* Add first version of a Comet integration
* Remove debug prints
* Add test for Comet Configuration transformation to env variables
* Fix last lint warning
* Update Readme for Comet logging documentation
* Update Comet integration to be optional, update code and tests
* Add documentation for Comet configuration
* Add missing check