* ipo-dpo trainer
* fix missing abstract method
* chatml template, grad checkpointing kwargs support
* fix steps calc for RL and add dataloader kwargs
* wip to fix dpo and start ppo
* more fixes
* refactor to generalize map fn
* fix dataset loop and handle argilla pref dataset
* set training args
* load reference model on separate gpu if more than one device
* no auto upload to hub for dpo, don't add lora adapters to ref model for dpo
* fixes for rl training
* support for ipo from yaml
* set dpo training args from the config, add tests
* chore: lint
* set sequence_len for model in test
* add RLHF docs
42 lines
694 B
Plaintext
--extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
auto-gptq==0.5.1
packaging
peft==0.6.0
transformers @ git+https://github.com/huggingface/transformers.git@3cefac1d974db5e2825a0cb2b842883a628be7a0
tokenizers==0.15.0
bitsandbytes>=0.41.1
accelerate==0.24.1
deepspeed
addict
fire
PyYAML>=6.0
datasets>=2.15.0
flash-attn==2.3.3
sentencepiece
wandb
einops
xformers==0.0.22
optimum==1.13.2
hf_transfer
colorama
numba
numpy>=1.24.4
# qlora things
bert-score==0.3.13
evaluate==0.4.0
rouge-score==0.1.2
scipy
scikit-learn==1.2.2
pynvml
art
fschat==0.2.34
gradio==3.50.2
tensorboard

# remote filesystems
s3fs
gcsfs
# adlfs

trl @ git+https://github.com/huggingface/trl.git@main
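The pins above mix three shapes: exact or minimum version specifiers (`peft==0.6.0`, `numpy>=1.24.4`), bare package names (`deepspeed`), and direct Git references (`trl @ git+…`). A minimal, illustrative sketch of splitting such lines into name and constraint using only the standard library (the `parse_requirement` helper is hypothetical; pip's real PEP 508 parser is far more complete):

```python
import re

def parse_requirement(line: str):
    """Split a requirements line into (name, constraint).

    Handles the three shapes seen in this file:
      - exact/minimum pins, e.g. "peft==0.6.0" or "numpy>=1.24.4"
      - bare names, e.g. "deepspeed" (constraint is empty)
      - direct URL references, e.g. "trl @ git+https://..."
    Illustrative only; not a substitute for pip's own resolver.
    """
    line = line.strip()
    # Direct reference: "name @ url" -- return the URL as the constraint.
    if "@" in line and "git+" in line:
        name, url = line.split("@", 1)
        return name.strip(), url.strip()
    # Otherwise split the package name from an optional version specifier.
    m = re.match(r"^([A-Za-z0-9_.-]+)\s*(==|>=|<=|~=|!=)?\s*(.*)$", line)
    name, op, version = m.groups()
    return name, (op or "") + version

print(parse_requirement("peft==0.6.0"))   # -> ('peft', '==0.6.0')
print(parse_requirement("deepspeed"))     # -> ('deepspeed', '')
```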