* add grpo scale_rewards config for trl#3135
* options to connect to vllm server directly w grpo trl#3094
* temperature support trl#3029
* sampling/generation kwargs for grpo trl#2989
* make vllm_enable_prefix_caching a config param trl#2900
* grpo multi-step optimizations trl#2899
* remove overrides for grpo trainer
* bump trl to 0.16.0
* add cli to start vllm-serve via trl
* call the python module directly
* update to use vllm with 2.6.0 too now and call trl vllm serve from module
* vllm 0.8.1
* use python3
* use sys.executable
* remove context and wait for start
* fixes to make it actually work
* fixes so the grpo tests pass with new vllm paradigm
* explicit host/port and check in start vllm
* make sure that vllm doesn't hang by setting quiet so outputs go to dev null
* also bump bnb to latest release
* add option for wait from cli and nccl debugging for ci
* grpo + vllm test on separate devices for now
* make sure grpo + vllm tests run single worker since pynccl comms would conflict
* fix cli
* remove wait and add caching for argilla dataset
* refactoring configs
* chore: lint
* add vllm config
* fixup vllm grpo args
* fix one more incorrect schema/config path
* fix another vllm reference and increase timeout
* make the tests run a bit faster
* change mbsz back so it is correct for grpo
* another change mbsz back so it is correct for grpo
* fixing cli args
* nits
* adding docs
* docs
* include tensor parallel size for vllm in pydantic schema
* moving start_vllm, more docs
* limit output len for grpo vllm
* vllm enable_prefix_caching isn't a bool cli arg
* fix env ordering in tests and also use pid check when looking for vllm

---------

Co-authored-by: Salman Mohammadi <salman.mohammadi@outlook.com>
67 lines
992 B
Plaintext
--extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

# START section of dependencies that don't install on Darwin/MacOS
bitsandbytes==0.45.4
triton>=3.0.0
mamba-ssm==1.2.0.post1
xformers>=0.0.23.post1
autoawq==0.2.7.post3
liger-kernel==0.5.5
# END section

packaging==23.2

peft==0.15.0
transformers==4.50.0
tokenizers>=0.21.1
accelerate==1.5.2
datasets==3.5.0
deepspeed==0.16.4
trl==0.16.0

optimum==1.16.2
hf_transfer
sentencepiece
gradio==3.50.2

modal==0.70.5
pydantic==2.10.6
addict
fire
PyYAML>=6.0
requests
wandb
einops
colorama
numba
numpy>=1.24.4,<=2.0.1

# qlora things
evaluate==0.4.1
scipy
scikit-learn==1.4.2
nvidia-ml-py==12.560.30
art
tensorboard
python-dotenv==1.0.1

# remote filesystems
s3fs>=2024.5.0
gcsfs>=2024.5.0
# adlfs

zstandard==0.22.0
fastcore

# lm eval harness
lm_eval==0.4.7
langdetect==1.0.9
immutabledict==4.2.0
antlr4-python3-runtime==4.13.2

torchao==0.7.0
schedulefree==1.3.0

axolotl-contribs-lgpl==0.0.6
axolotl-contribs-mit==0.0.3