Update README.md to reflect current gradient checkpointing support
Previously the readme stated gradient checkpointing was incompatible with 4-bit lora in the current implementation however this is no longer the case. I have replaced the warning with a link to the hugging face documentation on gradient checkpointing.
This commit is contained in:
@@ -387,7 +387,7 @@ train_on_inputs: false
|
||||
# don't use this, leads to wonky training (according to someone on the internet)
|
||||
group_by_length: false
|
||||
|
||||
# does not work with current implementation of 4-bit LoRA
|
||||
# Whether to use gradient checkpointing https://huggingface.co/docs/transformers/v4.18.0/en/performance#gradient-checkpointing
|
||||
gradient_checkpointing: false
|
||||
|
||||
# stop training after this many evaluation losses have increased in a row
|
||||
|
||||
Reference in New Issue
Block a user