* hf offline decorator for tests to work around rate limits
* fail quicker so we can see logs
* try new cache name
* limit files downloaded
* phi mini predownload
* offline decorator for phi tokenizer
* handle meta llama 8b offline too
* make sure to return fixtures if they are wrapped too
* more fixes
* more things offline
* more offline things
* fix the env var
* fix the model name
* handle gemma also
* force reload of modules to recheck offline status
* prefetch mistral too
* use reset_sessions so hub picks up offline mode
* more fixes
* rename so it doesn't seem like a context manager
* fix backoff
* switch out tinyshakespeare dataset since it runs a py script to fetch data and doesn't work offline
* include additional dataset
* more fixes
* more fixes
* replace tiny shakespeare dataset
* skip some tests for now
* use more robust check using snapshot download to determine if a dataset name is on the hub
* typo for skip reason
* use local_files_only
* more fixtures
* remove local only
* use tiny shakespeare as pretrain dataset; streaming can't be offline even if precached
* make sure fixtures aren't offline
* improve the offline reset
* try bumping version of datasets
* reorder reloading and setting
* prime a new cache
* run the tests now with fresh cache
* try with a static cache
* now run all the ci again with hopefully a correct cache
* skip wonky tests for now
* skip wonky tests for now
* handle offline mode for model card creation
* override special tokens mock code
* fix(doc): remove duplicate config
* feat: replace added_tokens in tokenizer and add test
* make sure to run tokenizer modification on rank 0 only
* use is_local_main_process instead
* feat: rename config
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
Co-authored-by: Wing Lian <wing@axolotl.ai>
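
For reviewers, the offline test decorator described in the entries above can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the code in this change: the name `with_hf_offline` is hypothetical, and the sketch only toggles the `HF_HUB_OFFLINE` environment variable. The entries above indicate the real helper also reloads modules and calls `reset_sessions` so the hub client re-reads the offline flag; those steps are omitted here.

```python
import functools
import os


def with_hf_offline(func):
    """Hypothetical helper: run `func` with HF_HUB_OFFLINE=1, then restore the old value."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        previous = os.environ.get("HF_HUB_OFFLINE")
        os.environ["HF_HUB_OFFLINE"] = "1"
        try:
            # Return the result so a wrapped fixture still hands back its value.
            return func(*args, **kwargs)
        finally:
            # Restore the caller's environment even if the test raised.
            if previous is None:
                os.environ.pop("HF_HUB_OFFLINE", None)
            else:
                os.environ["HF_HUB_OFFLINE"] = previous

    return wrapper


@with_hf_offline
def read_offline_flag():
    # Stand-in for a test body; it just reports what the flag looks like inside.
    return os.environ.get("HF_HUB_OFFLINE")
```

Libraries that read the flag once at import time will not notice the change from the environment variable alone, which is presumably why the log mentions forcing a module reload and resetting sessions.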
* Support for additional_special_tokens
* Support for additional_special_tokens. Adjust whitespace.
* Support for additional_special_tokens. Use correct quotes.
* Support for additional_special_tokens. Safe pop.
* Support for additional_special_tokens. nt.
* Support for additional_special_tokens. cfg.special_tokens may be None.
* add token if not in vocabulary when adding additional_special_tokens
* fix logic for copy/pasta
* bugfix for popping from config and tokenizer reload
* no need to add tokens manually now with previous bugfix
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* Feat: Auto add to modules_to_save when adding tokens
* fix: swap to error instead of warning
* feat: add check when special_tokens differ and add test
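
One possible reading of the `modules_to_save` entries above, sketched under stated assumptions: when a LoRA run adds tokens, the embedding and output layers need to be saved alongside the adapter, and a config that omits them is rejected with an error instead of a warning. The config keys (`adapter`, `new_tokens`, `modules_to_save`) and both function names are hypothetical placeholders, not the project's actual schema.

```python
# Layers that typically must be trained and saved when the vocabulary grows.
EMBEDDING_MODULES = ("embed_tokens", "lm_head")


def missing_modules_to_save(cfg: dict) -> list:
    """Return embedding modules a token-adding LoRA config has not listed."""
    uses_lora = cfg.get("adapter") in ("lora", "qlora")
    adds_tokens = bool(cfg.get("new_tokens"))  # hypothetical key
    if not (uses_lora and adds_tokens):
        return []
    listed = cfg.get("modules_to_save") or []
    return [name for name in EMBEDDING_MODULES if name not in listed]


def validate(cfg: dict) -> None:
    """Raise instead of warning when required modules are missing."""
    missing = missing_modules_to_save(cfg)
    if missing:
        raise ValueError(
            f"modules_to_save is missing {missing} while new tokens are added"
        )
```

Returning the missing names separately from raising keeps both behaviours available: a caller could append them automatically or fail fast, depending on which policy is wanted.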