Commit Graph

4 Commits

Author SHA1 Message Date
Wing Lian
2d8def68dc simplify by removing duplicate base_model_config (#772) 2023-10-23 01:42:38 -04:00
Wing Lian
e50a64e85e prepared dataset caching, other misc fixes (#665)
* prepared dataset caching, other misc fixes

* also don't load from disk cache unless explicit
2023-10-02 21:07:24 -04:00
Wing Lian
d887ad86c3 eval_table isn't quite stable enough to be in default llama configs (#637) 2023-09-26 10:13:20 -04:00
Glavin Wiechert
5b67ea98a6 Add training callback to send predictions to WandB table (#521)
* WIP Add training callback to send predictions to WandB table

* WIP improve wandb table reporting callback

* WIP improve wandb table reporting callback (cont)

* Add VSCode launching for debugging

* Add tiny llama example

* WIP attempt to improve post-eval prediction generation for table

* WIP attempt to improve post-eval prediction generation for table - part 2

* WIP batch generation

* WIP attempt to handle sample_packing using position_ids for wandb prediction table

* WIP add code for debugging

* Fix sample_packing support for wandb prediction table

* Clean up code for PR review

* Add eval_table_size, eval_table_max_new_tokens configs & clean up code

* Clean up PR, delete VSCode config, add tiny-llama example

* Add eval_table_size, eval_table_max_new_tokens documentation. Fix linting/formatting
2023-09-13 09:51:08 -04:00