Commit Graph

1857 Commits

Author SHA1 Message Date
Wing Lian
74d98ca6d8 fix reward trainer calls for tokenization 2025-01-13 13:56:14 -05:00
Wing Lian
ec4dfb02c8 reward can use same batch check 2025-01-13 13:56:14 -05:00
Wing Lian
28ef5e8d5a tweak check for batched prompt data 2025-01-13 13:56:14 -05:00
Wing Lian
5ed2823855 ensure that batch vs single is done properly 2025-01-13 13:56:14 -05:00
Wing Lian
fb0775d264 improve iterable support 2025-01-13 13:56:12 -05:00
Wing Lian
7cd0a317cb support streaming for processing sft datasts? 2025-01-13 13:41:36 -05:00
Wing Lian
1cc3a2d16c make loss torch script compat 2025-01-13 13:41:36 -05:00
Wing Lian
287d2ca8d5 kd sample packing 2025-01-13 13:41:36 -05:00
Wing Lian
03b86df506 be a bit pickier about loading dynamic prompt strategies 2025-01-13 13:41:36 -05:00
Wing Lian
2ed4246949 more info on preprocess for kd and fix import 2025-01-13 13:41:35 -05:00
Wing Lian
35bc2e2d3f remove duplicate code 2025-01-13 13:41:35 -05:00
Wing Lian
94f1094805 add copyrights 2025-01-13 13:41:35 -05:00
Wing Lian
a0070bf94e increase logging around loading plugins 2025-01-13 13:41:35 -05:00
Wing Lian
2ee2ffd834 make plugin setup concise 2025-01-13 13:41:35 -05:00
Wing Lian
723b0a2dee remove moved class from import 2025-01-13 13:41:35 -05:00
Wing Lian
327739c9e3 move more things to kd plugin 2025-01-13 13:41:35 -05:00
Wing Lian
8aafe142f2 refactor kd chat template loader 2025-01-13 13:41:35 -05:00
Wing Lian
a0d6d8895e support for custom trainer classes from plugins 2025-01-13 13:41:34 -05:00
Wing Lian
55b33cc44d handle token/logprob shifting 2025-01-13 13:41:34 -05:00
Wing Lian
69ed25e82c remove references to triton kd for now 2025-01-13 13:41:34 -05:00
Wing Lian
2ea8b7e518 add license block 2025-01-13 13:41:34 -05:00
Wing Lian
aa081e0e76 refactor so we can easily add new loss functions 2025-01-13 13:41:34 -05:00
Wing Lian
3f97ec45fb chore: lint 2025-01-13 13:41:34 -05:00
Wing Lian
7b5a24b0d2 var naming and add todo 2025-01-13 13:41:34 -05:00
Wing Lian
4ddd089d0a fix kd loss so it's causal (fixes repeating tokens) 2025-01-13 13:41:34 -05:00
Wing Lian
b88128d067 use kd_alpha in the correct loss method 2025-01-13 13:41:32 -05:00
Wing Lian
2e6422a711 hash for temperature too 2025-01-13 13:40:19 -05:00
Wing Lian
6ad809287b better rescaling for temperatures 2025-01-13 13:40:19 -05:00
Wing Lian
e376e00386 don't use triton for now 2025-01-13 13:40:19 -05:00
Wing Lian
23d7ae6caa fix kwarg 2025-01-13 13:40:19 -05:00
Wing Lian
19638590d5 v3 2025-01-13 13:40:18 -05:00
Wing Lian
73f5b83431 no torch.tensor 2025-01-13 13:40:18 -05:00
Wing Lian
9b1164b841 no log etc 2025-01-13 13:40:18 -05:00
Wing Lian
5a7d6f6175 no torch.exp inside triton kernel 2025-01-13 13:40:18 -05:00
Wing Lian
a803c3d3ee v2 trial 2025-01-13 13:40:18 -05:00
Wing Lian
48ccf55752 no where support 2025-01-13 13:40:18 -05:00
Wing Lian
bc3326a808 triton wip 2025-01-13 13:40:18 -05:00
Wing Lian
cf8174db75 chore: lint 2025-01-13 13:40:18 -05:00
Wing Lian
222dc27410 make sure to multiply against the correct loss 2025-01-13 13:40:18 -05:00
Wing Lian
1107f1f603 cross entropy loss coefficient during KD 2025-01-13 13:40:18 -05:00
Wing Lian
1c603da96a flipped the slice 2025-01-13 13:40:17 -05:00
Wing Lian
283faf3909 make it work 2025-01-13 13:40:17 -05:00
Wing Lian
472f7048e5 handle padding/collation for KD datasets 2025-01-13 13:40:17 -05:00
Wing Lian
3d1e2dcef4 make batch smaller 2025-01-13 13:40:17 -05:00
Wing Lian
9e218fbcfd filter bad rows 2025-01-13 13:40:17 -05:00
Wing Lian
11caf52529 KD dataset loading and KD with logprobs 2025-01-13 13:40:17 -05:00
Wing Lian
17ba9dcfdb refactor trainer to prevent circular dependencies later
fix loader default
2025-01-13 13:40:17 -05:00
Dan Saunders
1ed4de73b6 CLI cleanup and documentation (#2244)
* CLI init refactor

* fix

* cleanup and (partial) docs

* Adding documentation and continuing cleanup (in progress)

* remove finetune.py script

* continued cleanup and documentation

* pytest fixes

* review comments

* fix

* Fix

* typing fixes

* make sure the batch dataset patcher for multipack is always loaded when handling datasets

* review comments

* fix

---------

Co-authored-by: Dan Saunders <dan@axolotl.ai>
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-01-13 17:55:29 +00:00
Wing Lian
f89e962119 skip over rows in pretraining dataset (#2223)
* skip over rows in pretraining dataset

* update docs
2025-01-13 10:44:45 -05:00
Wing Lian
bc1c9c20e3 assume empty lora dropout means 0.0 and add tests (#2243)
* assume empty lora dropout means 0.0 and add tests

* remove un-necessary arg

* refactor based on pr feedback:

* chore: lint
2025-01-13 10:44:11 -05:00