Wing Lian
|
2c9dfbed2e
|
apply z-score scaling to kd
|
2025-01-27 14:27:35 -05:00 |
|
Wing Lian
|
4e4a16cd8a
|
fix finding the top-k rather than assuming first position has the correct val
|
2025-01-21 13:09:20 -05:00 |
|
Wing Lian
|
67c1c8405e
|
use iter instead of tuple
|
2025-01-21 11:23:38 -05:00 |
|
Wing Lian
|
bded6df509
|
change up logic so we always truncate to top_k
|
2025-01-21 11:20:01 -05:00 |
|
Wing Lian
|
bb5e6f4b72
|
make sure to truncate logprobs if there are more than top_k
|
2025-01-21 10:26:27 -05:00 |
|
Wing Lian
|
32258c247e
|
no batching for kd chat templates
|
2025-01-15 08:22:29 -05:00 |
|
Wing Lian
|
04efcb102f
|
don't shift student logits for kd
|
2025-01-15 01:07:48 -05:00 |
|
Wing Lian
|
483defb9ae
|
try tests for kd on l40s
|
2025-01-14 23:56:00 -05:00 |
|
Wing Lian
|
35a84f2cb8
|
more fixes
|
2025-01-14 22:47:49 -05:00 |
|
Wing Lian
|
510cf45317
|
improve logprob masking and shift in trainer
|
2025-01-14 22:47:48 -05:00 |
|
Wing Lian
|
7232cbdeab
|
chore: lint
|
2025-01-14 22:47:48 -05:00 |
|
Wing Lian
|
e8fceb7091
|
chore: lint
|
2025-01-14 22:47:48 -05:00 |
|
Wing Lian
|
a5e0671738
|
make sure to use tensorboard to capture loss for checks
|
2025-01-14 22:47:48 -05:00 |
|
Wing Lian
|
b9847553af
|
fix adapter model check
|
2025-01-14 22:47:48 -05:00 |
|
Wing Lian
|
513ec9e03b
|
make sure to use the correct tokenizer
|
2025-01-14 22:47:48 -05:00 |
|
Wing Lian
|
530347856d
|
make sure to set tokenizer from l3 70b and save safetensors
|
2025-01-14 22:47:47 -05:00 |
|
Wing Lian
|
261e4fb619
|
lower lr
|
2025-01-14 22:47:47 -05:00 |
|
Wing Lian
|
158071e95f
|
set lora_dropout explicitly
|
2025-01-14 22:47:47 -05:00 |
|
Wing Lian
|
432f65f5e6
|
make the kd e2e fit in vram for ci and add lora version
|
2025-01-14 22:47:47 -05:00 |
|
Wing Lian
|
1d039f5486
|
rename test files so it gets picked up
|
2025-01-14 22:47:47 -05:00 |
|
Wing Lian
|
b9a42b396f
|
linting
|
2025-01-14 22:47:47 -05:00 |
|
Wing Lian
|
ff2fb0fc1b
|
add kd trainer e2e test
|
2025-01-14 22:47:47 -05:00 |
|
Wing Lian
|
317f290186
|
reward model doesn't work well with batched
|
2025-01-14 22:47:46 -05:00 |
|
Wing Lian
|
ab690f3f01
|
improve check for batched
|
2025-01-14 22:47:46 -05:00 |
|
Wing Lian
|
47932f21c4
|
fix reward trainer calls for tokenization
|
2025-01-14 22:47:46 -05:00 |
|
Wing Lian
|
808328e041
|
reward can use same batch check
|
2025-01-14 22:47:46 -05:00 |
|
Wing Lian
|
6784822cfb
|
tweak check for batched prompt data
|
2025-01-14 22:47:46 -05:00 |
|
Wing Lian
|
684b38291f
|
ensure that batch vs single is done properly
|
2025-01-14 22:47:46 -05:00 |
|
Wing Lian
|
01896b1bde
|
improve iterable support
|
2025-01-14 22:47:46 -05:00 |
|
Wing Lian
|
e659c01646
|
support streaming for processing sft datasts?
|
2025-01-14 22:47:45 -05:00 |
|
Wing Lian
|
204d6c43b4
|
make loss torch script compat
|
2025-01-14 22:47:45 -05:00 |
|
Wing Lian
|
d3c2b7ce9d
|
kd sample packing
|
2025-01-14 22:47:45 -05:00 |
|
Wing Lian
|
93dfff92f1
|
be a bit pickier about loading dynamic prompt strategies
|
2025-01-14 22:47:45 -05:00 |
|
Wing Lian
|
6e409d2d88
|
more info on preprocess for kd and fix import
|
2025-01-14 22:47:45 -05:00 |
|
Wing Lian
|
d5bc214300
|
remove duplicate code
|
2025-01-14 22:47:45 -05:00 |
|
Wing Lian
|
92c6c1087e
|
add copyrights
|
2025-01-14 22:47:45 -05:00 |
|
Wing Lian
|
feed96f95e
|
increase logging around loading plugins
|
2025-01-14 22:47:44 -05:00 |
|
Wing Lian
|
cba6165ae1
|
make plugin setup concise
|
2025-01-14 22:47:44 -05:00 |
|
Wing Lian
|
cdfcd69afa
|
remove moved class from import
|
2025-01-14 22:47:44 -05:00 |
|
Wing Lian
|
885653d52e
|
move more things to kd plugin
|
2025-01-14 22:47:44 -05:00 |
|
Wing Lian
|
27faacbf5a
|
refactor kd chat template loader
|
2025-01-14 22:47:44 -05:00 |
|
Wing Lian
|
c51b0337c1
|
support for custom trainer classes from plugins
|
2025-01-14 22:47:44 -05:00 |
|
Wing Lian
|
fa055f9f69
|
handle token/logprob shifting
|
2025-01-14 22:47:43 -05:00 |
|
Wing Lian
|
f60c623af0
|
remove references to triton kd for now
|
2025-01-14 22:47:43 -05:00 |
|
Wing Lian
|
746891eb5c
|
add license block
|
2025-01-14 22:47:43 -05:00 |
|
Wing Lian
|
f09b5da60b
|
refactor so we can easily add new loss functions
|
2025-01-14 22:47:43 -05:00 |
|
Wing Lian
|
689e1c10ba
|
chore: lint
|
2025-01-14 22:47:43 -05:00 |
|
Wing Lian
|
a5c085e003
|
var naming and add todo
|
2025-01-14 22:47:43 -05:00 |
|
Wing Lian
|
63146300b7
|
fix kd loss so it's causal (fixes repeating tokens)
|
2025-01-14 22:47:43 -05:00 |
|
Wing Lian
|
ca5e397fc5
|
use kd_alpha in the correct loss method
|
2025-01-14 22:47:42 -05:00 |
|