Wing Lian | f11227a35a | various fixes | 2025-01-30 10:39:18 -05:00
Wing Lian | c434951dd6 | Always re-normalize teacher distribution | 2025-01-29 08:36:40 -05:00
Wing Lian | 42d4732aaf | kd loss needs to be calculated in full precision | 2025-01-28 19:40:35 -05:00
Wing Lian | 2c9dfbed2e | apply z-score scaling to kd | 2025-01-27 14:27:35 -05:00
Wing Lian | 4e4a16cd8a | fix finding the top-k rather than assuming first position has the correct val | 2025-01-21 13:09:20 -05:00
Wing Lian | 67c1c8405e | use iter instead of tuple | 2025-01-21 11:23:38 -05:00
Wing Lian | bded6df509 | change up logic so we always truncate to top_k | 2025-01-21 11:20:01 -05:00
Wing Lian | bb5e6f4b72 | make sure to truncate logprobs if there are more than top_k | 2025-01-21 10:26:27 -05:00
Wing Lian | 32258c247e | no batching for kd chat templates | 2025-01-15 08:22:29 -05:00
Wing Lian | 04efcb102f | don't shift student logits for kd | 2025-01-15 01:07:48 -05:00
Wing Lian | 35a84f2cb8 | more fixes | 2025-01-14 22:47:49 -05:00
Wing Lian | 510cf45317 | improve logprob masking and shift in trainer | 2025-01-14 22:47:48 -05:00
Wing Lian | 7232cbdeab | chore: lint | 2025-01-14 22:47:48 -05:00
Wing Lian | e8fceb7091 | chore: lint | 2025-01-14 22:47:48 -05:00
Wing Lian | 317f290186 | reward model doesn't work well with batched | 2025-01-14 22:47:46 -05:00
Wing Lian | ab690f3f01 | improve check for batched | 2025-01-14 22:47:46 -05:00
Wing Lian | 47932f21c4 | fix reward trainer calls for tokenization | 2025-01-14 22:47:46 -05:00
Wing Lian | 808328e041 | reward can use same batch check | 2025-01-14 22:47:46 -05:00
Wing Lian | 6784822cfb | tweak check for batched prompt data | 2025-01-14 22:47:46 -05:00
Wing Lian | 684b38291f | ensure that batch vs single is done properly | 2025-01-14 22:47:46 -05:00
Wing Lian | 01896b1bde | improve iterable support | 2025-01-14 22:47:46 -05:00
Wing Lian | e659c01646 | support streaming for processing sft datasets? | 2025-01-14 22:47:45 -05:00
Wing Lian | 204d6c43b4 | make loss torch script compat | 2025-01-14 22:47:45 -05:00
Wing Lian | d3c2b7ce9d | kd sample packing | 2025-01-14 22:47:45 -05:00
Wing Lian | 93dfff92f1 | be a bit pickier about loading dynamic prompt strategies | 2025-01-14 22:47:45 -05:00
Wing Lian | 6e409d2d88 | more info on preprocess for kd and fix import | 2025-01-14 22:47:45 -05:00
Wing Lian | d5bc214300 | remove duplicate code | 2025-01-14 22:47:45 -05:00
Wing Lian | 92c6c1087e | add copyrights | 2025-01-14 22:47:45 -05:00
Wing Lian | feed96f95e | increase logging around loading plugins | 2025-01-14 22:47:44 -05:00
Wing Lian | cba6165ae1 | make plugin setup concise | 2025-01-14 22:47:44 -05:00
Wing Lian | cdfcd69afa | remove moved class from import | 2025-01-14 22:47:44 -05:00
Wing Lian | 885653d52e | move more things to kd plugin | 2025-01-14 22:47:44 -05:00
Wing Lian | 27faacbf5a | refactor kd chat template loader | 2025-01-14 22:47:44 -05:00
Wing Lian | c51b0337c1 | support for custom trainer classes from plugins | 2025-01-14 22:47:44 -05:00
Wing Lian | fa055f9f69 | handle token/logprob shifting | 2025-01-14 22:47:43 -05:00
Wing Lian | f60c623af0 | remove references to triton kd for now | 2025-01-14 22:47:43 -05:00
Wing Lian | 746891eb5c | add license block | 2025-01-14 22:47:43 -05:00
Wing Lian | f09b5da60b | refactor so we can easily add new loss functions | 2025-01-14 22:47:43 -05:00
Wing Lian | 689e1c10ba | chore: lint | 2025-01-14 22:47:43 -05:00
Wing Lian | a5c085e003 | var naming and add todo | 2025-01-14 22:47:43 -05:00
Wing Lian | 63146300b7 | fix kd loss so it's causal (fixes repeating tokens) | 2025-01-14 22:47:43 -05:00
Wing Lian | ca5e397fc5 | use kd_alpha in the correct loss method | 2025-01-14 22:47:42 -05:00
Wing Lian | 3416302b0d | hash for temperature too | 2025-01-14 22:47:42 -05:00
Wing Lian | 7366efc4ca | better rescaling for temperatures | 2025-01-14 22:47:42 -05:00
Wing Lian | d8d817eaed | don't use triton for now | 2025-01-14 22:47:42 -05:00
Wing Lian | c0757e8a20 | fix kwarg | 2025-01-14 22:47:42 -05:00
Wing Lian | e565694914 | v3 | 2025-01-14 22:47:42 -05:00
Wing Lian | 081928e55b | no torch.tensor | 2025-01-14 22:47:42 -05:00
Wing Lian | dc90c93894 | no log etc | 2025-01-14 22:47:41 -05:00
Wing Lian | 18a46c338a | no torch.exp inside triton kernel | 2025-01-14 22:47:41 -05:00