Default Branch

c6da9b9e92 · Update SETUP_MIAAI.md: add bare Ubuntu rebuild section (driver, packages, Ollama) · Updated 2026-05-13 21:33:02 +00:00

Branches

Each entry lists the branch head commit, its message, the last update time, the author, and how many commits the branch is behind / ahead of the default branch. "Included" marks branches whose commits are already merged into the default branch.

105c65390e · add q-galore optimizer · Updated 2024-07-14 23:28:13 +00:00 · tocmo0nlord · 1236 behind / 1 ahead

98af5388ba · bump flash attention 2.5.8 -> 2.6.1 (#1738) · Updated 2024-07-14 23:11:31 +00:00 · tocmo0nlord · 1237 behind / 0 ahead · Included

2680421081 · bump deepspeed to latest 0.14.4 · Updated 2024-07-13 18:36:18 +00:00 · tocmo0nlord · 1238 behind / 1 ahead

469e15607d · basic llama multipack · Updated 2024-06-20 18:39:55 +00:00 · tocmo0nlord · 1253 behind / 1 ahead

d7ec10e337 · add support for MoRA · Updated 2024-06-01 20:14:56 +00:00 · tocmo0nlord · 1268 behind / 1 ahead

e9a1f288cf · support for custom trainer_cls from config · Updated 2024-05-14 22:57:53 +00:00 · tocmo0nlord · 1306 behind / 1 ahead

7c5aa4791f · drop position_ids for olmo model · Updated 2024-05-09 04:25:15 +00:00 · tocmo0nlord · 1314 behind / 1 ahead

317761406e · add support for NCA · Updated 2024-05-06 21:01:14 +00:00 · tocmo0nlord · 1318 behind / 9 ahead

6a9ac4ad27 · consistency w sppo -> sppo_hard · Updated 2024-05-06 20:58:58 +00:00 · tocmo0nlord · 1318 behind / 8 ahead

7a7c56f018 · fixes to support fsdp-qdora · Updated 2024-04-23 12:37:04 +00:00 · tocmo0nlord · 1330 behind / 1 ahead

3ce9b0760b · fix the lora yaml for l3 · Updated 2024-04-19 11:28:07 +00:00 · tocmo0nlord · 1336 behind / 1 ahead

4c92b51cd5 · fix the torch dtype check · Updated 2024-04-11 12:56:46 +00:00 · tocmo0nlord · 1346 behind / 2 ahead

3202f19f52 · add save_only_model arg · Updated 2024-04-10 20:09:08 +00:00 · tocmo0nlord · 1346 behind / 1 ahead

f8bb4185bc · skip s2 attention test due to timeout · Updated 2024-04-08 22:33:33 +00:00 · tocmo0nlord · 1353 behind / 1 ahead

744f7082f5 · fix for fsdp for models that aren't qwen2 or jamba · Updated 2024-04-06 00:02:54 +00:00 · tocmo0nlord · 1357 behind / 1 ahead

05f7034288 · use deterministic seed for random LISA layers · Updated 2024-04-05 01:16:55 +00:00 · tocmo0nlord · 1360 behind / 1 ahead

dfe591435f · make lisa training example work on one 24gb gpu · Updated 2024-04-02 03:19:54 +00:00 · tocmo0nlord · 1370 behind / 6 ahead

ff939d8a64 · fix(dataset): normalize tokenizer config and change hash from tokenizer class to tokenizer path (#1298) · Updated 2024-03-25 06:34:54 +00:00 · tocmo0nlord · 1386 behind / 0 ahead · Included

e6b78c1fca · override the entire create_optimzier method · Updated 2024-03-20 03:19:56 +00:00 · tocmo0nlord · 1395 behind / 2 ahead

10328b3429 · Simplify creating parameters · Updated 2024-03-18 12:32:59 +00:00 · tocmo0nlord · 1404 behind / 11 ahead