Commit Graph

  • 09bf1ceacc update hf deps (#1964) Wing Lian 2024-10-12 18:19:48 -04:00
  • d5e79dc69a Built site for gh-pages Quarto GHA Workflow Runner 2024-10-11 17:35:09 +00:00
  • 3d309ccac1 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-11 17:34:22 +00:00
  • df359c8a6e Handle image input as string paths for MMLMs (#1958) Afrizal Hasbi Azizy 2024-10-12 00:34:13 +07:00
  • f340895a54 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-11 17:33:58 +00:00
  • 26b5463fc0 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-11 17:33:40 +00:00
  • 76883851d2 add warning that sharegpt will be deprecated (#1957) Wing Lian 2024-10-11 13:33:20 -04:00
  • 922db77521 Add MLFlow run name option in config (#1961) Adam Hazell 2024-10-11 18:33:06 +01:00
  • e73b8dff8d Add Support for revision Dataset Parameter to specify reading from Huggingface Dataset Revision (#1912) Thomas Cleberg 2024-10-11 12:32:50 -05:00
  • d1f36d7b78 set download to use revision feature/enable-huggingface-dataset-revision Wing Lian 2024-09-14 23:10:57 -04:00
  • 87248027d0 use revision tied to head Wing Lian 2024-09-14 13:22:09 -04:00
  • d0d22b7812 only use revision on hf hub backed datasets Wing Lian 2024-09-14 13:03:36 -04:00
  • 68db5b1b67 Add support for revision dataset parameter Thomas Cleberg 2024-09-12 16:34:58 -05:00
  • ec57918fcd Merge pull request #7 from NanoCode012/cj_tokenizer_default_prompt_template Chirag Jain 2024-10-11 14:44:25 +05:30
  • dd87d8c438 feat: add test for levy's dpo case NanoCode012 2024-10-11 12:56:46 +07:00
  • ef942b6efc fix: rename var after merge NanoCode012 2024-10-11 12:30:43 +07:00
  • 3c6a6c61be Merge branch 'main' into cj_tokenizer_default_prompt_template NanoCode012 2024-10-11 12:29:34 +07:00
  • 7b4b665e99 chore: skip duplicate NanoCode012 2024-10-11 11:42:36 +07:00
  • 21326e4ef3 chore: lint NanoCode012 2024-10-11 11:40:42 +07:00
  • de23dab4fc fix: config being dropped and unittest to catch that NanoCode012 2024-10-11 11:40:32 +07:00
  • e3efa29cf5 fix: test NanoCode012 2024-10-11 11:11:19 +07:00
  • 3e4fdfa332 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-10 19:58:35 +00:00
  • 2fbc6b0c64 Axo logo new (#1956) Wing Lian 2024-10-10 15:57:37 -04:00
  • a3445eb241 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-10 19:05:29 +00:00
  • 8159cbd1ab lm_eval harness post train (#1926) Wing Lian 2024-10-10 15:04:17 -04:00
  • 2038255052 Merge branch 'main' into cj_tokenizer_default_prompt_template NanoCode012 2024-10-10 20:25:37 +07:00
  • b415c720be Built site for gh-pages Quarto GHA Workflow Runner 2024-10-10 13:23:45 +00:00
  • 979534c851 add mistral templates (#1927) pandora 2024-10-10 15:22:53 +02:00
  • dab2590e4d chore: refactor NanoCode012 2024-10-10 18:07:00 +07:00
  • e5162b7a41 chore: added example for non-default template NanoCode012 2024-10-10 18:04:33 +07:00
  • b6321d2220 chore: clarify doc NanoCode012 2024-10-10 18:01:33 +07:00
  • 6b3cdfdb8e feat(doc): updated config with chat template options and clarified examples NanoCode012 2024-10-10 17:57:11 +07:00
  • 203ae28704 fix: refactor artifact left from main merge NanoCode012 2024-10-10 17:16:41 +07:00
  • ed3a33c9fb fix: re-arrange enum declaration position NanoCode012 2024-10-10 16:18:15 +07:00
  • f61e2fc7dc chore: remove redundant function NanoCode012 2024-10-10 16:15:15 +07:00
  • b8056d04d9 Merge branch 'main' into cj_tokenizer_default_prompt_template NanoCode012 2024-10-10 16:11:07 +07:00
  • 88658c0570 fix: set default to tokenizer template NanoCode012 2024-10-10 15:38:19 +07:00
  • 26593674bd Built site for gh-pages Quarto GHA Workflow Runner 2024-10-09 20:04:31 +00:00
  • 6d3caadf90 Comet integration (#1939) Boris Feld 2024-10-09 22:03:37 +02:00
  • dee77232fe fix type annotations (#1941) [skip ci] aarush gupta 2024-10-09 13:03:16 -07:00
  • a560593b1d fix(log): update perplexity log to clarify from eval split (#1952) [skip ci] NanoCode012 2024-10-10 03:02:32 +07:00
  • fb2cb0a714 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-09 15:54:54 +00:00
  • e8d3da0081 upgrade pytorch from 2.4.0 => 2.4.1 (#1950) Wing Lian 2024-10-09 11:53:56 -04:00
  • b344477ae3 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-09 12:44:32 +00:00
  • 4ca0a47cfb add 2.4.1 to base models (#1953) Wing Lian 2024-10-09 08:43:11 -04:00
  • cdd8be7097 wip on multimodal packing support mm3 sunny 2024-10-04 15:08:36 -04:00
  • 08143c7b0d wip on multimodal sample packing support sunny 2024-10-04 14:59:35 -04:00
  • a02af506ed pixtral example mm2 sunny 2024-10-03 16:11:15 -04:00
  • 431a0b0f9d added pixtral example sunny 2024-10-03 16:01:21 -04:00
  • ad40f7bb5b Built site for gh-pages Quarto GHA Workflow Runner 2024-10-03 01:03:42 +00:00
  • e1915f5625 Multimodal Vision Llama - rudimentary support (#1940) Wing Lian 2024-10-02 21:02:48 -04:00
  • 517ed33ac6 Built site for gh-pages Quarto GHA Workflow Runner 2024-09-30 17:57:20 +00:00
  • 844331005c bump transformers to 4.45.1 (#1936) Wing Lian 2024-09-30 13:56:12 -04:00
  • e0043ff2fd Built site for gh-pages Quarto GHA Workflow Runner 2024-09-27 19:59:30 +00:00
  • 61aa291119 fix for empty lora+ lr embedding (#1932) Wing Lian 2024-09-27 15:58:35 -04:00
  • 5ec7a89db1 Built site for gh-pages Quarto GHA Workflow Runner 2024-09-26 15:34:33 +00:00
  • b98d7d7098 update upstream deps versions and replace lora+ (#1928) Wing Lian 2024-09-26 11:33:41 -04:00
  • d18a859143 Built site for gh-pages Quarto GHA Workflow Runner 2024-09-24 18:06:55 +00:00
  • d7eea2ff34 validation fixes 20240923 (#1925) Wing Lian 2024-09-24 14:05:58 -04:00
  • 17330c05a3 shampoo checkpoint save workaround shampoo Wing Lian 2024-09-23 15:21:00 -04:00
  • 992ea517b7 setup precision config for bf16 Wing Lian 2024-09-18 12:01:38 -07:00
  • beaee36191 ddp shampoo Wing Lian 2024-09-18 10:50:46 -07:00
  • 69a29382e1 fix casting of optim args Wing Lian 2024-09-18 10:48:15 -07:00
  • 84dad0bd12 ensure epsilon is cast to float Wing Lian 2024-09-18 10:42:26 -07:00
  • 05f61a0ea5 remove accidental duplidcated line Wing Lian 2024-09-18 10:41:03 -07:00
  • 5334d0fc01 fixes Wing Lian 2024-09-18 10:38:59 -07:00
  • 52e6249d2e additional grafting config types and basic example doc Wing Lian 2024-09-18 08:16:11 -07:00
  • eb3eab3450 wip shampoo optim support Wing Lian 2024-09-18 08:10:52 -07:00
  • cd32e6359c Built site for gh-pages Quarto GHA Workflow Runner 2024-09-14 12:23:48 +00:00
  • 7b9f669a3a Trigger the original tokenization behavior when no advanced turn settings are provided (#1915) Keith Stevens 2024-09-14 05:22:54 -07:00
  • 1dace7be56 Built site for gh-pages Quarto GHA Workflow Runner 2024-09-14 02:20:50 +00:00
  • 5c42f11411 remove dynamic module loader monkeypatch as this was fixed upstream (#1914) Wing Lian 2024-09-13 22:19:54 -04:00
  • 260ca97f2c Merge branch 'main' into cj_tokenizer_default_prompt_template Chirag Jain 2024-09-13 00:33:49 +05:30
  • 8f47451e37 Built site for gh-pages Quarto GHA Workflow Runner 2024-09-07 18:40:28 +00:00
  • 3853ab7ae9 bump accelerate to 0.34.2 (#1901) Wing Lian 2024-09-07 14:39:31 -04:00
  • 3363e53b7d Built site for gh-pages Quarto GHA Workflow Runner 2024-09-05 14:59:47 +00:00
  • 6e354682e3 fix zero3 integration (#1897) Wing Lian 2024-09-05 10:58:50 -04:00
  • 097ec6570f Built site for gh-pages Quarto GHA Workflow Runner 2024-09-05 14:12:25 +00:00
  • ab461d83c4 Fix documentation for pre-tokenized dataset (#1894) Alpay Ariyak 2024-09-05 07:11:31 -07:00
  • 253e9163db Built site for gh-pages Quarto GHA Workflow Runner 2024-09-05 13:59:14 +00:00
  • 93b769a979 lint fix and update gha regex (#1899) Wing Lian 2024-09-05 09:58:21 -04:00
  • 1738b1b50d Built site for gh-pages Quarto GHA Workflow Runner 2024-09-05 09:34:09 +00:00
  • f18f4268b5 Docs for AMD-based HPC systems (#1891) Tijmen de Haan 2024-09-05 18:33:19 +09:00
  • 6d0ae05d1c Built site for gh-pages Quarto GHA Workflow Runner 2024-09-04 15:29:44 +00:00
  • dca1fe47d4 fix optimizer + fsdp combination in example (#1893) Wing Lian 2024-09-04 11:28:47 -04:00
  • 144dd3a6f8 Built site for gh-pages Quarto GHA Workflow Runner 2024-09-04 00:03:38 +00:00
  • 4e5400c732 support for auto_find_batch_size when packing (#1885) Wing Lian 2024-09-03 20:02:44 -04:00
  • 3ade0b81db add example yaml device-mesh Wing Lian 2024-09-01 21:20:48 -04:00
  • fb611bb0ea Built site for gh-pages Quarto GHA Workflow Runner 2024-09-01 23:30:29 +00:00
  • 0aeb277456 add e2e smoke tests for llama liger integration (#1884) Wing Lian 2024-09-01 19:29:37 -04:00
  • 81756037b0 Built site for gh-pages Quarto GHA Workflow Runner 2024-09-01 22:35:13 +00:00
  • bdab3ec587 Fix RMSNorm monkey patch for Gemma models (#1886) Chiwan Park 2024-09-02 07:34:24 +09:00
  • 7d38e17120 Built site for gh-pages Quarto GHA Workflow Runner 2024-09-01 02:50:32 +00:00
  • 3c6b9eda2e run pytests with varied pytorch versions too (#1883) Wing Lian 2024-08-31 22:49:35 -04:00
  • 1ad5c229ae Built site for gh-pages Quarto GHA Workflow Runner 2024-09-01 02:00:43 +00:00
  • 15408d0f09 Update supported models for Liger Kernel (#1875) DocShotgun 2024-08-31 18:59:48 -07:00
  • ce33e1ed83 pin liger-kernel to latest 0.2.1 (#1882) [skip ci] Wing Lian 2024-08-30 17:51:18 -04:00
  • e3a38450de Add liger kernel to features (#1881) [skip ci] Byron Hsu 2024-08-29 05:19:18 -07:00
  • b1bb2accb9 Merge branch 'main' into cj_tokenizer_default_prompt_template Chirag Jain 2024-08-28 13:34:20 +05:30
  • e4515980c9 Built site for gh-pages Quarto GHA Workflow Runner 2024-08-28 03:53:32 +00:00