axolotl

Author	SHA1	Message	Date
Akshay Jain	dd7d16d2eb	Update FAQS.md Updated FAQS.md with backticks around error message	2023-06-10 19:15:50 -07:00
NanoCode012	e285e24f7f	Address PR suggestion Co-authored-by: Wing Lian <wing.lian@gmail.com>	2023-06-11 10:52:12 +09:00
NanoCode012	919727b4d7	Refactor landmark attention patch	2023-06-11 10:51:05 +09:00
Akshay Jain	5ffefee37f	Update FAQS.md Update FAQS.md with the following statement Error invalid argument at line 359 in file /workspace/bitsandbytes/csrc/pythonInterface.c /arrow/cpp/src/arrow/filesystem/s3fs.cc:2598: arrow::fs::FinalizeS3 was not called even though S3 was initialized. This could lead to a segmentation fault at exit try reinstalling bitsandbytes and transformers from source	2023-06-10 18:34:54 -07:00
Wing Lian	d9f713e4e3	Merge pull request #183 from OpenAccess-AI-Collective/inference-from-stdin pass a prompt in from stdin for inference	2023-06-10 17:06:55 -04:00
Wing Lian	958da70376	fix formatting	2023-06-10 15:28:08 -04:00
Wing Lian	c4e4f8115c	pass a prompt in from stdin for inference	2023-06-10 15:07:40 -04:00
Angainor Development	a808bf913f	Fix missing cfg.	2023-06-10 20:28:49 +02:00
Wing Lian	01248253a3	Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref fix for local variable 'LlamaForCausalLM' referenced before assignment	2023-06-10 14:25:51 -04:00
Wing Lian	759e8673ce	Update scripts/finetune.py Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>	2023-06-10 14:25:21 -04:00
Wing Lian	0c6f928601	address PR feedback	2023-06-10 14:23:56 -04:00
Wing Lian	eea2731a5e	add streaming dataset support for pretraining datasets	2023-06-10 14:23:56 -04:00
Wing Lian	1db46a9c72	linting fix	2023-06-10 14:23:56 -04:00
Wing Lian	ab5cd28acf	more gpt-neox long ctx fixes	2023-06-10 14:23:55 -04:00
Wing Lian	1a82082e91	fix bettertransformers save, force it to skip after saving correctly in callback	2023-06-10 14:23:55 -04:00
Wing Lian	1210dc8fd5	more tweaks to do pre-training with bettertransformers	2023-06-10 14:23:55 -04:00
Wing Lian	488a67d75a	experimental expansion of ctx len	2023-06-10 14:23:53 -04:00
Wing Lian	71a43f8479	add validation/warning for bettertransformers and torch version	2023-06-10 14:22:31 -04:00
Wing Lian	39619028a3	use pythia-12b, neox-20b is flaky	2023-06-10 14:22:30 -04:00
Wing Lian	8792199799	add flash attn context for efficient training and attempt setting model to train mode:	2023-06-10 14:22:30 -04:00
Wing Lian	1edc30c786	add support for opimum bettertransformers	2023-06-10 14:22:30 -04:00
Wing Lian	14163c15d9	fix for local variable 'LlamaForCausalLM' referenced before assignment	2023-06-10 14:11:13 -04:00
Wing Lian	41e4f6ca31	Merge pull request #181 from OpenAccess-AI-Collective/xpos-rope add support to extend context with xpos rope	2023-06-10 14:04:03 -04:00
Angainor Development	79e2a6f140	Merge branch 'main' into patch-1	2023-06-10 19:07:54 +02:00
Angainor Development	c2508987a6	Remove explicit definition of cfg.inference	2023-06-10 19:06:10 +02:00
Wing Lian	215d775147	Merge pull request #180 from Glavin001/feat/stream-inference Add streaming inference & fix stopping at EOS	2023-06-10 12:04:34 -04:00
Wing Lian	f36e227eaf	formatting for linter	2023-06-10 12:00:52 -04:00
Wing Lian	5878bb1f3a	add option to readme	2023-06-10 11:57:41 -04:00
Wing Lian	a03a7d7d8b	add support to extend context with xpos rope	2023-06-10 10:29:46 -04:00
Glavin Wiechert	fec6bcc3e6	Add streaming inference & fix stopping at EOS	2023-06-10 08:14:47 +00:00
Wing Lian	931e606459	Merge pull request #179 from OpenAccess-AI-Collective/fix-max_seq_len fix for max sequence len across different model types	2023-06-09 20:52:03 -04:00
Wing Lian	7f09106437	fix for max sequence len across different model types	2023-06-09 20:42:33 -04:00
NanoCode012	6b50200234	Merge pull request #178 from PocketDocLabs/main Update README.md to reflect current gradient checkpointing support	2023-06-10 08:26:48 +09:00
PocketDocLabs	16f9e28048	Update README.md to reflect current gradient checkpointing support Previously the readme stated gradient checkpointing was incompatible with 4-bit lora in the current implementation however this is no longer the case. I have replaced the warning with a link to the hugging face documentation on gradient checkpointing.	2023-06-09 16:10:58 -07:00
NanoCode012	b9083a7fc1	Merge pull request #176 from NanoCode012/fix/peft-import Fix backward compat for peft	2023-06-10 07:56:35 +09:00
NanoCode012	aefb2fc681	Fix backward compat for peft	2023-06-10 07:46:36 +09:00
NanoCode012	b5aa8d854c	Merge pull request #169 from NanoCode012/feat/landmark Feat: Add landmark attention	2023-06-10 07:26:06 +09:00
NanoCode012	4d6490bce2	Merge pull request #171 from OpenAccess-AI-Collective/NanoCode012-falcon-lora-matrix Fix falcon support lora	2023-06-09 17:58:22 +09:00
NanoCode012	b242b69e10	Fix falcon support lora	2023-06-09 17:50:16 +09:00
NanoCode012	320beb20f4	Merge pull request #170 from OpenAccess-AI-Collective/NanoCode012-lambdalabs-fix Feat: Improve lambda labs instruction	2023-06-09 16:52:27 +09:00
Angainor Development	bd3b537344	Feed cfg.inference	2023-06-09 08:59:05 +02:00
Angainor Development	813cfa4c14	WIP: Rely on cfg.inference	2023-06-09 08:49:32 +02:00
NanoCode012	2e13ceff37	Improve lambda labs instruction	2023-06-09 15:03:08 +09:00
NanoCode012	2a801b001a	Fix grad checkpoint and outputs param	2023-06-09 14:28:44 +09:00
NanoCode012	e44c9e0b3e	Fix patching via import instead of hijacking	2023-06-09 14:27:24 +09:00
NanoCode012	55b8542de8	Feat: Add landmark attention	2023-06-09 12:54:08 +09:00
Wing Lian	febe902517	Merge pull request #168 from bratao/main Disable Wandb if no wandb project is specified	2023-06-08 22:05:56 -04:00
Bruno Cabral	f4df266842	Disable Wandb	2023-06-08 21:02:02 -03:00
NanoCode012	281dc3df59	Merge pull request #167 from NanoCode012/fix/redundant-save-eval-steps Fix: Refactor out unmodified save_steps and eval_steps	2023-06-09 01:39:33 +09:00
NanoCode012	2ef4634d45	Refactor out unmodified save_steps and eval_steps	2023-06-09 01:23:13 +09:00

1292 Commits