Built site for gh-pages
This commit is contained in:
10
search.json
10
search.json
@@ -698,14 +698,14 @@
|
||||
"href": "docs/api/prompt_strategies.chat_template.html",
|
||||
"title": "prompt_strategies.chat_template",
|
||||
"section": "",
|
||||
"text": "prompt_strategies.chat_template\nHF Chat Templates prompt strategy\n\n\n\n\n\nName\nDescription\n\n\n\n\nChatTemplatePrompter\nPrompter for HF chat templates\n\n\nChatTemplateStrategy\nTokenizing strategy for instruction-based prompts.\n\n\nMistralPrompter\nMistral prompter for chat template.\n\n\nMistralStrategy\nMistral strategy for chat template.\n\n\nStrategyLoader\nLoad chat template strategy based on configuration.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplatePrompter(\n tokenizer,\n chat_template,\n processor=None,\n max_length=2048,\n message_property_mappings=None,\n message_field_training=None,\n message_field_training_detail=None,\n field_messages='messages',\n field_system='system',\n field_tools='tools',\n roles=None,\n chat_template_kwargs=None,\n drop_system_message=False,\n)\nPrompter for HF chat templates\n\n\n\n\n\nName\nDescription\n\n\n\n\nbuild_prompt\nBuild a prompt from a conversation.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplatePrompter.build_prompt(\n conversation,\n add_generation_prompt=False,\n images=None,\n tools=None,\n)\nBuild a prompt from a conversation.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nconversation\nlist[dict]\nA list of messages.\nrequired\n\n\nadd_generation_prompt\n\nWhether to add a generation prompt.\nFalse\n\n\nimages\n\nA list of images. (optional)\nNone\n\n\ntools\n\nA list of tools. (optional)\nNone\n\n\n\n\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy(\n prompter,\n tokenizer,\n train_on_inputs,\n sequence_len,\n roles_to_train=None,\n train_on_eos=None,\n train_on_eot=None,\n eot_tokens=None,\n split_thinking=False,\n)\nTokenizing strategy for instruction-based prompts.\n\n\n\n\n\nName\nDescription\n\n\n\n\nfind_first_eot_token\nFind the first EOT token in the input_ids starting from start_idx.\n\n\nfind_turn\nLocate the starting and ending indices of the specified turn in a conversation.\n\n\ntokenize_prompt\nPublic method that can handle either a single prompt or a batch of prompts.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.find_first_eot_token(\n input_ids,\n start_idx,\n)\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.find_turn(\n turns,\n turn_idx,\n tools=None,\n)\nLocate the starting and ending indices of the specified turn in a conversation.\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt(prompt)\nPublic method that can handle either a single prompt or a batch of prompts.\n\n\n\n\n\nprompt_strategies.chat_template.MistralPrompter(*args, **kwargs)\nMistral prompter for chat template.\n\n\n\nprompt_strategies.chat_template.MistralStrategy(\n prompter,\n tokenizer,\n train_on_inputs,\n sequence_len,\n roles_to_train=None,\n train_on_eos=None,\n train_on_eot=None,\n eot_tokens=None,\n split_thinking=False,\n)\nMistral strategy for chat template.\n\n\n\n\n\nName\nDescription\n\n\n\n\nsupports_multiprocessing\nWhether this tokenizing strategy supports multiprocessing.\n\n\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\nfind_first_eot_token\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\n\n\nprompt_strategies.chat_template.MistralStrategy.find_first_eot_token(\n input_ids,\n start_idx,\n)\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\n\n\nprompt_strategies.chat_template.StrategyLoader()\nLoad chat template strategy based on configuration."
|
||||
"text": "prompt_strategies.chat_template\nHF Chat Templates prompt strategy\n\n\n\n\n\nName\nDescription\n\n\n\n\nChatTemplatePrompter\nPrompter for HF chat templates\n\n\nChatTemplateStrategy\nTokenizing strategy for instruction-based prompts.\n\n\nMistralPrompter\nMistral prompter for chat template.\n\n\nMistralStrategy\nMistral strategy for chat template.\n\n\nStrategyLoader\nLoad chat template strategy based on configuration.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplatePrompter(\n tokenizer,\n chat_template,\n processor=None,\n max_length=2048,\n message_property_mappings=None,\n message_field_training=None,\n message_field_training_detail=None,\n field_messages='messages',\n field_system='system',\n field_tools='tools',\n roles=None,\n chat_template_kwargs=None,\n drop_system_message=False,\n)\nPrompter for HF chat templates\n\n\n\n\n\nName\nDescription\n\n\n\n\nbuild_prompt\nBuild a prompt from a conversation.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplatePrompter.build_prompt(\n conversation,\n add_generation_prompt=False,\n images=None,\n tools=None,\n)\nBuild a prompt from a conversation.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nconversation\nlist[dict]\nA list of messages.\nrequired\n\n\nadd_generation_prompt\n\nWhether to add a generation prompt.\nFalse\n\n\nimages\n\nA list of images. (optional)\nNone\n\n\ntools\n\nA list of tools. (optional)\nNone\n\n\n\n\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy(\n prompter,\n tokenizer,\n train_on_inputs,\n sequence_len,\n roles_to_train=None,\n train_on_eos=None,\n train_on_eot=None,\n eot_tokens=None,\n split_thinking=False,\n)\nTokenizing strategy for instruction-based prompts.\n\n\n\n\n\nName\nDescription\n\n\n\n\nfind_first_eot_token\nFind the first EOT token in the input_ids starting from start_idx.\n\n\nfind_turn\nLocate the starting and ending indices of the specified turn in a conversation.\n\n\ntokenize_prompt\nPublic method that can handle either a single prompt or a batch of prompts.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.find_first_eot_token(\n input_ids,\n start_idx,\n)\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.find_turn(\n turns,\n turn_idx,\n tools=None,\n)\nLocate the starting and ending indices of the specified turn in a conversation.\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt(prompt)\nPublic method that can handle either a single prompt or a batch of prompts.\n\n\n\n\n\nprompt_strategies.chat_template.MistralPrompter(*args, **kwargs)\nMistral prompter for chat template.\n\n\n\nprompt_strategies.chat_template.MistralStrategy(\n prompter,\n tokenizer,\n train_on_inputs,\n sequence_len,\n roles_to_train=None,\n train_on_eos=None,\n train_on_eot=None,\n eot_tokens=None,\n split_thinking=False,\n)\nMistral strategy for chat template.\n\n\n\n\n\nName\nDescription\n\n\n\n\nfind_first_eot_token\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\n\n\nprompt_strategies.chat_template.MistralStrategy.find_first_eot_token(\n input_ids,\n start_idx,\n)\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\n\n\nprompt_strategies.chat_template.StrategyLoader()\nLoad chat template strategy based on configuration."
|
||||
},
|
||||
{
|
||||
"objectID": "docs/api/prompt_strategies.chat_template.html#classes",
|
||||
"href": "docs/api/prompt_strategies.chat_template.html#classes",
|
||||
"title": "prompt_strategies.chat_template",
|
||||
"section": "",
|
||||
"text": "Name\nDescription\n\n\n\n\nChatTemplatePrompter\nPrompter for HF chat templates\n\n\nChatTemplateStrategy\nTokenizing strategy for instruction-based prompts.\n\n\nMistralPrompter\nMistral prompter for chat template.\n\n\nMistralStrategy\nMistral strategy for chat template.\n\n\nStrategyLoader\nLoad chat template strategy based on configuration.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplatePrompter(\n tokenizer,\n chat_template,\n processor=None,\n max_length=2048,\n message_property_mappings=None,\n message_field_training=None,\n message_field_training_detail=None,\n field_messages='messages',\n field_system='system',\n field_tools='tools',\n roles=None,\n chat_template_kwargs=None,\n drop_system_message=False,\n)\nPrompter for HF chat templates\n\n\n\n\n\nName\nDescription\n\n\n\n\nbuild_prompt\nBuild a prompt from a conversation.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplatePrompter.build_prompt(\n conversation,\n add_generation_prompt=False,\n images=None,\n tools=None,\n)\nBuild a prompt from a conversation.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nconversation\nlist[dict]\nA list of messages.\nrequired\n\n\nadd_generation_prompt\n\nWhether to add a generation prompt.\nFalse\n\n\nimages\n\nA list of images. (optional)\nNone\n\n\ntools\n\nA list of tools. (optional)\nNone\n\n\n\n\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy(\n prompter,\n tokenizer,\n train_on_inputs,\n sequence_len,\n roles_to_train=None,\n train_on_eos=None,\n train_on_eot=None,\n eot_tokens=None,\n split_thinking=False,\n)\nTokenizing strategy for instruction-based prompts.\n\n\n\n\n\nName\nDescription\n\n\n\n\nfind_first_eot_token\nFind the first EOT token in the input_ids starting from start_idx.\n\n\nfind_turn\nLocate the starting and ending indices of the specified turn in a conversation.\n\n\ntokenize_prompt\nPublic method that can handle either a single prompt or a batch of prompts.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.find_first_eot_token(\n input_ids,\n start_idx,\n)\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.find_turn(\n turns,\n turn_idx,\n tools=None,\n)\nLocate the starting and ending indices of the specified turn in a conversation.\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt(prompt)\nPublic method that can handle either a single prompt or a batch of prompts.\n\n\n\n\n\nprompt_strategies.chat_template.MistralPrompter(*args, **kwargs)\nMistral prompter for chat template.\n\n\n\nprompt_strategies.chat_template.MistralStrategy(\n prompter,\n tokenizer,\n train_on_inputs,\n sequence_len,\n roles_to_train=None,\n train_on_eos=None,\n train_on_eot=None,\n eot_tokens=None,\n split_thinking=False,\n)\nMistral strategy for chat template.\n\n\n\n\n\nName\nDescription\n\n\n\n\nsupports_multiprocessing\nWhether this tokenizing strategy supports multiprocessing.\n\n\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\nfind_first_eot_token\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\n\n\nprompt_strategies.chat_template.MistralStrategy.find_first_eot_token(\n input_ids,\n start_idx,\n)\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\n\n\nprompt_strategies.chat_template.StrategyLoader()\nLoad chat template strategy based on configuration."
|
||||
"text": "Name\nDescription\n\n\n\n\nChatTemplatePrompter\nPrompter for HF chat templates\n\n\nChatTemplateStrategy\nTokenizing strategy for instruction-based prompts.\n\n\nMistralPrompter\nMistral prompter for chat template.\n\n\nMistralStrategy\nMistral strategy for chat template.\n\n\nStrategyLoader\nLoad chat template strategy based on configuration.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplatePrompter(\n tokenizer,\n chat_template,\n processor=None,\n max_length=2048,\n message_property_mappings=None,\n message_field_training=None,\n message_field_training_detail=None,\n field_messages='messages',\n field_system='system',\n field_tools='tools',\n roles=None,\n chat_template_kwargs=None,\n drop_system_message=False,\n)\nPrompter for HF chat templates\n\n\n\n\n\nName\nDescription\n\n\n\n\nbuild_prompt\nBuild a prompt from a conversation.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplatePrompter.build_prompt(\n conversation,\n add_generation_prompt=False,\n images=None,\n tools=None,\n)\nBuild a prompt from a conversation.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nconversation\nlist[dict]\nA list of messages.\nrequired\n\n\nadd_generation_prompt\n\nWhether to add a generation prompt.\nFalse\n\n\nimages\n\nA list of images. (optional)\nNone\n\n\ntools\n\nA list of tools. (optional)\nNone\n\n\n\n\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy(\n prompter,\n tokenizer,\n train_on_inputs,\n sequence_len,\n roles_to_train=None,\n train_on_eos=None,\n train_on_eot=None,\n eot_tokens=None,\n split_thinking=False,\n)\nTokenizing strategy for instruction-based prompts.\n\n\n\n\n\nName\nDescription\n\n\n\n\nfind_first_eot_token\nFind the first EOT token in the input_ids starting from start_idx.\n\n\nfind_turn\nLocate the starting and ending indices of the specified turn in a conversation.\n\n\ntokenize_prompt\nPublic method that can handle either a single prompt or a batch of prompts.\n\n\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.find_first_eot_token(\n input_ids,\n start_idx,\n)\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.find_turn(\n turns,\n turn_idx,\n tools=None,\n)\nLocate the starting and ending indices of the specified turn in a conversation.\n\n\n\nprompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt(prompt)\nPublic method that can handle either a single prompt or a batch of prompts.\n\n\n\n\n\nprompt_strategies.chat_template.MistralPrompter(*args, **kwargs)\nMistral prompter for chat template.\n\n\n\nprompt_strategies.chat_template.MistralStrategy(\n prompter,\n tokenizer,\n train_on_inputs,\n sequence_len,\n roles_to_train=None,\n train_on_eos=None,\n train_on_eot=None,\n eot_tokens=None,\n split_thinking=False,\n)\nMistral strategy for chat template.\n\n\n\n\n\nName\nDescription\n\n\n\n\nfind_first_eot_token\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\n\n\nprompt_strategies.chat_template.MistralStrategy.find_first_eot_token(\n input_ids,\n start_idx,\n)\nFind the first EOT token in the input_ids starting from start_idx.\n\n\n\n\n\nprompt_strategies.chat_template.StrategyLoader()\nLoad chat template strategy based on configuration."
|
||||
},
|
||||
{
|
||||
"objectID": "docs/api/prompt_strategies.kto.user_defined.html",
|
||||
@@ -2959,14 +2959,14 @@
|
||||
"href": "docs/api/prompt_tokenizers.html",
|
||||
"title": "prompt_tokenizers",
|
||||
"section": "",
|
||||
"text": "prompt_tokenizers\nModule containing PromptTokenizingStrategy and Prompter classes\n\n\n\n\n\nName\nDescription\n\n\n\n\nAlpacaMultipleChoicePromptTokenizingStrategy\nTokenizing strategy for Alpaca Multiple Choice prompts.\n\n\nAlpacaPromptTokenizingStrategy\nTokenizing strategy for Alpaca prompts.\n\n\nAlpacaReflectionPTStrategy\nTokenizing strategy for Alpaca Reflection prompts.\n\n\nDatasetWrappingStrategy\nAbstract class for wrapping datasets for Chat Messages\n\n\nGPTeacherPromptTokenizingStrategy\nTokenizing strategy for GPTeacher prompts.\n\n\nInstructionPromptTokenizingStrategy\nTokenizing strategy for instruction-based prompts.\n\n\nInvalidDataException\nException raised when the data is invalid\n\n\nJeopardyPromptTokenizingStrategy\nTokenizing strategy for Jeopardy prompts.\n\n\nNomicGPT4AllPromptTokenizingStrategy\nTokenizing strategy for NomicGPT4All prompts.\n\n\nOpenAssistantPromptTokenizingStrategy\nTokenizing strategy for OpenAssistant prompts.\n\n\nPromptTokenizingStrategy\nAbstract class for tokenizing strategies\n\n\nReflectionPromptTokenizingStrategy\nTokenizing strategy for Reflection prompts.\n\n\nSummarizeTLDRPromptTokenizingStrategy\nTokenizing strategy for SummarizeTLDR prompts.\n\n\n\n\n\nprompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca Multiple Choice prompts.\n\n\n\nprompt_tokenizers.AlpacaPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca prompts.\n\n\n\nprompt_tokenizers.AlpacaReflectionPTStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca Reflection prompts.\n\n\n\nprompt_tokenizers.DatasetWrappingStrategy()\nAbstract class for wrapping datasets for Chat Messages\n\n\n\nprompt_tokenizers.GPTeacherPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for GPTeacher prompts.\n\n\n\nprompt_tokenizers.InstructionPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for instruction-based prompts.\n\n\n\nprompt_tokenizers.InvalidDataException()\nException raised when the data is invalid\n\n\n\nprompt_tokenizers.JeopardyPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Jeopardy prompts.\n\n\n\nprompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for NomicGPT4All prompts.\n\n\n\nprompt_tokenizers.OpenAssistantPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for OpenAssistant prompts.\n\n\n\nprompt_tokenizers.PromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nAbstract class for tokenizing strategies\n\n\n\n\n\nName\nDescription\n\n\n\n\nsupports_multiprocessing\nWhether this tokenizing strategy supports multiprocessing.\n\n\n\n\n\n\n\nprompt_tokenizers.ReflectionPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Reflection prompts.\n\n\n\nprompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for SummarizeTLDR prompts.\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\nparse_tokenized_to_result\nParses the tokenized prompt and append the tokenized input_ids, attention_mask and labels to the result\n\n\ntokenize_prompt_default\nReturns the default values for the tokenize prompt function\n\n\n\n\n\nprompt_tokenizers.parse_tokenized_to_result(\n result,\n current_len,\n res,\n labels,\n pad_token_id=None,\n)\nParses the tokenized prompt and append the tokenized input_ids, attention_mask and labels to the result\n\n\n\nprompt_tokenizers.tokenize_prompt_default()\nReturns the default values for the tokenize prompt function"
|
||||
"text": "prompt_tokenizers\nModule containing PromptTokenizingStrategy and Prompter classes\n\n\n\n\n\nName\nDescription\n\n\n\n\nAlpacaMultipleChoicePromptTokenizingStrategy\nTokenizing strategy for Alpaca Multiple Choice prompts.\n\n\nAlpacaPromptTokenizingStrategy\nTokenizing strategy for Alpaca prompts.\n\n\nAlpacaReflectionPTStrategy\nTokenizing strategy for Alpaca Reflection prompts.\n\n\nDatasetWrappingStrategy\nAbstract class for wrapping datasets for Chat Messages\n\n\nGPTeacherPromptTokenizingStrategy\nTokenizing strategy for GPTeacher prompts.\n\n\nInstructionPromptTokenizingStrategy\nTokenizing strategy for instruction-based prompts.\n\n\nInvalidDataException\nException raised when the data is invalid\n\n\nJeopardyPromptTokenizingStrategy\nTokenizing strategy for Jeopardy prompts.\n\n\nNomicGPT4AllPromptTokenizingStrategy\nTokenizing strategy for NomicGPT4All prompts.\n\n\nOpenAssistantPromptTokenizingStrategy\nTokenizing strategy for OpenAssistant prompts.\n\n\nPromptTokenizingStrategy\nAbstract class for tokenizing strategies\n\n\nReflectionPromptTokenizingStrategy\nTokenizing strategy for Reflection prompts.\n\n\nSummarizeTLDRPromptTokenizingStrategy\nTokenizing strategy for SummarizeTLDR prompts.\n\n\n\n\n\nprompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca Multiple Choice prompts.\n\n\n\nprompt_tokenizers.AlpacaPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca prompts.\n\n\n\nprompt_tokenizers.AlpacaReflectionPTStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca Reflection prompts.\n\n\n\nprompt_tokenizers.DatasetWrappingStrategy()\nAbstract class for wrapping datasets for Chat Messages\n\n\n\nprompt_tokenizers.GPTeacherPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for GPTeacher prompts.\n\n\n\nprompt_tokenizers.InstructionPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for instruction-based prompts.\n\n\n\nprompt_tokenizers.InvalidDataException()\nException raised when the data is invalid\n\n\n\nprompt_tokenizers.JeopardyPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Jeopardy prompts.\n\n\n\nprompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for NomicGPT4All prompts.\n\n\n\nprompt_tokenizers.OpenAssistantPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for OpenAssistant prompts.\n\n\n\nprompt_tokenizers.PromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nAbstract class for tokenizing strategies\n\n\n\nprompt_tokenizers.ReflectionPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Reflection prompts.\n\n\n\nprompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for SummarizeTLDR prompts.\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\nparse_tokenized_to_result\nParses the tokenized prompt and append the tokenized input_ids, attention_mask and labels to the result\n\n\ntokenize_prompt_default\nReturns the default values for the tokenize prompt function\n\n\n\n\n\nprompt_tokenizers.parse_tokenized_to_result(\n result,\n current_len,\n res,\n labels,\n pad_token_id=None,\n)\nParses the tokenized prompt and append the tokenized input_ids, attention_mask and labels to the result\n\n\n\nprompt_tokenizers.tokenize_prompt_default()\nReturns the default values for the tokenize prompt function"
|
||||
},
|
||||
{
|
||||
"objectID": "docs/api/prompt_tokenizers.html#classes",
|
||||
"href": "docs/api/prompt_tokenizers.html#classes",
|
||||
"title": "prompt_tokenizers",
|
||||
"section": "",
|
||||
"text": "Name\nDescription\n\n\n\n\nAlpacaMultipleChoicePromptTokenizingStrategy\nTokenizing strategy for Alpaca Multiple Choice prompts.\n\n\nAlpacaPromptTokenizingStrategy\nTokenizing strategy for Alpaca prompts.\n\n\nAlpacaReflectionPTStrategy\nTokenizing strategy for Alpaca Reflection prompts.\n\n\nDatasetWrappingStrategy\nAbstract class for wrapping datasets for Chat Messages\n\n\nGPTeacherPromptTokenizingStrategy\nTokenizing strategy for GPTeacher prompts.\n\n\nInstructionPromptTokenizingStrategy\nTokenizing strategy for instruction-based prompts.\n\n\nInvalidDataException\nException raised when the data is invalid\n\n\nJeopardyPromptTokenizingStrategy\nTokenizing strategy for Jeopardy prompts.\n\n\nNomicGPT4AllPromptTokenizingStrategy\nTokenizing strategy for NomicGPT4All prompts.\n\n\nOpenAssistantPromptTokenizingStrategy\nTokenizing strategy for OpenAssistant prompts.\n\n\nPromptTokenizingStrategy\nAbstract class for tokenizing strategies\n\n\nReflectionPromptTokenizingStrategy\nTokenizing strategy for Reflection prompts.\n\n\nSummarizeTLDRPromptTokenizingStrategy\nTokenizing strategy for SummarizeTLDR prompts.\n\n\n\n\n\nprompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca Multiple Choice prompts.\n\n\n\nprompt_tokenizers.AlpacaPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca prompts.\n\n\n\nprompt_tokenizers.AlpacaReflectionPTStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca Reflection prompts.\n\n\n\nprompt_tokenizers.DatasetWrappingStrategy()\nAbstract class for wrapping datasets for Chat Messages\n\n\n\nprompt_tokenizers.GPTeacherPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for GPTeacher prompts.\n\n\n\nprompt_tokenizers.InstructionPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for instruction-based prompts.\n\n\n\nprompt_tokenizers.InvalidDataException()\nException raised when the data is invalid\n\n\n\nprompt_tokenizers.JeopardyPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Jeopardy prompts.\n\n\n\nprompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for NomicGPT4All prompts.\n\n\n\nprompt_tokenizers.OpenAssistantPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for OpenAssistant prompts.\n\n\n\nprompt_tokenizers.PromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nAbstract class for tokenizing strategies\n\n\n\n\n\nName\nDescription\n\n\n\n\nsupports_multiprocessing\nWhether this tokenizing strategy supports multiprocessing.\n\n\n\n\n\n\n\nprompt_tokenizers.ReflectionPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Reflection prompts.\n\n\n\nprompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for SummarizeTLDR prompts."
|
||||
"text": "Name\nDescription\n\n\n\n\nAlpacaMultipleChoicePromptTokenizingStrategy\nTokenizing strategy for Alpaca Multiple Choice prompts.\n\n\nAlpacaPromptTokenizingStrategy\nTokenizing strategy for Alpaca prompts.\n\n\nAlpacaReflectionPTStrategy\nTokenizing strategy for Alpaca Reflection prompts.\n\n\nDatasetWrappingStrategy\nAbstract class for wrapping datasets for Chat Messages\n\n\nGPTeacherPromptTokenizingStrategy\nTokenizing strategy for GPTeacher prompts.\n\n\nInstructionPromptTokenizingStrategy\nTokenizing strategy for instruction-based prompts.\n\n\nInvalidDataException\nException raised when the data is invalid\n\n\nJeopardyPromptTokenizingStrategy\nTokenizing strategy for Jeopardy prompts.\n\n\nNomicGPT4AllPromptTokenizingStrategy\nTokenizing strategy for NomicGPT4All prompts.\n\n\nOpenAssistantPromptTokenizingStrategy\nTokenizing strategy for OpenAssistant prompts.\n\n\nPromptTokenizingStrategy\nAbstract class for tokenizing strategies\n\n\nReflectionPromptTokenizingStrategy\nTokenizing strategy for Reflection prompts.\n\n\nSummarizeTLDRPromptTokenizingStrategy\nTokenizing strategy for SummarizeTLDR prompts.\n\n\n\n\n\nprompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca Multiple Choice prompts.\n\n\n\nprompt_tokenizers.AlpacaPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca prompts.\n\n\n\nprompt_tokenizers.AlpacaReflectionPTStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Alpaca Reflection prompts.\n\n\n\nprompt_tokenizers.DatasetWrappingStrategy()\nAbstract class for wrapping datasets for Chat Messages\n\n\n\nprompt_tokenizers.GPTeacherPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for GPTeacher prompts.\n\n\n\nprompt_tokenizers.InstructionPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for instruction-based prompts.\n\n\n\nprompt_tokenizers.InvalidDataException()\nException raised when the data is invalid\n\n\n\nprompt_tokenizers.JeopardyPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Jeopardy prompts.\n\n\n\nprompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for NomicGPT4All prompts.\n\n\n\nprompt_tokenizers.OpenAssistantPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for OpenAssistant prompts.\n\n\n\nprompt_tokenizers.PromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nAbstract class for tokenizing strategies\n\n\n\nprompt_tokenizers.ReflectionPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for Reflection prompts.\n\n\n\nprompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy(\n prompter,\n tokenizer,\n train_on_inputs=False,\n sequence_len=2048,\n)\nTokenizing strategy for SummarizeTLDR prompts."
|
||||
},
|
||||
{
|
||||
"objectID": "docs/api/prompt_tokenizers.html#functions",
|
||||
@@ -3064,7 +3064,7 @@
|
||||
"href": "docs/custom_integrations.html#cut-cross-entropy",
|
||||
"title": "Custom Integrations",
|
||||
"section": "Cut Cross Entropy",
|
||||
"text": "Cut Cross Entropy\nCut Cross Entropy (CCE) reduces VRAM usage through optimization on the cross-entropy operation during loss calculation.\nSee https://github.com/apple/ml-cross-entropy\n\nRequirements\n\nPyTorch 2.4.0 or higher\n\n\n\nInstallation\nRun the following command to install cut_cross_entropy[transformers] if you don’t have it already.\n\nIf you are in dev environment\n\npython scripts/cutcrossentropy_install.py | sh\n\nIf you are installing from pip\n\npip3 uninstall -y cut-cross-entropy && pip3 install \"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@622068a\"\n\n\nUsage\nplugins:\n - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin\n\n\nSupported Models\n\ncohere\ncohere2\ngemma\ngemma2\ngemma3\ngemma3_text\nglm\nglm4\nllama\nllama4\nllama4_text\nmistral\nmistral3\nmllama\nphi\nphi3\nphi4_multimodal\nqwen2\nqwen2_vl\nqwen2_moe\nqwen2_5_vl\nqwen3\nqwen3_moe\n\n\n\nCitation\n@article{wijmans2024cut,\n author = {Erik Wijmans and\n Brody Huval and\n Alexander Hertzberg and\n Vladlen Koltun and\n Philipp Kr\\\"ahenb\\\"uhl},\n title = {Cut Your Losses in Large-Vocabulary Language Models},\n journal = {arXiv},\n year = {2024},\n url = {https://arxiv.org/abs/2411.09009},\n}\nPlease see reference here",
|
||||
"text": "Cut Cross Entropy\nCut Cross Entropy (CCE) reduces VRAM usage through optimization on the cross-entropy operation during loss calculation.\nSee https://github.com/apple/ml-cross-entropy\n\nRequirements\n\nPyTorch 2.4.0 or higher\n\n\n\nInstallation\nRun the following command to install cut_cross_entropy[transformers] if you don’t have it already.\n\nIf you are in dev environment\n\npython scripts/cutcrossentropy_install.py | sh\n\nIf you are installing from pip\n\npip3 uninstall -y cut-cross-entropy && pip3 install \"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@865b899\"\n\n\nUsage\nplugins:\n - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin\n\n\nSupported Models\n\ncohere\ncohere2\ngemma\ngemma2\ngemma3\ngemma3_text\nglm\nglm4\nllama\nllama4\nllama4_text\nmistral\nmistral3\nmllama\nphi\nphi3\nphi4_multimodal\nqwen2\nqwen2_vl\nqwen2_moe\nqwen2_5_vl\nqwen3\nqwen3_moe\n\n\n\nCitation\n@article{wijmans2024cut,\n author = {Erik Wijmans and\n Brody Huval and\n Alexander Hertzberg and\n Vladlen Koltun and\n Philipp Kr\\\"ahenb\\\"uhl},\n title = {Cut Your Losses in Large-Vocabulary Language Models},\n journal = {arXiv},\n year = {2024},\n url = {https://arxiv.org/abs/2411.09009},\n}\nPlease see reference here",
|
||||
"crumbs": [
|
||||
"Advanced Features",
|
||||
"Custom Integrations"
|
||||
|
||||
Reference in New Issue
Block a user