Built site for gh-pages
@@ -76,7 +76,7 @@ pre > code.sourceCode > span > a:first-child::before { text-decoration: underlin
 <link href="../../site_libs/quarto-html/quarto-syntax-highlighting-dark-b651517ce65839d647a86e2780455cfb.css" rel="stylesheet" id="quarto-text-highlighting-styles">
 <script src="../../site_libs/bootstrap/bootstrap.min.js"></script>
 <link href="../../site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
-<link href="../../site_libs/bootstrap/bootstrap-f9d679a32da2b248d4ca48a0e58e089e.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="dark">
+<link href="../../site_libs/bootstrap/bootstrap-08d9eb451d58809f35fda8b852d737d8.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="dark">
 <script id="quarto-search-options" type="application/json">{
 "location": "navbar",
 "copy-button": false,
@@ -362,6 +362,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
 <a href="../../docs/quantize.html" class="sidebar-item-text sidebar-link">
 <span class="menu-text">Quantization with torchao</span></a>
 </div>
+</li>
+<li class="sidebar-item">
+<div class="sidebar-item-container">
+<a href="../../docs/optimizations.html" class="sidebar-item-text sidebar-link">
+<span class="menu-text">Optimizations Guide</span></a>
+</div>
 </li>
 </ul>
 </li>
@@ -613,7 +619,7 @@ Tip
 </section>
 <section id="pre-training-without-streaming" class="level3">
 <h3 class="anchored" data-anchor-id="pre-training-without-streaming">Pre-training without streaming</h3>
-<p>On the rare case that the dataset is small and can be loaded entirely into memory, another approach to running pre-training is to use the <code>completion</code> format. This would mean that the entire dataset is pre-tokenized instead of on-demand in streaming.</p>
+<p>In the case that the dataset is small and can be loaded entirely into memory, another approach to running pre-training is to use the <code>completion</code> format. This would mean that the entire dataset is pre-tokenized instead of on-demand in streaming.</p>
 <p>One benefit of this is that the tokenization can be performed separately on a CPU-only machine, and then transferred to a GPU machine for training to save costs.</p>
 <p>From Hugging Face:</p>
 <div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="fu">datasets</span><span class="kw">:</span></span>