From 4890c81c12a08064102bc8413631d18558bb75d1 Mon Sep 17 00:00:00 2001
From: NanoCode012
Date: Mon, 21 Jul 2025 18:17:50 +0700
Subject: [PATCH] feat(doc): add notes for audio loading

---
 docs/multimodal.qmd | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/docs/multimodal.qmd b/docs/multimodal.qmd
index df12f6e68..2be3304d8 100644
--- a/docs/multimodal.qmd
+++ b/docs/multimodal.qmd
@@ -158,6 +158,12 @@ For audio loading, you can use the following keys within `content` alongside `"t
 - `"url": "https://example.com/audio.mp3"`
 - `"audio": np.ndarray`
 
+::: {.callout-tip}
+
+You may need to install `librosa` via `pip install librosa`.
+
+:::
+
 ### Example
 
 Here is an example of a multi-modal dataset:
@@ -188,3 +194,9 @@ Here is an example of a multi-modal dataset:
 }
 ]
 ```
+
+## FAQ
+
+1. `PIL.UnidentifiedImageError: cannot identify image file ...`
+
+`PIL` could not retrieve the file at `url` using `requests`. Please check the URL for typos. Another possible cause is that the server is blocking the request.
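
Review note: the FAQ entry above names two likely causes (a mistyped URL, or a server that blocks the request). The sketch below is one possible way to tell them apart before the payload ever reaches `PIL`. It is not part of this patch or of the library: `sniff_image_format` and `diagnose_response` are hypothetical helper names, the signature table is deliberately incomplete, and the checks are heuristics rather than a full image validation.

```python
from typing import Optional

# Leading "magic" bytes of a few common image formats (not exhaustive).
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return a format name if `data` starts like a known image, else None."""
    for signature, name in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    # WebP layout: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def diagnose_response(status_code: int, content_type: str, body: bytes) -> str:
    """Hypothetical helper: guess why a downloaded payload will not open as an image."""
    if status_code >= 400:
        return (
            f"HTTP {status_code}: the URL may be mistyped "
            "or the server blocked the request"
        )
    if sniff_image_format(body) is not None:
        return "ok"
    if content_type.startswith("text/html"):
        return "got an HTML page instead of an image (error or bot-check page?)"
    return "payload is not a recognized image format"
```

Assuming the file is fetched with `requests`, the helper could be fed from a response `r` as `diagnose_response(r.status_code, r.headers.get("Content-Type", ""), r.content)`. Keeping the function free of network calls makes it easy to try on saved payloads.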