Remove scipy (#139)

* Remove scipy
* No longer need to replace T
* Update README.md
* Remove numpy version lock
* Update README.md
* Update uv.lock
2025-03-18 11:16:34 -07:00
parent 3f9dd88d6f
commit e43d62643e
7 changed files with 62 additions and 71 deletions
--- a/README.md
+++ b/README.md
@@ -5,10 +5,31 @@ An inference library for [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M)
 > **Kokoro** is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects.

 ### Usage
-You can run this cell on [Google Colab](https://colab.research.google.com/). [Listen to samples](https://huggingface.co/hexgrad/Kokoro-82M/blob/main/SAMPLES.md).
+You can run this basic cell on [Google Colab](https://colab.research.google.com/). [Listen to samples](https://huggingface.co/hexgrad/Kokoro-82M/blob/main/SAMPLES.md).
+```py
+!pip install -q kokoro>=0.9.2 soundfile
+!apt-get -qq -y install espeak-ng > /dev/null 2>&1
+from kokoro import KPipeline
+from IPython.display import display, Audio
+import soundfile as sf
+import torch
+pipeline = KPipeline(lang_code='a')
+text = '''
+[Kokoro](/kˈOkəɹO/) is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, [Kokoro](/kˈOkəɹO/) can be deployed anywhere from production environments to personal projects.
+'''
+generator = pipeline(text, voice='af_heart')
+for i, (gs, ps, audio) in enumerate(generator):
+    print(i, gs, ps)
+    display(Audio(data=audio, rate=24000, autoplay=i==0))
+    sf.write(f'{i}.wav', audio, 24000)
+```
+Under the hood, `kokoro` uses [`misaki`](https://pypi.org/project/misaki/), a G2P library at https://github.com/hexgrad/misaki
+
+### Advanced Usage
+You can run this advanced cell on [Google Colab](https://colab.research.google.com/).
 ```py
 # 1️⃣ Install kokoro
-!pip install -q kokoro>=0.8.4 soundfile
+!pip install -q kokoro>=0.9.2 soundfile
 # 2️⃣ Install espeak, used for English OOD fallback and some non-English languages
 !apt-get -qq -y install espeak-ng > /dev/null 2>&1
 # 🇪🇸 'e' => Spanish es
@@ -50,13 +71,12 @@ generator = pipeline(
    text, voice='af_heart', # <= change voice here
    speed=1, split_pattern=r'\n+'
 )
-
 # Alternatively, load voice tensor directly:
-voice_tensor = torch.load('path/to/voice.pt', weights_only=True)
-generator = pipeline(
-    text, voice=voice_tensor,
-    speed=1, split_pattern=r'\n+'
-)
+# voice_tensor = torch.load('path/to/voice.pt', weights_only=True)
+# generator = pipeline(
+#     text, voice=voice_tensor,
+#     speed=1, split_pattern=r'\n+'
+# )

 for i, (gs, ps, audio) in enumerate(generator):
    print(i)  # i => index
@@ -66,10 +86,7 @@ for i, (gs, ps, audio) in enumerate(generator):
    sf.write(f'{i}.wav', audio, 24000) # save each audio file
 ```

-Under the hood, `kokoro` uses [`misaki`](https://pypi.org/project/misaki/), a G2P library at https://github.com/hexgrad/misaki
-
 ### Conda Environment
-
 Use the following conda `environment.yml` if you're facing any dependency issues.
 ```yaml
 name: kokoro
@@ -85,7 +102,6 @@ dependencies:
 ```

 ### Acknowledgements
-
 - 🛠️ [@yl4579](https://huggingface.co/yl4579) for architecting StyleTTS 2.
 - 🏆 [@Pendrokar](https://huggingface.co/Pendrokar) for adding Kokoro as a contender in the TTS Spaces Arena.
 - 📊 Thank you to everyone who contributed synthetic training data.