How to Install OpenAI Whisper (Windows, macOS, Linux, Ubuntu)
by: Chris
Post content copied from Be on the Right Side of Change.
Run `pip3 install openai-whisper` in your command line. Once installed, use Whisper to transcribe audio files.

```shell
pip install openai-whisper
```
Alternatively, you may use any of the following commands to install openai-whisper, depending on your concrete environment (Linux, Ubuntu, Windows, macOS). One is likely to work!
- If you have only one version of Python installed: `pip install openai-whisper`
- If you have Python 3 (and, possibly, other versions) installed: `pip3 install openai-whisper`
- If you don't have pip or it doesn't work: `python -m pip install openai-whisper` or `python3 -m pip install openai-whisper`
- If you have Linux and you need to fix permissions (either one): `sudo pip3 install openai-whisper` or `pip3 install openai-whisper --user`
- If you have Linux with apt: `sudo apt install openai-whisper`
- If you have Windows and you have set up the `py` alias: `py -m pip install openai-whisper`
- If you have Anaconda: `conda install -c anaconda openai-whisper`
- If you have Jupyter Notebook: `!pip install openai-whisper` or `!pip3 install openai-whisper`
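Whichever command you use, you can confirm that the package landed with a short standard-library check. This is a sketch; it assumes the PyPI distribution name `openai-whisper`, and `whisper_installed` is a hypothetical helper, not part of the package:

```python
from importlib import metadata

def whisper_installed() -> bool:
    """Return True if the openai-whisper distribution is installed."""
    try:
        metadata.version("openai-whisper")
        return True
    except metadata.PackageNotFoundError:
        return False

print("openai-whisper installed:", whisper_installed())
```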
Upgrade Installation Routine

Upgrade pip and install the openai-whisper library using the following two commands, one after the other:

```shell
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade openai-whisper
```
Detailed Instructions
The codebase is compatible with Python versions 3.8 to 3.11 and recent PyTorch releases. Key dependencies include OpenAI's `tiktoken` for fast tokenization. To install or update to the latest release of Whisper, use:
```shell
pip install -U openai-whisper
```

For the latest repository version and dependencies, use:

```shell
pip install git+https://github.com/openai/whisper.git
```

To update to the repository's latest version without dependencies:

```shell
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
```

FFmpeg, a command-line tool, is also required and can be installed via various package managers:

- For Ubuntu or Debian: `sudo apt update && sudo apt install ffmpeg`
- For Arch Linux: `sudo pacman -S ffmpeg`
- For macOS with Homebrew: `brew install ffmpeg`
- For Windows with Chocolatey: `choco install ffmpeg`
- For Windows with Scoop: `scoop install ffmpeg`
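Because Whisper shells out to the ffmpeg binary to decode audio, it is worth confirming that ffmpeg is actually on your PATH after installing it. A minimal standard-library check:

```python
import shutil
import subprocess

# Whisper invokes the ffmpeg binary to decode audio files.
path = shutil.which("ffmpeg")
if path is None:
    print("ffmpeg not found -- install it with one of the commands above")
else:
    out = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True)
    print(out.stdout.splitlines()[0])  # first line names the ffmpeg version
```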
If `tiktoken` lacks a pre-built wheel for your platform, installing Rust may be necessary. In case of installation errors, follow the Rust development environment setup and adjust the PATH environment variable as needed. If you encounter 'No module named setuptools_rust', install it via `pip install setuptools-rust`.
Whisper Models
Whisper offers five model sizes, from ‘tiny’ to ‘large’, with English-only versions available for four sizes. These models vary in memory requirements, speed, and accuracy. English-only models (‘.en’) generally perform better, especially the ‘tiny.en’ and ‘base.en’ versions.
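As a rough planning aid, the approximate parameter counts and VRAM figures from the Whisper README can be put in a small table. The helper below is a hypothetical convenience (not part of the whisper package) that picks the largest multilingual model fitting a VRAM budget; treat the numbers as rough guidance, not exact limits:

```python
# Approximate figures from the Whisper README: parameters in millions,
# required VRAM in GB, plus the English-only variant name where one exists.
MODELS = {
    "tiny":   {"params_m": 39,   "vram_gb": 1,  "english_only": "tiny.en"},
    "base":   {"params_m": 74,   "vram_gb": 1,  "english_only": "base.en"},
    "small":  {"params_m": 244,  "vram_gb": 2,  "english_only": "small.en"},
    "medium": {"params_m": 769,  "vram_gb": 5,  "english_only": "medium.en"},
    "large":  {"params_m": 1550, "vram_gb": 10, "english_only": None},
}

def largest_model_for(vram_gb: float) -> str:
    """Pick the biggest model that fits the given VRAM budget (illustrative)."""
    fitting = [name for name, m in MODELS.items() if m["vram_gb"] <= vram_gb]
    return fitting[-1] if fitting else "tiny"

print(largest_model_for(4))  # -> "small"
```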
Performance varies by language, as measured by WER (word error rate) and CER (character error rate) metrics; see the per-language benchmark figure in the Whisper repository.
How to Transcribe Audio with Whisper?
For command-line usage, Whisper can transcribe audio files using different models:
```shell
whisper audio.flac audio.mp3 audio.wav --model medium
```

The default setting is suitable for English. Non-English speech transcription and translation into English are also supported:

```shell
whisper japanese.wav --language Japanese --task translate
```

Use `whisper --help` to view all options. Available languages are listed in tokenizer.py.
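The CLI can also write subtitle files directly. For example, assuming the `--output_format` and `--output_dir` flags behave as in current releases:

```shell
# Transcribe with the English-only small model and write SRT subtitles
# into a transcripts/ directory
whisper audio.mp3 --model small.en --output_format srt --output_dir transcripts
```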
Python Usage (Transcription) with Whisper
In Python, transcription can be performed with:
```python
import whisper

model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(result["text"])
```
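Besides `text`, the result dictionary includes a `segments` list with start/end timestamps. A small pure-Python helper (illustrative, not part of the whisper API) can render those segments as SRT subtitles:

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3.5 -> '00:00:03,500'."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(result: dict) -> str:
    """Render a Whisper transcription result's segments as an SRT string."""
    blocks = []
    for i, seg in enumerate(result["segments"], start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

# Works on the dict returned by model.transcribe(...); shown with fake data:
fake = {"segments": [{"start": 0.0, "end": 3.5, "text": " Hello world"}]}
print(to_srt(fake))
```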
This process involves a 30-second sliding window for sequence-to-sequence predictions. The whisper.detect_language() and whisper.decode() functions offer lower-level access:

```python
import whisper

model = whisper.load_model("base")

# Load the audio and pad/trim it to fit a 30-second window
audio = whisper.pad_or_trim(whisper.load_audio("audio.mp3"))

# Compute the log-Mel spectrogram and move it to the model's device
mel = whisper.log_mel_spectrogram(audio).to(model.device)

# Detect the spoken language
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")

# Decode the audio
options = whisper.DecodingOptions()
result = whisper.decode(model, mel, options)
print(result.text)
```
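The 30-second window corresponds to 480,000 samples at Whisper's 16 kHz sample rate. Conceptually, `whisper.pad_or_trim` does something like this list-based sketch (the real function operates on NumPy arrays or tensors):

```python
SAMPLE_RATE = 16000                      # Whisper works on 16 kHz mono audio
CHUNK_LENGTH = 30                        # seconds per sliding window
N_SAMPLES = SAMPLE_RATE * CHUNK_LENGTH   # 480,000 samples per window

def pad_or_trim(samples: list, length: int = N_SAMPLES) -> list:
    """Trim long input, or zero-pad short input, to exactly `length` samples."""
    if len(samples) >= length:
        return samples[:length]
    return samples + [0.0] * (length - len(samples))

print(len(pad_or_trim([0.1, 0.2])))  # -> 480000
```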
If you want to master Whisper, check out our full prompt engineering mastery course teaching you the ins and outs of speech recognition in Python on the Finxter Academy:
Full Course: OpenAI Whisper – Building Cutting-Edge Python Apps with OpenAI Whisper
Check out our full OpenAI Whisper course with video lessons, easy explanations, GitHub, and a downloadable PDF certificate to prove your speech processing skills to your employer and freelancing clients:
[Academy] Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper
January 28, 2024 at 01:37AM