Whisper

Created by OpenAI

Released August 2021

Repository: openai/whisper
Website: openai.com
Wikipedia

Whisper is a general-purpose automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder transformer trained on roughly 680,000 hours of multilingual, multitask audio collected from the web, and it can perform multilingual speech recognition, speech translation into English, and spoken language identification. OpenAI has released the model weights and inference code as open source in the openai/whisper repository, which has made Whisper a common foundation for transcription tools, ports, and speech applications such as those listed below.

ggml-org / whisper.cpp

Port of OpenAI's Whisper model in C/C++

openai speech-to-text transformer Whisper inference speech-recognition

C++

43624

4775

3 days ago

SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

deep-learning inference quantization speech-recognition speech-to-text transformer Whisper openai

Python

18394

1519

2 months ago

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text Whisper

Python

18004

1892

2 days ago

chidiwilliams / buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

Whisper

Python

15393

1148

13 hours ago

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

conformer PyTorch speech-recognition paraformer punctuation speaker-diarization rnnt audio-visual-speech-recognition pretrained-model voice-activity-detection Whisper dfsmn vad speechgpt speechllm

Python

12897

1307

4 days ago

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

transformer conformer speech-translation streaming-asr speech-alignment punctuation-restoration streaming-tts speech-synthesis tts asr speech-recognition voice-cloning vocoder voice-recognition self-supervised-learning Whisper

Python

12266

1938

7 days ago

niedev / RTranslator

Open source real-time translation app for Android that runs locally

translator bluetooth-le realtime-translator Android onnx onnxruntime sentencepiece transformers translation nllb Whisper mobile-app offline

C++

9217

826

1 month ago

xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

ggml PyTorch chatglm deployment flan-t5 large-language-models wizardlm artificial-intelligence machine-learning Whisper inference openai-api mistral gemma llama llamacpp vllm qwen llama3 glm4

Python

8595

744

4 days ago

Zackriya-Solutions / meeting-minutes

A free and open source, self-hosted, AI-based live meeting note taker and minutes summary generator that can run completely on your local device (macOS and Windows support added; Linux support is planned). https://meetily.zackriya.com/

meeting-minutes meeting-notes recorder automation cross-platform Linux large-language-models macOS Windows Rust Whisper whisper-cpp artificial-intelligence live transcript transcription

C++

7645

594

1 day ago

argmaxinc / WhisperKit

On-device Speech Recognition for Apple Silicon

inference iOS speech-recognition Swift Whisper transformers macOS visionOS watchOS

Swift

5094

454

1 day ago

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text Whisper

Jupyter Notebook

4999

466

2 months ago

abus-aikorea / voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.