audio to audio models

4 models · ranked by Hugging Face downloads

bigvgan_v2_22khz_80band_256x

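bigvgan_v2_22khz_80band_256x has no registered pipeline_tag. It is a BigVGAN v2 checkpoint from NVIDIA, a neural vocoder that synthesizes waveforms from mel spectrograms; the name indicates 22 kHz audio, 80 mel bands, and 256x upsampling. Review the model card before use.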

1,302,645 ↓ · 29 ♡

bigvgan_v2_44khz_128band_512x

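bigvgan_v2_44khz_128band_512x has no registered pipeline_tag. It is a BigVGAN v2 checkpoint from NVIDIA, a neural vocoder that synthesizes waveforms from mel spectrograms; the name indicates 44 kHz audio, 128 mel bands, and 512x upsampling. Review the model card before use.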

462,028 ↓ · 74 ♡
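
As a rough illustration of how the two BigVGAN checkpoint names above can be read, the sketch below splits a name into sample rate, mel band count, and upsampling factor, then derives how many mel frames the vocoder consumes per second of audio. It assumes the names follow the pattern `<rate>khz_<bands>band_<factor>x` and that "22khz" and "44khz" mean 22,050 Hz and 44,100 Hz. The helper `parse_vocoder_name` and the `NOMINAL_RATES_HZ` table are hypothetical and not part of any BigVGAN API; the model card remains the authoritative source for each value.

```python
import re

# Assumption: nominal "khz" labels map to these conventional sample rates.
NOMINAL_RATES_HZ = {22: 22050, 24: 24000, 44: 44100}


def parse_vocoder_name(name):
    """Split a name like 'bigvgan_v2_22khz_80band_256x' into numeric parts."""
    match = re.search(r"_(\d+)khz_(\d+)band_(\d+)x$", name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name}")
    khz, bands, upsample = (int(group) for group in match.groups())
    # Fall back to khz * 1000 for labels missing from the table.
    sample_rate = NOMINAL_RATES_HZ.get(khz, khz * 1000)
    return {
        "sample_rate_hz": sample_rate,
        "mel_bands": bands,
        "upsample_factor": upsample,
        # Each mel frame is expanded into `upsample` audio samples.
        "mel_frames_per_second": sample_rate / upsample,
    }


if __name__ == "__main__":
    for model in ("bigvgan_v2_22khz_80band_256x",
                  "bigvgan_v2_44khz_128band_512x"):
        print(model, parse_vocoder_name(model))
```

Under these assumptions both checkpoints consume mel frames at the same rate, since doubling the sample rate is matched by doubling the upsampling factor.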

personaplex-7b-v1

PersonaPlex-7B is NVIDIA's speech-to-speech model based on the Moshi architecture, supporting real-time audio-to-audio dialog with persona conditioning. At 7B parameters, it handles full-duplex voice conversation, listening and speaking simultaneously. The license is listed as 'other'; check NVIDIA's specific terms.

343,575 ↓ · 2,526 ♡

neucodec

NeuCodec is Neuphonic's neural audio codec, designed as a speech tokenizer for TTS and voice generation pipelines. It encodes speech into discrete tokens for use with language-model-based TTS architectures.

309,423 ↓ · 108 ♡
