doc(en): sync translation from Chinese version

<div align="center">

<h1>Retrieval-based-Voice-Conversion-WebUI</h1>
An easy-to-use voice conversion framework based on VITS.<br><br>

[](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI)

</div>

> Check out our [Demo Video](https://www.bilibili.com/video/BV1pm4y1z7Gm/) here!

> There's a [one-click downloader](https://github.com/RVC-Project/RVC-Models-Downloader) for models/integration packages/tools. Welcome to try it.

<table>
<tr>
</tr>
</table>

> The dataset for the pre-training model uses nearly 50 hours of high-quality audio from the open-source VCTK dataset.

> High-quality licensed song datasets will be added to the training set regularly, so you can use them without worrying about copyright infringement.

> Please look forward to the pretrained base model of RVCv3, which has larger parameters, a larger training dataset, better results, roughly the same inference speed, and requires less training data.

## Features:
+ Reduce tone leakage by replacing the source feature with the training-set feature using top-1 retrieval (see the sketch after this list);
+ Easy and fast training, even on relatively weak graphics cards;
+ AMD/Intel graphics cards acceleration supported;
+ Intel ARC graphics cards acceleration with IPEX supported.
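
A minimal sketch of the top-1 retrieval idea, assuming the content features are plain NumPy arrays (the actual implementation builds a faiss index over the training-set features and blends the retrieved feature with the source feature by an index rate; the function and parameter names here are illustrative):

```python
import numpy as np

def retrieve_top1(source_feats: np.ndarray, train_feats: np.ndarray, index_rate: float = 0.75) -> np.ndarray:
    """Replace each source frame feature with its nearest training-set feature.

    source_feats: (T, D) features extracted from the input audio
    train_feats:  (N, D) features collected from the training set
    index_rate:   how strongly the retrieved feature replaces the source one
    """
    # Squared Euclidean distance between every source frame and every training frame
    dists = ((source_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(axis=-1)
    nearest = train_feats[dists.argmin(axis=1)]  # top-1 match per frame
    # Blending the retrieved feature back in is what reduces tone leakage
    return index_rate * nearest + (1 - index_rate) * source_feats
```
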
## Environment Configuration
### Python Version Limitation
> It is recommended to use conda to manage the Python environment.

> For the reason behind the version limitation, please refer to this [bug](https://github.com/facebookresearch/fairseq/issues/5012).
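
For example, a conda environment pinned to a supported interpreter can be created like this (the environment name `rvc` and Python 3.10 are only an example):

```bash
conda create -n rvc python=3.10
conda activate rvc
```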

```bash
python --version # 3.8 <= Python < 3.11
```

### Installing the Dependencies with Poetry (optional)
You can also use the Poetry dependency manager to install the project dependencies:
```bash
# Install the Poetry dependency management tool; skip if already installed
# Reference: https://python-poetry.org/docs/#installation
curl -sSL https://install.python-poetry.org | python3 -

# Install the project dependencies
poetry install
```

### Linux/MacOS One-click Dependency Installation & Startup Script
By executing `run.sh` in the project root directory, you can configure the `venv` virtual environment, automatically install the required dependencies, and start the main program with one click.
```bash
sh ./run.sh
```

### Manual Installation of Dependencies
1. Install `pytorch` and its core dependencies, skip if already installed. Refer to: https://pytorch.org/get-started/locally/
```bash
pip install torch torchvision torchaudio
```
2. If you are using an Nvidia Ampere architecture card (RTX 30xx) on Windows, then according to the experience reported in #21, you need to specify the CUDA version that matches your pytorch build.
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
```
3. Install the corresponding dependencies according to your graphics card:
- Nvidia GPU
```bash
pip install -r requirements.txt
```
- AMD/Intel GPU (Windows, DirectML)
```bash
pip install -r requirements-dml.txt
```
- AMD ROCM (Linux)
```bash
pip install -r requirements-amd.txt
```
- Intel IPEX (Linux / WSL, Python 3.10)
```bash
pip install -r requirements-ipex.txt
```

## Preparation of Other Files
### 1. Assets
> RVC requires some models located in the `assets` folder for inference and training.

#### Download Automatically (Default)
By default, RVC automatically checks for and downloads the required resources when the main program starts. If the download fails, you can manually download the files and place them in the corresponding locations.

#### Download Manually
> All resource files are located in the [Hugging Face space](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/).

> You can find some scripts to download them in the `tools` folder.

> You can also use the [one-click downloader](https://github.com/RVC-Project/RVC-Models-Downloader) for models/integration packages/tools
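
For example, the download script in the `tools` folder mentioned above fetches the required files from the Hugging Face space:
```bash
# Download all needed models from https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/
python tools/download_models.py
```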

Below is a list of all the pre-models and other files that RVC requires.
- ./assets/hubert/hubert_base.pt
```bash
rvcmd assets/hubert # RVC-Models-Downloader command
```
- ./assets/pretrained
```bash
rvcmd assets/v1 # RVC-Models-Downloader command
```
- ./assets/uvr5_weights
```bash
rvcmd assets/uvr5 # RVC-Models-Downloader command
```

If you want to test the v2 version of the model (the v2 model changes the input from the 256-dimensional features of 9-layer HuBERT + final_proj to the 768-dimensional features of 12-layer HuBERT, and adds 3 period discriminators), you will need to download these additional resources:
- ./assets/pretrained_v2
```bash
rvcmd assets/v2 # RVC-Models-Downloader command
```

### 2. Install FFmpeg
If `ffmpeg` and `ffprobe` are already installed on your computer, you can skip this step.
#### Ubuntu/Debian
```bash
sudo apt install ffmpeg
```
#### MacOS
```bash
brew install ffmpeg
```
#### Windows
Download the files below and place them in the root directory.
```bash
rvcmd tools/ffmpeg # RVC-Models-Downloader command
```
- [ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe)
- [ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe)
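
To confirm the tools are available (on Windows, run this from the root directory where the two `.exe` files were placed), you can check their versions:
```bash
ffmpeg -version
ffprobe -version
```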

### 3. Download the required files for the RMVPE vocal pitch extraction algorithm

If you want to use the latest RMVPE vocal pitch extraction algorithm, you need to download the pitch-extraction model parameters and place them in the RVC root directory.

- [rmvpe.pt](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt)
```bash
rvcmd assets/rmvpe # RVC-Models-Downloader command
```

#### Download the DML environment of RMVPE (optional, for AMD/Intel GPUs)

- [rmvpe.onnx](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.onnx)
```bash
rvcmd assets/rmvpe # RVC-Models-Downloader command
```

### 4. AMD ROCM (optional, Linux only)

If you want to run RVC on a Linux system using AMD's ROCM technology, please first install the required drivers as described [here](https://rocm.docs.amd.com/en/latest/deploy/linux/os-native/install.html).

If you are using Arch Linux, you can install the required drivers with pacman:
````
pacman -S rocm-hip-sdk rocm-opencl-sdk
````

For some graphics card models, you may also need to set the following environment variables (for example, on an RX6700XT):
````
export ROCM_PATH=/opt/rocm
export HSA_OVERRIDE_GFX_VERSION=10.3.0
````

Also, make sure your current user is in the `render` and `video` groups:
````
sudo usermod -aG render $USERNAME
sudo usermod -aG video $USERNAME
````

## Getting Started
### Direct Launch
Use the following command to start the WebUI:
```bash
python infer-web.py
```

### Linux/MacOS
```bash
./run.sh
```
### For Intel GPU users who need IPEX (Linux only)
```bash
source /opt/intel/oneapi/setvars.sh
./run.sh
```

### Using the Integration Package (Windows Users)
Download and unzip `RVC-beta.7z`. After unzipping, double-click `go-web.bat` to start the program with one click.
```bash
rvcmd packs/general/latest # RVC-Models-Downloader command
```

## Credits
+ [ContentVec](https://github.com/auspicious3000/contentvec/)
+ [VITS](https://github.com/jaywalnut310/vits)
+ [audio-slicer](https://github.com/openvpi/audio-slicer)
+ [Vocal pitch extraction: RMVPE](https://github.com/Dream-High/RMVPE)
  + The pretrained model is trained and tested by [yxlllc](https://github.com/yxlllc/RMVPE) and [RVC-Boss](https://github.com/RVC-Boss).

## Thanks to all contributors for their efforts
<a href="https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/graphs/contributors" target="_blank">
  <img src="https://contrib.rocks/image?repo=RVC-Project/Retrieval-based-Voice-Conversion-WebUI" />
</a>