doc(en): sync translation from Chinese version

源文雨 2024-04-21 23:03:42 +09:00
parent 3c500cf9be
commit ce2b1e153f
2 changed files with 113 additions and 102 deletions


@@ -23,7 +23,7 @@
> 底模使用接近50小时的开源高质量VCTK训练集训练无版权方面的顾虑请大家放心使用
> 请期待RVCv3的底模参数更大数据更大效果更好基本持平的推理速度需要训练数据量更少。
> 请期待RVCv3的底模参数更大数据更大,效果更好,基本持平的推理速度,需要训练数据量更少。
> 由于某些地区无法直连Hugging Face即使设法成功访问速度也十分缓慢特推出模型/整合包/工具的一键下载器,欢迎试用:[RVC-Models-Downloader](https://github.com/RVC-Project/RVC-Models-Downloader)


@@ -1,7 +1,7 @@
<div align="center">
<h1>Retrieval-based-Voice-Conversion-WebUI</h1>
An easy-to-use Voice Conversion framework based on VITS.<br><br>
An easy-to-use voice conversion framework based on VITS.<br><br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange
)](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI)
@@ -21,7 +21,11 @@ An easy-to-use Voice Conversion framework based on VITS.<br><br>
</div>
> Check out our [Demo Video](https://www.bilibili.com/video/BV1pm4y1z7Gm/) here!
> The base model was trained on nearly 50 hours of the high-quality open-source VCTK dataset, so there are no copyright concerns; feel free to use it.
> Stay tuned for the RVCv3 base model, which will have more parameters, a larger training set, better results, roughly the same inference speed, and lower training-data requirements.
> A [one-click downloader](https://github.com/RVC-Project/RVC-Models-Downloader) is available for models, integration packages, and tools. You are welcome to try it.
<table>
<tr>
@@ -42,12 +46,6 @@ An easy-to-use Voice Conversion framework based on VITS.<br><br>
</tr>
</table>
> The dataset for the pre-training model uses nearly 50 hours of high quality audio from the VCTK open source dataset.
> High quality licensed song datasets will be added to the training-set often for your use, without having to worry about copyright infringement.
> Please look forward to the pretrained base model of RVCv3, which has larger parameters, more training data, better results, unchanged inference speed, and requires less training data for training.
## Features:
+ Reduce tone leakage by replacing the source feature with a training-set feature using top-1 retrieval;
+ Easy + fast training, even on poor graphics cards;
@@ -59,145 +57,159 @@ An easy-to-use Voice Conversion framework based on VITS.<br><br>
+ AMD/Intel graphics cards acceleration supported;
+ Intel ARC graphics cards acceleration with IPEX supported.
## Preparing the environment
The following commands need to be executed with Python 3.8 or higher.
Check out our [Demo Video](https://www.bilibili.com/video/BV1pm4y1z7Gm/) here!
## Environment Configuration
### Python Version Limitation
> It is recommended to use conda to manage the Python environment.
> For the reason behind the version limit, see this [bug](https://github.com/facebookresearch/fairseq/issues/5012).
(Windows/Linux)
First install the main dependencies through pip:
```bash
# Install PyTorch-related core dependencies, skip if installed
# Reference: https://pytorch.org/get-started/locally/
pip install torch torchvision torchaudio
#For Windows + Nvidia Ampere Architecture(RTX30xx), you need to specify the cuda version corresponding to pytorch according to the experience of https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/21
#pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
#For Linux + AMD Cards, you need to use the following pytorch versions:
#pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
python --version # 3.8 <= Python < 3.11
```
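The version range in the comment above (3.8 <= Python < 3.11) can also be checked from Python itself. The following is a minimal sketch, not part of the project; the function name `is_supported_python` and the two constants are hypothetical and only mirror the range stated in the README.

```python
import sys

# Bounds taken from the README comment: 3.8 <= Python < 3.11.
# The constant and function names are illustrative, not part of RVC.
MIN_VERSION = (3, 8)
MAX_VERSION_EXCLUSIVE = (3, 11)


def is_supported_python(version=None):
    """Return True if (major, minor) falls inside the documented range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the documented range"
    print(f"Python {current}: {status}")
```

Running it inside the conda environment reports whether the active interpreter falls inside the documented range before any dependencies are installed.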
Then you can use poetry to install the other dependencies:
```bash
# Install the Poetry dependency management tool, skip if installed
# Reference: https://python-poetry.org/docs/#installation
curl -sSL https://install.python-poetry.org | python3 -
# Install the project dependencies
poetry install
```
You can also use pip to install them:
```bash
# For Nvidia graphics cards
pip install -r requirements.txt
# For AMD/Intel graphics cards on Windows (DirectML)
pip install -r requirements-dml.txt
# For Intel ARC graphics cards on Linux / WSL using Python 3.10
pip install -r requirements-ipex.txt
# For AMD graphics cards on Linux (ROCm)
pip install -r requirements-amd.txt
```
------
Mac users can install dependencies via `run.sh`:
### Linux/MacOS One-click Dependency Installation & Startup Script
By executing `run.sh` in the project root directory, you can configure the `venv` virtual environment, automatically install the required dependencies, and start the main program with one click.
```bash
sh ./run.sh
```
## Preparation of other Pre-models
RVC requires other pre-models to infer and train.
### Manual Installation of Dependencies
1. Install `pytorch` and its core dependencies, skip if already installed. Refer to: https://pytorch.org/get-started/locally/
```bash
#Download all needed models from https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/
python tools/download_models.py
pip install torch torchvision torchaudio
```
2. If you are using an Nvidia Ampere GPU (RTX30xx) on Windows, you need to specify the CUDA version that matches your PyTorch build, as reported in issue #21.
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
```
3. Install the dependencies that match your graphics card.
- Nvidia GPU
```bash
pip install -r requirements.txt
```
- AMD/Intel GPU
```bash
pip install -r requirements-dml.txt
```
- AMD ROCM (Linux)
```bash
pip install -r requirements-amd.txt
```
- Intel IPEX (Linux)
```bash
pip install -r requirements-ipex.txt
```
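The choice of requirements file in step 3 can be summarised as a small lookup. This is a sketch under stated assumptions: the backend labels (`nvidia`, `dml`, `rocm`, `ipex`) and the function `pip_install_args` are hypothetical names chosen for illustration; only the four requirements file names come from the README.

```python
# Hypothetical backend labels mapped to the requirements files named in the README.
REQUIREMENTS_BY_BACKEND = {
    "nvidia": "requirements.txt",       # Nvidia GPU
    "dml": "requirements-dml.txt",      # AMD/Intel GPU
    "rocm": "requirements-amd.txt",     # AMD ROCm (Linux)
    "ipex": "requirements-ipex.txt",    # Intel IPEX (Linux)
}


def pip_install_args(backend):
    """Return the pip argument list for a backend label, or raise ValueError."""
    try:
        requirements = REQUIREMENTS_BY_BACKEND[backend]
    except KeyError:
        known = ", ".join(sorted(REQUIREMENTS_BY_BACKEND))
        raise ValueError(f"unknown backend {backend!r}; expected one of: {known}") from None
    return ["pip", "install", "-r", requirements]


if __name__ == "__main__":
    # Print the command for the Nvidia case; nothing is installed here.
    print(" ".join(pip_install_args("nvidia")))
```

The function only builds the command and leaves running it to the caller, so the mapping can be inspected without touching the environment.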
Or just download them by yourself from our [Huggingface space](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/).
## Preparation of Other Files
### 1. Assets
> RVC requires some models located in the `assets` folder for inference and training.
#### Download Automatically (Default)
By default, RVC can automatically check and download the required resources when the main program starts. If the download fails, you can also choose to manually download and place them in the corresponding location.
Here's a list of Pre-models and other files that RVC needs:
```bash
./assets/hubert/hubert_base.pt
#### Download Manually
> All resource files are located in [Hugging Face space](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/)
./assets/pretrained
> You can find some scripts to download them in the `tools` folder
./assets/uvr5_weights
> You can also use the [one-click downloader](https://github.com/RVC-Project/RVC-Models-Downloader) for models/integration packages/tools
Additional downloads are required if you want to test the v2 version of the model.
The following pre-models and other files are required by RVC.
./assets/pretrained_v2
- ./assets/hubert/hubert_base.pt
```bash
rvcmd assets/hubert # RVC-Models-Downloader command
```
- ./assets/pretrained
```bash
rvcmd assets/v1 # RVC-Models-Downloader command
```
- ./assets/uvr5_weights
```bash
rvcmd assets/uvr5 # RVC-Models-Downloader command
```
If you want to use the v2 version of the model, you need to download additional resources in
If you want to test the v2 version model (the v2 version model has changed the input from the 256 dimensional feature of 9-layer Hubert+final_proj to the 768 dimensional feature of 12-layer Hubert, and has added 3 period discriminators), you will need to download additional features
- ./assets/pretrained_v2
```bash
rvcmd assets/v2 # RVC-Models-Downloader command
```
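Whether the asset paths listed above are in place can be checked with a short script. This is a minimal sketch, assuming it is pointed at the project root; the function `missing_assets` and the two list names are hypothetical, and only the paths themselves come from the README.

```python
from pathlib import Path

# Paths as listed in the README; the list names are illustrative.
REQUIRED_ASSETS = [
    "assets/hubert/hubert_base.pt",
    "assets/pretrained",
    "assets/uvr5_weights",
]
V2_ASSETS = ["assets/pretrained_v2"]


def missing_assets(root=".", include_v2=False):
    """Return the listed paths that do not exist under root.

    Only existence is checked, so an empty directory still counts as present.
    """
    wanted = REQUIRED_ASSETS + (V2_ASSETS if include_v2 else [])
    return [path for path in wanted if not (Path(root) / path).exists()]


if __name__ == "__main__":
    for path in missing_assets(include_v2=True):
        print(f"missing: {path}")
```

An empty result means every listed path exists; it does not verify that the downloaded files are complete.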
./assets/pretrained_v2
If you want to use the latest SOTA RMVPE vocal pitch extraction algorithm, you need to download the RMVPE weights and place them in the RVC root directory
https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt
AMD/Intel graphics card users need to download:
https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.onnx
```
### 2. Install FFmpeg
If you have FFmpeg and FFprobe installed on your computer, you can skip this step.
#### For Ubuntu/Debian users
### 2. Install the ffmpeg tool
If `ffmpeg` and `ffprobe` have already been installed, you can skip this step.
#### Ubuntu/Debian
```bash
sudo apt install ffmpeg
```
#### For MacOS users
#### MacOS
```bash
brew install ffmpeg
```
#### For Windows users
Download these files and place them in the root folder:
#### Windows
Download `ffmpeg.exe` and `ffprobe.exe` and place them in the root directory.
```bash
rvcmd tools/ffmpeg # RVC-Models-Downloader command
```
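On any platform, whether `ffmpeg` and `ffprobe` are already installed can be checked before deciding to skip this step. The sketch below uses the standard-library `shutil.which`; the function name `missing_tools` is hypothetical. Note that this is a `PATH` lookup, so a copy placed only in the project root may not be detected on every platform.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the tool names that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("not found on PATH: " + ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available")
```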
- [ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe)
- [ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe)
## ROCm Support for AMD graphic cards (Linux only)
To use ROCm on Linux install all required drivers as described [here](https://rocm.docs.amd.com/en/latest/deploy/linux/os-native/install.html).
### 3. Download the required files for the RMVPE vocal pitch extraction algorithm
On Arch use pacman to install the driver:
If you want to use the latest RMVPE vocal pitch extraction algorithm, you need to download the pitch extraction model parameters and place them in the RVC root directory.
- [rmvpe.pt](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt)
```bash
rvcmd assets/rmvpe # RVC-Models-Downloader command
```
#### Download the RMVPE model for the DML environment (optional, for AMD/Intel GPUs)
- [rmvpe.onnx](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.onnx)
```bash
rvcmd assets/rmvpe # RVC-Models-Downloader command
```
### 4. AMD ROCm (optional, Linux only)
If you want to run RVC on Linux with AMD's ROCm, first install the required drivers as described [here](https://rocm.docs.amd.com/en/latest/deploy/linux/os-native/install.html).
If you are using Arch Linux, you can use pacman to install the required drivers.
````
pacman -S rocm-hip-sdk rocm-opencl-sdk
````
You might also need to set these environment variables (e.g. on a RX6700XT):
For some graphics card models (for example, the RX6700XT), you may need to set the following environment variables.
````
export ROCM_PATH=/opt/rocm
export HSA_OVERRIDE_GFX_VERSION=10.3.0
````
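Whether the two variables above are visible to the process that launches RVC can be confirmed with a short check. This is a minimal sketch; the function name `unset_rocm_vars` is hypothetical, and only the variable names `ROCM_PATH` and `HSA_OVERRIDE_GFX_VERSION` come from the README. Whether they are needed at all depends on the graphics card.

```python
import os

# Variable names as given in the README.
ROCM_ENV_VARS = ("ROCM_PATH", "HSA_OVERRIDE_GFX_VERSION")


def unset_rocm_vars(environ=None):
    """Return the README's ROCm variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in ROCM_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    for name in unset_rocm_vars():
        print(f"{name} is not set in this shell")
```

Because `export` only affects the current shell session, running the check from the same terminal that will start the WebUI gives the most relevant answer.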
Make sure your user is part of the `render` and `video` group:
Also, make sure your current user is in the `render` and `video` user groups.
````
# $USER is set by the login shell; $USERNAME is not defined in every shell
sudo usermod -aG render "$USER"
sudo usermod -aG video "$USER"
````
## Get started
### start up directly
Use the following command to start WebUI:
## Getting Started
### Direct Launch
Use the following command to start the WebUI.
```bash
python infer-web.py
```
### Use the integration package
Download and extract file `RVC-beta.7z`, then follow the steps below according to your system:
#### For Windows users
Double-click `go-web.bat`
#### For MacOS users
### Linux/MacOS
```bash
sh ./run.sh
./run.sh
```
### For Intel IPEX users (Linux Only)
### For Intel GPU users who need IPEX (Linux only)
```bash
source /opt/intel/oneapi/setvars.sh
./run.sh
```
### Using the Integration Package (Windows Users)
Download and unzip `RVC-beta.7z`. After unzipping, double-click `go-web.bat` to start the program with one click.
```bash
rvcmd packs/general/latest # RVC-Models-Downloader command
```
## Credits
+ [ContentVec](https://github.com/auspicious3000/contentvec/)
+ [VITS](https://github.com/jaywalnut310/vits)
@@ -208,9 +220,8 @@ source /opt/intel/oneapi/setvars.sh
+ [audio-slicer](https://github.com/openvpi/audio-slicer)
+ [Vocal pitch extraction:RMVPE](https://github.com/Dream-High/RMVPE)
+ The pretrained model is trained and tested by [yxlllc](https://github.com/yxlllc/RMVPE) and [RVC-Boss](https://github.com/RVC-Boss).
## Thanks to all contributors for their efforts
<a href="https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/graphs/contributors" target="_blank">
<img src="https://contrib.rocks/image?repo=RVC-Project/Retrieval-based-Voice-Conversion-WebUI" />
</a>