PCVC Speech Phoneme Dataset (phoneme-based speech dataset)

If you use this dataset, please cite this paper:

Malekzadeh, S., Gholizadeh, M.H., Ghayoumi Zadeh, H., and Razavi, S.N., 2018. Persian phonemes recognition using PPNet. arXiv preprint arXiv:1812.08600.

About

To our knowledge, this is the first phoneme-based speech dataset in the world and the first free Persian speech dataset, created to help Persian speech researchers. What is published here is not the whole of it: the dataset is still growing.

The full dataset is available on Kaggle as the PCVC Speech dataset.

If you have ideas, time, or files that could help us grow the dataset, please contact us at:

URL: https://smalekz.github.io

What it is

This dataset contains 23 Persian consonants and 6 vowels. The sound samples cover every combination of a consonant and a vowel (23 × 6 = 138 samples for each speaker), and each sound sample is 30000 data samples long. The sample rate of all speech samples is 48000 Hz, which means there are 48000 data samples in every second, so each sound sample lasts 0.625 seconds. Each sample starts with a consonant, continues with a vowel, and ends in silence. The length of the silence depends on the length of the consonant-vowel combination. For example, if the combination ends at the 20000th data sample, the remaining 10000 data samples (up to 30000, the length of each sound sample) are silent.
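The figures above can be checked with a little arithmetic, and the trailing silence can be located with a simple amplitude threshold. The following is a minimal sketch on a synthetic signal, not the authors' method: the helper `speech_end_index` and its threshold value are assumptions made for illustration, and real denoised recordings may need a different threshold.

```python
import numpy as np

N_CONSONANTS = 23
N_VOWELS = 6
SAMPLE_RATE = 48000   # data samples per second
CLIP_LENGTH = 30000   # data samples per sound sample

# Every consonant is paired with every vowel.
samples_per_speaker = N_CONSONANTS * N_VOWELS   # 138
clip_seconds = CLIP_LENGTH / SAMPLE_RATE        # 0.625


def speech_end_index(signal, threshold=1e-3):
    """Return the index just past the last data sample louder than threshold.

    Hypothetical helper: the threshold is an assumed value, not part of the dataset.
    """
    loud = np.flatnonzero(np.abs(signal) > threshold)
    return int(loud[-1]) + 1 if loud.size else 0


# Synthetic stand-in for one recording: 20000 data samples of a tone,
# followed by 10000 data samples of silence.
t = np.arange(20000) / SAMPLE_RATE
clip = np.zeros(CLIP_LENGTH)
clip[:20000] = 0.5 * np.sin(2 * np.pi * 220 * t)

end = speech_end_index(clip)
print(samples_per_speaker, clip_seconds, end, CLIP_LENGTH - end)
```

On this synthetic clip the detected end matches the example in the text: the sound ends at data sample 20000 and the remaining 10000 data samples are silent.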

All the sound samples are denoised with the “Adaptive noise reduction” algorithm.

Phoneme List in PCVC dataset:

Here

How to use

Each file contains just a matrix “x”.

The suffix “N” (N for number) in the file names of the “Samples” directory means that the file is the Nth sample from that speaker. For example, “S00012.mat” is the 2nd sample from the speaker of “S0001.mat”.
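The naming rule can be sketched as a small parser. This is only an illustration inferred from the single example above: it assumes that the speaker ID is always “S” followed by four digits and that any remaining digits are the sample number N. The function name `parse_sample_name` is hypothetical.

```python
import re

# Assumed pattern: "S" + four-digit speaker ID + optional sample number N + ".mat".
_NAME_PATTERN = re.compile(r"^(S\d{4})(\d*)\.mat$")


def parse_sample_name(filename):
    """Split a file name into (speaker ID, sample number or None)."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    speaker, number = match.groups()
    return speaker, int(number) if number else None


print(parse_sample_name("S00012.mat"))
print(parse_sample_name("S0001.mat"))
```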

Matlab

All files are “.mat” files, the MATLAB data file format. Every file consists of a matrix with dimensions 1 × 23 × 6 × 30000, in which 23 is the number of consonants, 6 is the number of vowels, and 30000 is the length of each sound sample. The order of the phonemes follows the phoneme list (“Here”) above. To use a file, open it in MATLAB and click the “Finish” button to import the data into the MATLAB workspace.

Python

To use the “.mat” data files in Python, you can use the code below. It copies the matrix from each file into the “aud” variable and collects all of them in the “al” array (replace the path passed to glob.glob with your own path). Every file consists of a matrix with dimensions 1 × 23 × 6 × 30000, in which 23 is the number of consonants, 6 is the number of vowels, and 30000 is the length of each sound sample. The order of the phonemes follows the phoneme list (“Here”) above.

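import glob

import numpy as np
import scipy.io

# Collect every .mat file in the dataset directory (replace with your own path).
fns = sorted(glob.glob('../input/pcvcspeech/*.mat'))
al = []

for fn in fns:
    mat = scipy.io.loadmat(fn)  # dict mapping variable names to arrays
    aud = mat['x']              # the single matrix stored in each file
    al.append(aud)

# Stack the per-file matrices along a new leading axis (one entry per file).
al = np.array(al)
print(al.shape)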
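Once the files are stacked, a single consonant-vowel recording can be picked out by index, and the whole array can be flattened into one row per recording for training. The following is a minimal sketch on a synthetic array, under two assumptions that should be checked against the real files: that each matrix can be reshaped to 23 consonants by 6 vowels by 30000 data samples, and that the axes come in that order. The label arrays are plain zero-based indices, not the phoneme symbols from the phoneme list.

```python
import numpy as np

N_CONSONANTS, N_VOWELS, CLIP_LENGTH = 23, 6, 30000

# Synthetic stand-in for the stacked array `al` (two speakers), with an
# assumed singleton axis in front of the 23 x 6 x 30000 block.
al = np.zeros((2, 1, N_CONSONANTS, N_VOWELS, CLIP_LENGTH), dtype=np.float32)
al[1, 0, 4, 2, :] = 1.0   # mark one recording so it can be found again

# Reshape to (speaker, consonant, vowel, time); -1 infers the speaker count.
clips = al.reshape(-1, N_CONSONANTS, N_VOWELS, CLIP_LENGTH)

# One recording: speaker 1, consonant index 4, vowel index 2 (all zero-based).
one_clip = clips[1, 4, 2]

# Flatten to one row per recording and build matching integer labels.
X = clips.reshape(-1, CLIP_LENGTH)
n_speakers = clips.shape[0]
consonant_labels = np.tile(np.repeat(np.arange(N_CONSONANTS), N_VOWELS), n_speakers)
vowel_labels = np.tile(np.arange(N_VOWELS), N_CONSONANTS * n_speakers)

print(X.shape, consonant_labels.shape, vowel_labels.shape)
```

With this layout, row `s * 138 + c * 6 + v` of `X` holds the recording of consonant `c` and vowel `v` from speaker `s`.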

Acknowledgement

Many thanks to those who helped us develop the PCVC dataset, especially the speakers: Farideh Jabraili, Hedayat Malekzadeh, Hamed Afjuland, Mohammad Ataeizadeh, Tahereh Salari, Alireza Aghaei, Parisa Seyfpour, Sahel Soltani, Mina Bayarash, Milad Abdollahzadeh, Sadra Malekzadeh, …

Description

A Persian speech recognition dataset built from combinations of vowel and consonant phonemes. Guide code and explanations are included. For free guidance, send a message using the contact information in the Contact menu.

Saber Malekzadeh

Sabir Məlikzadə صابر ملک زاده