
Jingju (Beijing opera) Phoneme Annotation

Chinese version

Authors: Rong Gong, Rafael Caro Repetto, Yile Yang, MTG-UPF

Description

This dataset is a collection of boundary annotations of a cappella singing performed by professional and amateur Beijing opera (jingju, 京剧) singers.

The boundaries are annotated hierarchically: line (phrase), syllable, and phoneme singing units are marked on a jingju (Beijing opera) a cappella singing audio dataset.

The corresponding audio files are recordings of a cappella arias, stereo or mono, sampled at 44.1 kHz, and stored as wav files. Because of their large size, the audio files are not hosted here; they are available on Zenodo: http://doi.org/10.5281/zenodo.344932
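As a quick sanity check after downloading, the audio properties stated above (mono or stereo, 44.1 kHz) can be verified with Python's standard `wave` module. This is only a sketch: the function names are illustrative, and it assumes the files are plain PCM wav, which is the only variant `wave` reads.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_s) for a PCM wav file."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        rate = wf.getframerate()
        frames = wf.getnframes()
    return channels, rate, frames / float(rate)


def matches_dataset_format(path):
    """True if the file is mono or stereo and sampled at 44.1 kHz."""
    channels, rate, _ = describe_wav(path)
    return channels in (1, 2) and rate == 44100
```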

The wav files were recorded by two institutes: files whose names end with ‘qm’ were recorded by C4DM, Queen Mary University of London; files whose names end with ‘upf’ or ‘lon’ were recorded by MTG-UPF. If you use this audio dataset in your work, please cite both of the following publications:

Rong Gong, Rafael Caro Repetto, & Yile Yang. (2017). Jingju a cappella singing dataset [Data set]. Zenodo. http://doi.org/10.5281/zenodo.344932

D. A. A. Black, M. Li, and M. Tian, “Automatic Identification of Emotional Cues in Chinese Opera Singing,” in 13th Int. Conf. on Music Perception and Cognition (ICMPC-2014), 2014, pp. 250–255.
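The file-name convention described above can be turned into a small lookup, which may help when deciding which recordings come from which institute. The sketch below assumes the suffix is the last part of the file name before the extension; the example file names are hypothetical and do not come from the dataset.

```python
import os

# Suffix-to-institute mapping as stated in this README.
SUFFIX_TO_INSTITUTE = {
    "qm": "C4DM, Queen Mary University of London",
    "upf": "MTG-UPF",
    "lon": "MTG-UPF",
}


def recording_institute(filename):
    """Return the recording institute for a file name, or None if unknown."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    for suffix, institute in SUFFIX_TO_INSTITUTE.items():
        if stem.endswith(suffix):
            return institute
    return None


# Hypothetical file names, for illustration only.
print(recording_institute("example_aria_qm.wav"))
print(recording_institute("example_aria_upf.wav"))
```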

Details

Format: Praat TextGrid (described on the official Praat page)

Number of tiers: 5

Role types: dan, laosheng

Phoneme-level annotation units

1. This table shows the annotation units used in the 'pinyin', 'dian', 'dianSilence' and 'details' tiers of each TextGrid.

2. Both Chinese pinyin and X-SAMPA forms are given.

3. The initials b, p, d, t, k, j, q, x, zh, ch, sh, z, c, s are grouped into a single representation (not a formal X-SAMPA symbol): c

4. v, N, J (X-SAMPA) are three special pronunciations that do not exist in pinyin.

| Structure | | Pinyin[X-SAMPA] |
|---|---|---|
| head | initials | m[m], f[f], n[n], l[l], g[k], h[x], r[r\'], y[j], w[w]; {b, p, d, t, k, j, q, x, zh, ch, sh, z, c, s} - group [c]; [v], [N], [J] - special pronunciations |
| | medial vowels | i[i], u[u], ü[y] |
| belly | simple finals | a[a"], o[O], e[7], ê[E], i[i], u[u], ü[y], i (zhi, chi, shi) [1], i (zi, ci, si) [M] |
| | compound finals | ai[aI^], ei[eI^], ao[AU^], ou[oU^] |
| | nasal finals | an[an], en[@n], in[in], ang[AN], eng[7N], ing[iN], ong[UN] |
| | retroflexed finals | er [@][r\'] |
| tail | | i[i], u[u], n[n], ng[N] |
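The initials row of the table, together with the grouping rule in note 3, can be expressed as a simple lookup. The sketch below covers initials only; the names `GROUPED_INITIALS`, `INITIAL_TO_XSAMPA` and `initial_to_annotation` are illustrative and are not part of the dataset's own parsing code.

```python
# Initials that the annotation collapses into the single unit "c"
# (not a formal X-SAMPA symbol).
GROUPED_INITIALS = {
    "b", "p", "d", "t", "k", "j", "q", "x",
    "zh", "ch", "sh", "z", "c", "s",
}

# Remaining pinyin initials and their X-SAMPA units, copied from the table.
INITIAL_TO_XSAMPA = {
    "m": "m", "f": "f", "n": "n", "l": "l", "g": "k",
    "h": "x", "r": "r\\'", "y": "j", "w": "w",
}


def initial_to_annotation(initial):
    """Map a pinyin initial to the unit used in the annotation tiers."""
    if initial in GROUPED_INITIALS:
        return "c"
    # Raises KeyError for anything that is not a listed initial.
    return INITIAL_TO_XSAMPA[initial]
```

Note that pinyin `k` falls into the grouped unit `c`, while pinyin `g` is annotated as X-SAMPA `k`, so the lookup has to be done on the pinyin side.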

## Tier descriptions

* 1-line: line boundaries, with lyrics in Chinese characters.

* 2-pinyin: boundaries of written characters (syllables, in pinyin), not including padding characters. Silence is annotated.

* 3-dian: boundaries of written characters (in pinyin), including padding characters. Silence is annotated.

* 4-dianSilence: boundaries of written characters (in pinyin). Silence is not annotated explicitly; it is absorbed into the preceding dian syllable.

* 5-details: phoneme boundaries (in X-SAMPA).
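One way to read the relation between the dian and dianSilence tiers is that each silent interval is merged into the syllable before it. The sketch below illustrates that reading only. It assumes intervals are `(start, end, label)` tuples and that silence carries an empty label; both are assumptions, and this is not necessarily how the tiers were produced.

```python
def merge_silence_into_previous(intervals, silence_label=""):
    """Extend each labelled interval over the silence that follows it.

    A silent interval with no preceding interval (for example at the very
    start of a recording) is kept unchanged.
    """
    merged = []
    for start, end, label in intervals:
        if label.strip() == silence_label and merged:
            prev_start, _, prev_label = merged[-1]
            merged[-1] = (prev_start, end, prev_label)
        else:
            merged.append((start, end, label))
    return merged


# Made-up intervals, for illustration only.
dian = [(0.0, 0.5, ""), (0.5, 1.2, "wo"), (1.2, 1.5, ""), (1.5, 2.0, "men")]
print(merge_silence_into_previous(dian))
```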

## Usage

The annotation TextGrid files can be opened in Praat or with our parsing code; see this Jupyter notebook for some parsing examples.
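For readers who only need the interval boundaries, a TextGrid saved in Praat's long text format can also be read with a few lines of Python. The following is a minimal sketch, not the parsing code referred to above. It assumes interval tiers only, one field per line, and labels that do not span several lines. The file must be decoded to a string first, and the right encoding (often UTF-8 or UTF-16) has to be chosen by the caller.

```python
import re

_NAME = re.compile(r'^\s*name\s*=\s*"(.*)"\s*$')
_INTERVAL = re.compile(r'^\s*intervals\s*\[\d+\]:?\s*$')
_FIELD = re.compile(r'^\s*(xmin|xmax|text)\s*=\s*(.*?)\s*$')


def parse_textgrid(content):
    """Return {tier_name: [(xmin, xmax, label), ...]} from TextGrid text."""
    tiers = {}
    current = None   # interval list of the tier being read
    fields = None    # xmin/xmax of the interval being read, else None
    for line in content.splitlines():
        name = _NAME.match(line)
        if name:
            current = tiers.setdefault(name.group(1), [])
            fields = None
            continue
        if _INTERVAL.match(line):
            fields = {}
            continue
        field = _FIELD.match(line)
        if field is None or fields is None or current is None:
            continue  # header lines and tier-level xmin/xmax are skipped
        key, value = field.groups()
        if key == "text":
            # Drop the surrounding quotes; Praat doubles quotes inside labels.
            label = value[1:-1].replace('""', '"')
            current.append((fields["xmin"], fields["xmax"], label))
            fields = None
        else:
            fields[key] = float(value)
    return tiers
```

With the tier names listed above, `parse_textgrid(text)["details"]` would then give the phoneme intervals, provided the tiers in the files carry exactly those names.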

## License

This TextGrid annotation work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
