segbo's Introduction

SegBo: A Database of Borrowed Sounds in the World's Languages

SegBo is the first large-scale cross-linguistic database of borrowed phonological segments. It is a work in progress by:

and can be cited as:

If you use the SegBo data in your research, please cite the specific version for replicability purposes. We archive each release of SegBo in Zenodo.

SegBo data are available in the Cross-Linguistic Data Format here:

https://github.com/cldf-datasets/segbo

This data format integrates Glottolog metadata about the languages in the SegBo sample.

Preliminary studies based on SegBo have been presented at the following conferences:

  • Elad Eisen, Eitan Grossman, Dmitry Nikolaev and Steven Moran. Defining and operationalizing 'borrowability' in phonology. 5th Usage-Based Linguistics Conference (Tel Aviv, July 5-7 2021).

  • Steven Moran and Eitan Grossman. Temporal bias: a new type of bias for typologists to worry about. 5th Usage-Based Linguistics Conference (Tel Aviv, July 5-7 2021).

  • Eitan Grossman, Elad Eisen, Dmitry Nikolaev and Steven Moran. How different were phonological distributions?: The World Survey of Phonological Segment Borrowing and the Uniformitarian Assumption. Societas Linguistica Europaea 52 (Leipzig, August 2019). Slides here.

  • Eitan Grossman, Elad Eisen, Dmitry Nikolaev and Steven Moran. The typology of phonological segment borrowing. Association for Linguistic Typology 13 (Pavia, September 2019). Slides here.

  • Eitan Grossman and Steven Moran. What 'contact typologists' want from descriptive grammars. Descriptive Grammars and Typology: The Challenges of Writing Grammars of Underdescribed and Endangered Languages (Helsinki, March 2019). Slides here.

  • Eitan Grossman. Rethinking the Uniformitarian Hypothesis. Prague Linguistics (Prague, 2019).

The following published work is based on SegBo:

Several articles using SegBo are in the works, and we are working on setting up a website to make the data accessible via a GUI, so stay tuned.

segbo's People

Contributors

bambooforest, eitangrossman, eladeisen, macleginn

segbo's Issues

Double check Furu

@EitanGrossman -- can you double check these entries:

https://github.com/segbo-db/segbo/blob/master/data/SegBo%20database%20-%20Phonemes.csv#L1505-L1506

[h] is also listed in the PDF, but I can't easily find these two comments in our phonemes data:

long [r] occurs in interjections

appears to affect the native realisation of /z/. Other source languages too

There is also no mention of whether [h] is marginal rather than a borrowed segment. If so, let's mention that in the notes, since it's listed as marginal in phoible with the same ID and doculect (FYI, phoible doesn't explicitly state whether a segment is both marginal and a loan).

Mismatch Round inventories

332 | iwai1244 |   | Iwaidja
338 | gugu1254 |   | Koko Bera / Gugubera
343 | kitj1240 |   | Kija/Kitja
344 | mart1256 | kart1247 | Kartujarra

We should figure out if these are updated references from AusPhon 2.0, etc.

Still some dialects in the borrowed languages

amer1254 American Spanish
braz1246 Brazilian Portuguese
clas1254 Classical Tibetan
east2344 Eastern Khams
makl1245 Maklere Papunesia
tagb1257 Tagboussikan

Because these are dialect-level entries, they can't get geocoordinates from Glottolog.
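A minimal sketch of how such entries could be spotted, assuming a language table with the standard CLDF `LanguageTable` column names (`ID`, `Name`, `Glottocode`, `Latitude`, `Longitude`). The inline sample is a hypothetical excerpt, not real SegBo data: the `xxxx0000` row and its coordinates are placeholders, and `missing_coordinates` is a helper name made up for illustration.

```python
import csv
import io

# Hypothetical excerpt shaped like a CLDF LanguageTable.
# The last row and its coordinates are placeholders, not real data.
SAMPLE = """ID,Name,Glottocode,Latitude,Longitude
1,American Spanish,amer1254,,
2,Brazilian Portuguese,braz1246,,
3,Example Language,xxxx0000,1.5,2.5
"""


def missing_coordinates(handle):
    """Return (Glottocode, Name) pairs for rows lacking a latitude or longitude."""
    reader = csv.DictReader(handle)
    return [
        (row["Glottocode"], row["Name"])
        for row in reader
        if not row["Latitude"].strip() or not row["Longitude"].strip()
    ]


flagged = missing_coordinates(io.StringIO(SAMPLE))
for code, name in flagged:
    print(code, name)
```

To run this against a real file, pass an open file handle instead of the `io.StringIO` wrapper. The column names may need adjusting if the actual table differs.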

Remove merged tables

Since the data are now available via CLDF, remove the merged tables and Glottolog files from this repo.
