BEDLAN is committed to making the data used for our analyses freely available to the academic community, so that our findings can be replicated and extended.


UraLex basic vocabulary dataset (v1.0)

UraLex is a dataset consisting of lexical reflexes of 313 meanings from 26 Uralic languages. Most of the meanings originate from standardized basic vocabulary lists. The lexical reflexes are accompanied by multistate characters that represent their historical relationships.

Cite the dataset as:

Syrjänen, Kaj, Lehtinen, Jyri, Vesakoski, Outi, de Heer, Mervi, Suutari, Toni, Dunn, Michael, Määttä, Urho, Leino, Unni-Päivä. (2018). lexibank/uralex: UraLex basic vocabulary dataset. DOI:10.5281/zenodo.1459402

Digital dialect atlas of Finnish

Lauri Kettunen’s Dialect Atlas of Finnish from the 1940s includes 213 pages of dialectal features describing variation within the Finnish language. The atlas was originally digitized from its book format by the Finnish Dialect Atlas project, led by Sheila Embleton and Eric S. Wheeler and funded by the Social Sciences and Humanities Research Council of Canada. This data was checked for errors and converted into its current format by the BEDLAN research project. The atlas is available online as part of the Kotus Language Atlas.


In the future we intend to release the following datasets: