Computational methods applied to big datasets are compelling tools for historical linguistics

Digital approaches applied to big data play an increasingly important role in the humanities. However, there is skepticism about the accuracy and potential of computational methods for historical linguistics. A key task in the field is the identification of etymologically related words (cognates) that descend from a common ancestor, such as stone in English and Stein in German. Until now, cognate detection has been carried out exclusively by trained historical linguists who manually examine large datasets. This could change sooner rather than later, as a recent study by Johann-Mattis List, Simon Greenhill and Russell Gray from the Max Planck Institute for the Science of Human History shows. The team tested the capacity of different computational approaches to detect cognates, with striking results: the best-performing method detected word relationships with an accuracy of nearly 90%. This result not only confirms the potential of computational methodologies in the humanities, but also opens up exciting new pathways for future research in historical linguistics and human prehistory.
