Wikidelta: glimpsing unique Wikipedia articles in 284 world languages

This blog post is also available in Welsh (Cymraeg).


What is Wikidelta?

If you are curious about the languages and cultures of the world, follow Wikidelta on Twitter. It’s an unofficial, experimental account that attempts to discover what could be unique in each language version of Wikipedia.

The account selects a world language at random. It then posts links, one at a time, to unique Wikipedia articles in that language:

[Image: example of Wikidelta tweets for Persian/Farsi]

A unique article, as I define it here, is one which has no links to other language versions. In other words, there are no known translations, adaptations or other versions of that article. Every article shared has zero counterparts in any other language’s Wikipedia – at the time of tweeting.
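As a rough illustration of this definition, the check could be sketched in Python as follows. This is not the script behind Wikidelta. It assumes the article data comes from the MediaWiki API's `prop=langlinks` query with `formatversion=2`, and the function name `is_unique` is hypothetical.

```python
def is_unique(page: dict) -> bool:
    """Return True if a page record lists no interlanguage links.

    `page` is assumed to be one entry of query.pages from a MediaWiki API
    response (action=query, prop=langlinks, formatversion=2). Pages with
    no counterparts in other languages omit the "langlinks" key.
    """
    if page.get("missing"):
        # A page that does not exist cannot count as a unique article.
        return False
    return len(page.get("langlinks", [])) == 0


# Hand-written sample records for illustration, not real API output.
linked = {"title": "Cymru", "langlinks": [{"lang": "en", "title": "Wales"}]}
unlinked = {"title": "Example article"}

print(is_unique(linked))    # False
print(is_unique(unlinked))  # True
```

Because the result reflects the links recorded at the moment of the query, an article judged unique today may stop being unique once somebody adds a link.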

There are 284 language versions of Wikipedia currently active, all maintained largely by volunteers like you and me who create articles according to their interest and expertise.

Every link you see shared from the Wikidelta account is an example of the potential uniqueness of a topic expressed in a particular language, usually created by a user of that language.

Surprises every day

Each link offers us a moment to recognise a contribution and topic which may have received little attention, especially outside its own language community or communities.

For some languages it’s possible to get the gist of the article using automatic machine translation.

In the above example Wikidelta has chosen to post links in Persian/Farsi. The tweet announcing this uses the endonym first (the name of the language in the language itself), followed by the English name of the language, followed by a short hashtag which gives the language code (which is also its Wikipedia subdomain).
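The structure of that announcement can be sketched in a few lines of Python. The function names and the " / " separator are assumptions made for illustration, since the exact wording of the tweet is not specified here. Only the ordering (endonym, English name, hashtag with the language code) comes from the description above.

```python
def format_announcement(endonym: str, english_name: str, code: str) -> str:
    """Build an announcement: endonym first, English name, then a hashtag."""
    return f"{endonym} / {english_name} #{code}"


def wikipedia_home(code: str) -> str:
    """The language code doubles as the Wikipedia subdomain."""
    return f"https://{code}.wikipedia.org/"


print(format_announcement("Cymraeg", "Welsh", "cy"))  # Cymraeg / Welsh #cy
print(wikipedia_home("cy"))                           # https://cy.wikipedia.org/
```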

In the example, the randomly chosen link in the tweet appears to be about a film, one made in Persian/Farsi. According to machine translation the title conveys something like “Yassin Castle”. Please note that this is not necessarily a recommendation of this film (which I have not seen), although I am told that there are many magnificent Iranian films to reward the attention.

Poetry, literature, culture and more

What could be unique in each language’s Wikipedia?

My initial interest in the uniqueness of articles led to my creation of an automated account called UnigrywUnigryw in April 2016. This account was, and is, a forerunner of Wikidelta and is focused exclusively on articles in Cymraeg (Welsh).

Since it began, examples of articles unique to Welsh shared by this account have included:

All of these types of article are in some way connected to Wales and its language. I would expect to see parallels with the other languages shared by Wikidelta. For instance, there is a uniqueness to any given language’s poetry so we could expect that to be regularly highlighted in Wikidelta.

But sometimes the unique articles have no obvious connection to a nation or its language – except for the fact that somebody somewhere just wanted to create an article about a particular (or peculiar!) topic.


Adding interlanguage links

Sometimes an article appears unique because no Wikipedia contributor has yet managed to add interlanguage links pointing to its counterparts in other languages.

Wikipedia is an ever-growing and evolving project, so the perceived uniqueness might simply mean that a small edit has not yet been made.

If the meaning of the article is 100% obvious then that edit job can be accomplished by anybody, including non-fluent users, in a few seconds. This benefits not only Wikipedia but Wikidata as well.

(Here’s an example tweet for Wikipedia Gàidhlig where I have added interlanguage links to an article about a Westminster parliamentary constituency.)

Further research and development

I am just beginning to discover patterns as I examine the output of the underlying software script which powers the Wikidelta project.

For example the average article length and average number of images and other multimedia elements in an article appear to correlate with how well resourced a language may be.

I am also producing a chart of all the Wikipedia languages ordered by how ‘unique’ they are, and hope to share this another time.

In the meantime my intention is to add certain checks to Wikidelta which will be proxies for article quality, e.g. number of contributors, minimum length of article, multimedia elements and so on. At the time of writing the unique articles are chosen at random but I hope to add more to the algorithm, showcase the ‘best’ articles that each language can offer, and thereby burst our online filter bubbles in unexpected ways.
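A minimal sketch of how such checks might sit in front of the random choice is given below, in Python. Everything specific here is an assumption for illustration: the `Article` fields, the function names and especially the threshold values, none of which are taken from the actual Wikidelta script.

```python
import random
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    contributors: int   # number of distinct editors
    length: int         # article length, e.g. in bytes
    media_count: int    # images and other multimedia elements


def passes_quality_checks(article, min_contributors=2,
                          min_length=1500, min_media=1):
    """Apply simple proxies for quality; the thresholds are placeholders."""
    return (article.contributors >= min_contributors
            and article.length >= min_length
            and article.media_count >= min_media)


def pick_article(candidates, rng=random):
    """Pick a random unique article among those passing the checks."""
    good = [a for a in candidates if passes_quality_checks(a)]
    return rng.choice(good) if good else None
```

Returning `None` when nothing passes leaves the caller free to fall back to a purely random choice, which matters for smaller Wikipedias where few articles would meet any fixed threshold.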

How to help / acknowledgments

I hope that you enjoy Wikidelta and that you learn something fascinating about our world today.

If you would like to help then please follow the Wikidelta account and feel free to retweet any tweets you find interesting. Additionally you may wish to do some Wikipedia editing and improvement as a result of what you see. If your language is not on the list of Wikipedias and you want to start one with some other fluent users of your language then there may be somebody else who can help.

There is potential research work to be done here so please contact me if you’d like to work together on something.

You may also translate this article into your language and re-publish it elsewhere. It’s licensed under CC-BY-SA.

Thanks to Wikimedia UK for the opportunity to share this, and to IlltudFfrancon, Rhys Wynne and Huw Waters for help with the idea.
