Joachim Breitner

The diameter of German+English

Published 2018-05-23 in sections English, Digital World.

Languages never map directly onto each other. The English word fresh can mean frisch or frech, but frish can also be cool. Jumping from one words to another like this yields entertaining sequences that take you to completely different things. Here is one I came up with:

frechfreshfrishcoolabweisenddismissivewegwerfendtrashingverhauendbangingGeklopfeknocking – …

And I could go on … but how far? So here is a little experiment I ran:

  1. I obtained a German-English dictionary. Conveniently, after registration, you can get dict.cc’s translation file, which is simply a text file with three columns: German, English, Word form.

  2. I wrote a program that takes these words and first canonicalizes them a bit: Removing attributes like [ugs.] [regional], {f}, the to in front of verbs and other embellishment.

  3. I created the undirected, bipartite graph of all these words. This is a pretty big graph – ~750k words in each language, a million edges. A path in this graph is precisely a sequence like the one above.

  4. In this graph, I tried to find a diameter. The diameter of a graph is the longest path between two nodes that you cannot connect with a shorter path.

Because the graph is big (and my code maybe not fully optimized), it ran a few hours, but here it is: The English expression be annoyed by sb. and the German noun Icterus are related by 55 translations. Here is the full list:

  • be annoyed by sb.
  • durch jdn. verärgert sein
  • be vexed with sb.
  • auf jdn. böse sein
  • be angry with sb.
  • jdm. böse sein
  • have a grudge against sb.
  • jdm. grollen
  • bear sb. a grudge
  • jdm. etw. nachtragen
  • hold sth. against sb.
  • jdm. etw. anlasten
  • charge sb. with sth.
  • jdn. mit etw. [Dat.] betrauen
  • entrust sb. with sth.
  • jdm. etw. anvertrauen
  • entrust sth. to sb.
  • jdm. etw. befehlen
  • tell sb. to do sth.
  • jdn. etw. heißen
  • call sb. names
  • jdn. beschimpfen
  • abuse sb.
  • jdn. traktieren
  • pester sb.
  • jdn. belästigen
  • accost sb.
  • jdn. ansprechen
  • address oneself to sb.
  • sich an jdn. wenden
  • approach
  • erreichen
  • hit
  • Treffer
  • direct hit
  • Volltreffer
  • bullseye
  • Hahnenfuß-ähnlicher Wassernabel
  • pennywort
  • Mauer-Zimbelkraut
  • Aaron’s beard
  • Großkelchiges Johanniskraut
  • Jerusalem star
  • Austernpflanze
  • goatsbeard
  • Geißbart
  • goatee
  • Ziegenbart
  • buckhorn plantain
  • Breitwegerich / Breit-Wegerich
  • birdseed
  • Acker-Senf / Ackersenf
  • yellows
  • Gelbsucht
  • icterus
  • Icterus

Pretty neat!

So what next?

I could try to obtain an even longer chain by forgetting whether a word is English or German (and lower-casing everything), thus allowing wild jumps like hathuthüttelodge.

Or write a tool where you can enter two arbitrary words and it finds such a path between them, if there exists one. Unfortunately, it seems that the terms of the dict.cc data dump would not allow me to create such a tool as a web site (but maybe I can ask).

Or I could throw in additional languages!

What would you do?

Update (2018-06-17):

I ran the code again, this time lower-casing all words, and allowing false-friends translations, as suggested above. The resulting graph has – surprisingly – precisely the same diamater (55), but with a partly different list:

  • peyote
  • peyote
  • mescal
  • meskal
  • mezcal
  • blaue agave
  • maguey
  • amerikanische agave
  • american agave
  • jahrhundertpflanze
  • century plant
  • fächerlilie
  • tumbleweed
  • weißer fuchsschwanz
  • common tumbleweed
  • acker-fuchsschwanz / ackerfuchsschwanz
  • rough pigweed
  • grünähriger fuchsschwanz
  • love-lies-bleeding
  • stiefmütterchen
  • pansy
  • schwuchtel
  • fruit
  • ertrag
  • gain
  • vorgehen
  • approach
  • sich an jdn. wenden
  • address oneself to sb.
  • jdn. ansprechen
  • accost sb.
  • jdn. belästigen
  • pester sb.
  • jdn. traktieren
  • abuse sb.
  • jdn. beschimpfen
  • call sb. names
  • jdn. etw. heißen
  • tell sb. to do sth.
  • jdm. etw. befehlen
  • entrust sth. to sb.
  • jdm. etw. anvertrauen
  • entrust sb. with sth.
  • jdn. mit etw. [dat.] betrauen
  • charge sb. with sth.
  • jdm. etw. zur last legen
  • hold sth. against sb.
  • jdm. etw. nachtragen
  • bear sb. a grudge
  • jdm. grollen
  • have a grudge against sb.
  • jdm. böse sein
  • be angry with sb.
  • auf jdn. böse sein
  • be mad at sb.
  • auf jdn. einen (dicken) hals haben

Note that there is not actually a false-friend in this list – it seems that adding the edges just changed the order of edges in the graph representation and my code just happened to find a different diamer.

Comments

Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.