We use the log files that we automatically receive from our customers to improve their dictionaries. The log files show which words and names are in current use, where each word first came into use, and whether a word or name is used by many people or only for a short time. These statistics also show us where improvements are needed.
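As an illustration only, the kind of statistics described above could be gathered along the following lines. This is a minimal sketch, not our production system: the entry layout `(day, customer, word)` and the names `WordStats` and `summarize` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class WordStats:
    first_seen: date          # day the word first appeared in any log
    last_seen: date           # most recent day the word appeared
    first_customer: str       # customer whose log contained it first
    users: set = field(default_factory=set)  # distinct customers using it
    count: int = 0            # total number of occurrences


def summarize(entries):
    """Aggregate (day, customer, word) log entries into per-word statistics."""
    stats = {}
    # Sorting puts the earliest entry for each word first.
    for day, customer, word in sorted(entries):
        s = stats.get(word)
        if s is None:
            s = stats[word] = WordStats(day, day, customer)
        s.last_seen = max(s.last_seen, day)
        s.users.add(customer)
        s.count += 1
    return stats


# Hypothetical log entries, for illustration only.
entries = [
    (date(2010, 4, 16), "paper_b", "Eyjafjallajökull"),
    (date(2010, 4, 15), "paper_a", "Eyjafjallajökull"),
    (date(2010, 4, 15), "paper_a", "flashmob"),
]
stats = summarize(entries)
```

A word seen by many customers over a long span is a candidate for a permanent dictionary entry, while one with a single user or a short span may only be a passing term.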
Our corrections already have a high success rate, but to raise the standard further we have started a pilot project in which current, relevant names are added automatically each week to some of our customers' dictionaries. In this pilot project, the number of unknown names has fallen by more than 11%, and the number of corrections has increased significantly. While results from search engines can show which keywords are used most frequently, we can show which words and names are relevant right now.
In April 2010, the ash cloud from the volcanic eruption in Iceland was headline news, and we recorded up to nine different spellings of the name “Eyjafjallajökull”. Customers participating in this project already had the name of the volcano in their dictionaries when they wrote about the eruption, so they could be sure of the correct spelling. Without these updates, it is not uncommon to see several spellings of a name within the same publication.
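To illustrate how a dictionary entry helps here, the sketch below flags words that closely resemble a known name but are spelled differently. It uses the similarity ratio from Python's standard `difflib` module; the function name `find_variants` and the threshold of 0.8 are assumptions chosen for the example, not a description of our correction engine.

```python
from difflib import SequenceMatcher


def find_variants(canonical, words, threshold=0.8):
    """Return words that look like misspellings of the canonical name."""
    target = canonical.casefold()
    variants = []
    for word in words:
        candidate = word.casefold()
        if candidate == target:
            continue  # already spelled correctly
        # ratio() is 1.0 for identical strings and falls towards 0.0
        # as the strings share fewer characters in order.
        if SequenceMatcher(None, target, candidate).ratio() >= threshold:
            variants.append(word)
    return variants


# Hypothetical words taken from a single publication.
words = ["Eyjafjallajokull", "Eyjafjallajökull", "Eyjafjalajökull", "Reykjavik"]
variants = find_variants("Eyjafjallajökull", words)
```

Each flagged variant can then be corrected to the dictionary spelling, so that only one form of the name appears in print.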