We at Combine saw this situation as an opportunity to build a tool that uses machine learning to break down the language barriers to official information. Using Google’s Universal Sentence Encoder (GUSE), we took advantage of a pre-trained model that has been exposed to texts in 16 different languages. The model supports semantic retrieval: given a query, it finds the closest semantic match (closest in meaning) from a list of known queries. Its multilingual variants are built on CNN and Transformer encoders, which produce the contextual embeddings that help us deal with more complex queries. More technical details can be found at https://arxiv.org/pdf/1907.04307.pdf.
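To make the idea of semantic retrieval concrete, here is a minimal sketch in Python. It is not our production code: `toy_embed` is a hypothetical stand-in that simply counts words, whereas the real encoder returns dense vectors that also match across languages and paraphrases. The example questions are invented for illustration. The retrieval step itself works the same way in both cases: embed the query, embed every known question, and return the known question with the highest cosine similarity.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for the sentence encoder: a bag-of-words vector."""
    cleaned = text.lower().replace("?", " ").replace(",", " ")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, known_questions, embed=toy_embed):
    """Return (best_question, score) for the known question closest to the query."""
    query_vec = embed(query)
    scored = [(question, cosine(query_vec, embed(question)))
              for question in known_questions]
    return max(scored, key=lambda pair: pair[1])


# Invented example questions, not the agency's actual FAQ entries.
KNOWN_QUESTIONS = [
    "What are the symptoms of covid-19?",
    "Can I go to work if I feel ill?",
    "When will a vaccine be available?",
]

best_question, score = retrieve("Can I go to work tomorrow?", KNOWN_QUESTIONS)
print(best_question, round(score, 2))
```

Swapping `toy_embed` for a multilingual sentence encoder is what would let a query written in Arabic match a question stored in Swedish or English, since word counts alone share nothing across languages.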
Using GUSE, we were able to train a model using the questions and answers found on the website of the Swedish Public Health Agency (Folkhälsomyndigheten). We then created a simple UI that allows anyone to write a query in one of the supported languages and also choose the language in which they would like to read the recommendations. See below for an example of the question “Can I go to work tomorrow?” in Arabic.
Source: c19swe.info
Since this model is trained on only around 100 questions, we cannot expect perfect results, but it still performs reasonably well, especially on shorter queries that use targeted keywords such as “vaccine” and “symptoms”. As we continue to add more data, the results should become more reliable.
The live version of our web application can be found at http://c19swe.info. We hope it will be a useful service for those looking for official recommendations in the many languages that reflect Sweden’s linguistic diversity, so that we can all work together to flatten the curve.