On September 21-22, 2022, Mykolas Romeris University’s Faculty of Human and Social Studies organized the conference “LLOD Approaches for Language Data Research and Management” (LLODREAM 2022).
During the conference, researchers from Austria and Spain gave presentations, among them Ass. Prof. Dr. Dagmar Gromann of the University of Vienna Centre for Translation Studies and Dr. Jorge Gracia, Senior Researcher at the Zaragoza University Department of Computer Science.
The main research areas of Dr. Gromann and Dr. Gracia are the creation of linguistic linked data and its application in various areas through technologies based on such data. The LLODREAM 2022 presenters were asked about the importance and use of linguistic linked open data.
-Why did you choose this research topic? Why is it important, and how is it useful?
Dr. Gromann: I chose this research topic because I wanted to connect two fields that at first glance appear very different: linguistics and informatics. Linguistic linked data is useful for presenting existing language resources in such a way that they can be read not only by a person but also by a machine.
Dr. Gracia: Being in close contact with linguists, I understand that this field is not only very interesting but also connected with Web technologies. However, one of the main reasons I chose this field of research was the aim of reducing language barriers in Europe and throughout the world.
-What is the concept of linguistic linked data?
Dr. Gromann: It is difficult to define the concept of linguistic linked data in a few words. The idea is to present language data in a format that machines can understand. When language data is presented in an appropriate format, it can be researched in new ways and new knowledge can be obtained. In other words, on the Internet and elsewhere (files, archives, webpages, etc.) we have an endless amount of language data, but it is not interconnected. To get meaningful results, it is very important that this data be represented in a common format. Once the data is linked in such a format, a lot of information can be obtained in a short time, for example with the Google search engine.
Dr. Gracia: To understand this concept, it helps to look at everything more simply. Imagine points that are connected by lines. When we collect and distribute data, we can obtain various kinds of information, but with linguistic linked data all the collected information becomes linked.
The researchers illustrated this with examples at the conference. If we are searching for information about a person, for example the Lithuanian painter and composer M. K. Čiurlionis, we simply use Google or another Internet search engine. We then get a list of pages containing a lot of information about his life, works of art, etc. This is not a convenient way to obtain information, because we have to click on each link and only then see whether the information is useful. If all the necessary data were directly linked, however, it would be possible to find all this information in a single knowledge graph. That is a huge advantage. Currently, however, the Web operates on a different principle.
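The contrast described above, between pages that have to be opened one by one and data that is directly linked, can be illustrated with a minimal Python sketch. It is only an illustration, not the method the researchers presented: the triple layout and every identifier (`ex:Ciurlionis`, `ex:occupation`, and so on) are hypothetical, and the only facts used are those mentioned in the text.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) triple, the basic unit of linked data.
# All identifiers below are hypothetical and invented for this illustration.
TRIPLES = [
    ("ex:Ciurlionis", "ex:name", "M. K. Čiurlionis"),
    ("ex:Ciurlionis", "ex:occupation", "ex:Painter"),
    ("ex:Ciurlionis", "ex:occupation", "ex:Composer"),
    ("ex:Ciurlionis", "ex:nationality", "ex:Lithuanian"),
    ("ex:Painter", "ex:label", "painter"),
    ("ex:Composer", "ex:label", "composer"),
    ("ex:Lithuanian", "ex:label", "Lithuanian"),
]


def describe(subject, triples):
    """Collect everything stated about one subject in a single pass."""
    facts = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            facts[p].append(o)
    return dict(facts)


def label(node, triples):
    """Follow the link from a node to its human-readable label, if it has one."""
    for s, p, o in triples:
        if s == node and p == "ex:label":
            return o
    return node


profile = describe("ex:Ciurlionis", TRIPLES)
occupations = [label(o, TRIPLES) for o in profile["ex:occupation"]]
print(profile["ex:name"][0], "-", ", ".join(occupations))
```

Because every fact points to another node, one lookup gathers what would otherwise be scattered across several pages.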
-What influence does linguistic linked data have on education and other sectors?
Dr. Gromann: Linguistic linked data technologies are useful in the education sector because they allow information about a specific subject to be systematized and presented clearly. Let’s say you are studying psychology and want to find out the main fields that this area encompasses. You can look at a knowledge graph created using linguistic linked data and you will know the answer. As I mentioned earlier, this helps to acquire new knowledge. If you take an interest in a new field, expanding your knowledge becomes much simpler.
Dr. Gracia: I would like to add to my colleague’s answer by providing an example. In Italy, researchers working on the project “LiLa” (Linking Latin), which studies the Latin language, encounter information that is scattered across different sources: some is presented in dictionaries and grammar books, and some in translations, in the form of notes. The project researchers systematize the existing information and convert it to an electronic format. All of this is done so that those studying Latin can quickly access all the necessary information in the same format. Other sectors have also begun to apply linguistic linked data. Fields such as labour law, pharmaceuticals, and trade also aim to present systematized information.
-How does linguistic linked data influence our decisions regarding vacation destinations, the choice of an educational institution, etc.?
Dr. Gromann and Dr. Gracia: Linguistic linked data technologies influence our decisions by presenting information in a particular manner. Because the presented information is clearly systematized, it is simpler to compare options, and doing so doesn’t take a lot of time.
-What qualities are necessary for today’s young people who intend to study and work in the field of language technologies (including linguistic linked data)?
Dr. Gromann: In my opinion, young people should be interested not only in languages but also in the principles by which the technologies operate. For a young person today who intends to study and work in the field of language technology, it is important to also be interested in technology, because it is an integral part of the field.
Dr. Gracia: I would say that such a young person should want to work with such resources and be open to new information. When working in this field, you have to be involved with both the technological and the linguistic aspects. Therefore, it is very important to understand that there is a lot of new information to absorb in this field.
-What should you pay attention to when building your image or promoting your activities? Do you have any tips on how to use linguistic linked data in this context?
Dr. Gromann: It is important for every organization to systematize its publicly available information. Let’s say I’m an IKEA employee. As we know, every product has a specific name. If I don’t have clearly structured information, it will be extremely difficult to do my work. The same is true from the user’s point of view. If the information provided by an organization is misleading, it is likely that the consumer will choose another organization that provides clearer information.
Dr. Gracia: If we talk about applying linguistic linked data technologies to building one’s image or advertising one’s activities, I would say that this type of data is too specific for these spheres. It is important to understand that linguistic linked data is expressed in words, which is usually not enough for businesses. Such data can help when advertising your activity or image, but we must not forget that visual data is also very important here. This is especially true if we are talking about a business that offers relevant products. To make it easier to understand, I will provide an example. Imagine the website of any company. We can find semantic information there as well as links to other websites, etc. Search engines recognize semantically annotated data on a website and can provide a more complete result. So, when you go abroad and stay in a hotel, remember that you found the price of the room in your Internet browser because this information was semantically annotated on the hotel’s website. This is an example of added value. As I said, this is not linguistic linked data as such but the broader family of linked data technologies. But that doesn’t mean that we can’t apply certain things for ourselves or for our business.
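The hotel example can be sketched in Python as follows. This is an illustration under stated assumptions: the hotel name and price range are invented, and the snippet only mimics the kind of schema.org annotation in JSON-LD form that a website might embed. It does not show how any particular search engine actually processes such data.

```python
import json

# Hypothetical annotation a hotel page might embed.
# The name and price range are invented for this illustration.
ANNOTATION = """
{
  "@context": "https://schema.org",
  "@type": "Hotel",
  "name": "Example Hotel",
  "priceRange": "80-120 EUR"
}
"""


def summarize(annotation_text):
    """Return a one-line summary if the annotation describes a hotel, else None."""
    data = json.loads(annotation_text)
    if data.get("@type") != "Hotel":
        return None
    return f'{data["name"]}: {data["priceRange"]}'


summary = summarize(ANNOTATION)
print(summary)
```

Because the price is labeled explicitly rather than buried in free text, a program can read it without guessing which number on the page is the room rate.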
Finally, the researchers emphasized that this field, like any other, has certain trends. The application of linguistic linked data technologies is particularly relevant for languages that do not have sufficient digital resources. Lithuanian is an example of such a language.
Both conference participants agree that the most important aim of this work is to reduce language barriers among the world’s population. The application of linguistic linked data is developing very rapidly. We may not even notice when the smart assistant Alexa starts speaking Lithuanian.