Odniesienie reprezentacji znaczeń w dużych modelach językowych do klasycznych teorii przestrzeni semantycznych

Kotlewski, Rafał

Bachelor's thesis

License

Closed access

Author

Kotlewski Rafał

Supervisor

Nowak Andrzej

Publication date

2025

Abstract

This study analyzes potential similarities and differences in how meanings are represented in the human mind and in large language models (LLMs). The concept of a semantic space is adopted as the theoretical model of how meanings are organized.

In the empirical part of the study, separate semantic spaces were constructed for humans and for LLMs. Human participants rated the pairwise similarity of ten sports disciplines. Corresponding data were obtained by asking language models to rate the same pairs of concepts. From the resulting similarity matrices, low-dimensional semantic spaces were reconstructed for both systems using the INDSCAL method.
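The step from pairwise ratings to a low-dimensional space can be illustrated with a minimal sketch. It is not the thesis's procedure: INDSCAL fits a shared space plus per-source dimension weights, whereas the code below uses classical (Torgerson) multidimensional scaling on a single matrix as a simpler stand-in. The item names, the 1 to 7 rating scale, the rating values, and the function names are all hypothetical and chosen only for illustration.

```python
import numpy as np

def dissimilarity_matrix(items, ratings, max_rating):
    """Turn pairwise similarity ratings into a symmetric dissimilarity matrix."""
    index = {name: i for i, name in enumerate(items)}
    d = np.zeros((len(items), len(items)))
    for (a, b), rating in ratings.items():
        i, j = index[a], index[b]
        # Higher rated similarity means smaller dissimilarity.
        d[i, j] = d[j, i] = max_rating - rating
    return d

def classical_mds(d, k=2):
    """Classical (Torgerson) MDS: a stand-in here, NOT INDSCAL."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering   # double-centered matrix
    vals, vecs = np.linalg.eigh(b)                # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]              # indices of the k largest
    scale = np.sqrt(np.clip(vals[top], 0.0, None))
    return vecs[:, top] * scale                   # (n_items, k) coordinates

# Made-up ratings on an assumed 1-7 scale; not data from the thesis.
items = ["football", "basketball", "tennis", "swimming"]
ratings = {
    ("football", "basketball"): 6,
    ("football", "tennis"): 3,
    ("football", "swimming"): 1,
    ("basketball", "tennis"): 4,
    ("basketball", "swimming"): 2,
    ("tennis", "swimming"): 2,
}
d = dissimilarity_matrix(items, ratings, max_rating=7)
coords = classical_mds(d, k=2)
```

In an INDSCAL analysis, one such matrix per participant (or per model) would be fitted jointly rather than scaled one at a time.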

The analysis revealed a high degree of structural agreement between the two spaces: the individual dimensions derived from the LLMs correlated with the corresponding human-derived dimensions. These results suggest that, despite fundamentally different modes of learning (multimodal experience vs. analysis of text), LLMs can construct internal representations of meaning that resemble those formed in the human cognitive system.
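A dimension-by-dimension comparison of this kind could be sketched as follows. INDSCAL dimensions are not freely rotatable, so correlating matching columns of two configurations is meaningful, although a dimension's sign may still be flipped. The function name and the synthetic coordinates are hypothetical; the abstract does not state how the correlations were computed, and no values from the thesis are reproduced here.

```python
import numpy as np

def dimension_correlations(human, model):
    """Pearson r between matching dimensions of two (n_items, n_dims) configurations."""
    human = np.asarray(human, dtype=float)
    model = np.asarray(model, dtype=float)
    if human.shape != model.shape:
        raise ValueError("configurations must have the same shape")
    return np.array([
        np.corrcoef(human[:, k], model[:, k])[0, 1]
        for k in range(human.shape[1])
    ])

# Synthetic coordinates for ten items in two dimensions (illustration only).
rng = np.random.default_rng(0)
human = rng.normal(size=(10, 2))
model = human + 0.1 * rng.normal(size=(10, 2))  # a noisy copy of the "human" space

r = dimension_correlations(human, model)
```

Because a reflected dimension yields a negative correlation of the same magnitude, the absolute value of each coefficient is the more informative quantity when judging agreement.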

Keywords

artificial intelligence

semantic space

natural language processing

large language models

Other title

Referencing the representation of meanings in large models to classical theories of semantic spaces

Publisher

Uniwersytet Warszawski

Defense date

2025-07-10

Open access license