Abstract
Labeled data is widely used in classification tasks. A major challenge, however, is that labels are often added manually, and wrong labels added by malicious users degrade model training. This unreliability of labeled data has hindered research. To address these problems, we propose a framework for Label Noise Filtering and Missing Label Supplement (LNFS), and we implement it using location labels in Location-Based Social Networks (LBSN) as an example. For label noise filtering, we first use FastText to transform a restaurant's labels into vectors. Then, based on the assumption that the label most similar to all other labels of the location is the most representative, we use cosine similarity to select the most representative label. For missing labels, we use a simple common-word similarity to measure the similarity of users' comments, and then use the labels of similar restaurants to supplement the missing ones. To improve the reliability of the model, we introduce game theory to simulate the game between malicious users and the model. Finally, a case study illustrates the effectiveness and reliability of LNFS.
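The two similarity steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-dimensional vectors stand in for FastText embeddings, the mean-cosine scoring rule and the Jaccard-style word overlap are assumptions chosen for illustration, and the function names `most_representative` and `supplement_label` are hypothetical. The game-theoretic component is not covered.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_representative(label_vectors):
    """Return the label whose mean cosine similarity to all other
    labels of the same location is highest (assumed scoring rule)."""
    best_label, best_score = None, -math.inf
    for label, vec in label_vectors.items():
        sims = [cosine(vec, other_vec)
                for other, other_vec in label_vectors.items()
                if other != label]
        score = sum(sims) / len(sims) if sims else 0.0
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def common_word_similarity(comment_a, comment_b):
    """Share of common words between two comments (Jaccard overlap,
    an assumed stand-in for the paper's common-word similarity)."""
    words_a = set(comment_a.lower().split())
    words_b = set(comment_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def supplement_label(comment, labeled_comments):
    """Borrow the label of the most similar labeled comment.
    labeled_comments is a list of (comment_text, label) pairs."""
    best = max(labeled_comments,
               key=lambda pair: common_word_similarity(comment, pair[0]))
    return best[1]


# Toy vectors (hypothetical, in place of FastText output).
labels = {
    "pizza": [1.0, 0.1],
    "italian": [0.9, 0.3],
    "car wash": [0.0, 1.0],
}
representative = most_representative(labels)

# Toy comments (hypothetical) for the missing-label step.
known = [
    ("best pizza and thin crust in town", "pizza"),
    ("fresh sushi and sashimi", "japanese"),
]
filled = supplement_label("great thin crust pizza", known)
```

In this toy data the outlying label "car wash" scores lowest against the other two, so it is never selected as representative, and the unlabeled comment takes its label from the only comment with which it shares words.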
Original language | English |
---|---|
Pages (from-to) | 887-895 |
Number of pages | 9 |
Journal | Digital Communications and Networks |
Volume | 9 |
Issue number | 4 |
Publication status | Published - Aug 2023 |
Externally published | Yes |
Keywords
- Cosine similarity
- FastText
- Game theory
- Label noise
- LSTM
ASJC Scopus subject areas
- Hardware and Architecture
- Computer Networks and Communications