Kun Ma

Attention-Based Learning of Self-Media Data for Marketing Intention Detection

Abstract
In natural language processing, accurate intention detection is the basis for subsequent research on human-machine speech interaction. However, ambiguity in word vectors reduces the accuracy of intent detection. In addition, local features and global features are disconnected, so the extracted text features cannot fully reflect semantic information. These issues are barriers to intention detection. This paper therefore proposes an attention-based convolutional neural network for self-media data learning (called A-CNN) to detect marketing intention. We cascade the traditional CNN with the self-attention model from attention networks to form a new network structure, A-CNN, and put forward a fast feature extraction method based on skip-gram learning, called FSLText, to represent the high-dimensional word vectors in A-CNN. While maintaining the advantages of the CNN, A-CNN solves the disconnection between local and global features caused by the CNN pooling layer and avoids an increase in algorithm complexity. The self-attention mechanism optimizes the weight of local features within the global features and retains the local features that are more useful for intention detection. The skip-gram-based fast feature extraction method retains the semantic and word order information of the text, which benefits marketing intention detection. In our experiments, A-CNN improves accuracy by 12.32% compared with traditional machine learning methods, by 9.68% compared with the dual-channel CNN, and by 9.97% compared with ATT-CNN.
On the F1 score, A-CNN improves by about 9.37% over traditional machine learning methods; the reported gains over the dual-channel CNN and over ATT-CNN are both 9.68%. These results indicate that A-CNN can effectively address semantic representation and feature selection for marketing intention detection.

Contributions
-Attention-based convolutional neural network (abbreviated as A-CNN). We cascade the self-attention mechanism of attention networks with the CNN to form a new A-CNN structure. In a traditional CNN, the data retained by max pooling and average pooling in the pooling layer may not be useful for intent recognition. We therefore add the attention mechanism to the pooling layer. The attention distribution over the data is calculated, the input information is averaged with these attention weights, and the result is sent to the fully connected layer together with the ordinary averaged information. This retains more information that is useful for classification than simple max pooling and average pooling. The self-attention mechanism captures local and global features more flexibly, so the ratio of local features to global features can be optimized in A-CNN for intent detection. Compared with ATT-CNN, which places the attention model before the CNN convolutional layer, A-CNN not only addresses the syntactic and semantic problems that depend on the feature extraction method, but also addresses feature loss in the pooling layer by cascading the self-attention mechanism into it.
-Fast Skip-gram-based learning of word representations (abbreviated as FSLText). We propose a feature extraction method, based on the Skip-gram model of word2vec, to represent the high-dimensional word vectors in A-CNN. Each word is represented by its character n-grams. This takes local word order into account and also handles out-of-vocabulary words: word vectors can still be constructed for words outside the training vocabulary. Because it considers local word order, the FSLText word vector gives A-CNN better recognition of newly derived words than ordinary word vectors.
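
The attention-based pooling described in the first contribution can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the dot-product scoring against a single vector `query`, and the names `attention_pool` and `softmax`, are assumptions made for illustration. The sketch only shows the idea of concatenating an attention-weighted average with an ordinary average before the fully connected layer.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Pool T feature vectors of size d into one vector of size 2*d.

    features: list of T vectors (lists of d floats), e.g. convolution outputs.
    query: hypothetical learned scoring vector of length d (an assumption).
    """
    # Attention distribution: one dot-product score per position, then softmax.
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in features]
    weights = softmax(scores)
    d = len(features[0])
    # Attention-weighted average of the inputs.
    attended = [sum(w * vec[j] for w, vec in zip(weights, features))
                for j in range(d)]
    # Ordinary (uniform) average of the same inputs.
    mean = [sum(vec[j] for vec in features) / len(features) for j in range(d)]
    # Both summaries are concatenated as input to the fully connected layer.
    return attended + mean, weights

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(features, query=[5.0, 0.0])
print(len(pooled), [round(w, 3) for w in weights])
```

With a zero `query` the attention weights are uniform and the two halves of the pooled vector coincide; a non-zero `query` shifts weight toward the positions it scores highly, which is the behaviour that plain max or average pooling cannot provide.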
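
The character n-gram idea behind FSLText can be sketched in the same hedged way. The boundary markers `<` and `>`, the n-gram range of 3 to 4, the hashing into buckets, and the averaging step are assumptions borrowed from common subword-embedding practice, not details confirmed by the paper. `ngram_vector` is a hypothetical stand-in for a learned embedding table.

```python
import hashlib

def char_ngrams(word, n_min=3, n_max=4):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = "<" + word + ">"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams

def ngram_vector(gram, dim=4, buckets=1000):
    """Stand-in for a learned embedding row (hypothetical).

    The n-gram is hashed into a bucket and a deterministic vector is
    derived from the bucket id; a trained model would look up weights here.
    """
    digest = hashlib.md5(gram.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    return [((bucket * (j + 1)) % 97) / 97.0 for j in range(dim)]

def word_vector(word, dim=4):
    """Average the vectors of a word's n-grams; no vocabulary lookup needed."""
    vecs = [ngram_vector(g, dim) for g in char_ngrams(word)]
    if not vecs:  # word too short to yield any n-gram
        return [0.0] * dim
    return [sum(v[j] for v in vecs) / len(vecs) for j in range(dim)]

# A derived word shares n-grams with its base word, so it still gets a vector.
shared = set(char_ngrams("sale")) & set(char_ngrams("presale"))
print(sorted(shared))
print(word_vector("presale"))
```

Because the vector is built from n-grams rather than looked up by whole word, a word absent from the training vocabulary still receives a representation, and words that share substrings share part of it.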

Code & Data

The experiments use the data set from the SOHU content algorithm contest.

The data set includes the text content and the label of each news item. The labels are: 0: no marketing intention, 1: part of the text has marketing intention, 2: the whole news item has marketing intention.
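
As a small illustration of the label scheme, the sketch below maps the three label values to their meanings and counts records per class. The `(text, label)` record layout and the toy records are hypothetical; the actual file format of the contest data is not described here.

```python
from collections import Counter

# Label meanings as described above.
LABEL_NAMES = {
    0: "no marketing intention",
    1: "part of the text has marketing intention",
    2: "the whole news item has marketing intention",
}

def label_counts(records):
    """Count records per label; reject labels outside {0, 1, 2}."""
    counts = Counter()
    for _text, label in records:
        if label not in LABEL_NAMES:
            raise ValueError(f"unexpected label: {label!r}")
        counts[label] += 1
    return counts

# Hypothetical toy records, not taken from the contest data.
records = [
    ("plain news report", 0),
    ("report ending with an advertisement", 1),
    ("pure advertisement", 2),
    ("another plain report", 0),
]
for label, count in sorted(label_counts(records).items()):
    print(label, LABEL_NAMES[label], count)
```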

Data Resources
SOHU Competition

Code & Data
Data: https://pan.baidu.com/s/14YPc0_gt2FAsRCztOAEU9w (extraction code: EAAI)
Code: https://pan.baidu.com/s/1h8IzCvnsoAvxEdnT8J3zew (extraction code: EAAI)
