That's it: the code is written, and all that's missing is a girlfriend.
The end result monitors a Weibo user's newly posted Weibo. If a post is judged to express negative emotion, a warning is issued (such as a mobile notification, an email notification, or an automatically posted Weibo with an image like the one above).
Project Address#
https://github.com/DIYgod/Weibo2RSS (outputs negative-emotion Weibo posts as an RSS feed)
https://github.com/DIYgod/Text2Emotion (analyzes the emotion value of a sentence)
Usage#
Connect the negative-emotion Weibo RSS feed to IFTTT. The specific settings are shown in the figure below. The trigger is new content appearing in the RSS feed, and the action is posting a Weibo (which can also be changed to a mobile notification, an email notification, and so on).
Development Process#
Below is my development process.
1. Word Segmentation#
I couldn't do this myself, so I had to find ready-made solutions. I found the following:
Jieba Chinese Word Segmentation
Harbin Institute of Technology Language Technology Platform
Sina Cloud Chinese Word Segmentation
Except for Tencent AI, all of these are free or open source. After a simple comparison, I chose Xunfei (iFlytek), which is also used by Smartisan's Big Bang feature.
2. Emotion Analysis#
The key to this is the dictionary, and I also found ready-made ones:
NTUSD Chinese Sentiment Polarity Dictionary
Dalian University of Technology Emotional Vocabulary Ontology
The Dalian University of Technology ontology annotates over 20,000 words with information such as part of speech, emotion category, emotion intensity, and polarity, like this:
It looks good, so I chose this one.
After downloading the dictionary, save it as a CSV file, and then import it into the MongoDB database.
mongoimport -d emotion -c emo --type csv --headerline --file emotion.csv
3. Emotion Value Calculation#
Segment the text to be analyzed into words, then add up the emotion value of each word to get the emotion value of the whole Weibo.
There is plenty of room for smarter algorithms here, but for simplicity I only did accumulation.
After writing it, I found that the results were very poor. The dictionary was too small and many words were missing, so many sentences couldn't be judged at all.
In the end, I abandoned all of the above and used Tencent AI's paid service directly...
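The accumulation step can be sketched roughly as follows. This is only an illustration, not the code from Text2Emotion: the in-memory `dictionary`, its entries, the field names `intensity` and `polarity`, and the convention that polarity 1 means positive and 2 means negative are all assumptions made for the example. The real lookup would go through the MongoDB collection imported above.

```javascript
// Toy stand-in for the imported dictionary; entries and field names are hypothetical.
const dictionary = new Map([
  ["开心", { intensity: 7, polarity: 1 }], // "happy", positive
  ["难过", { intensity: 5, polarity: 2 }], // "sad", negative
  ["今天", { intensity: 0, polarity: 0 }], // "today", neutral
]);

// Assumed convention: polarity 1 adds the intensity, polarity 2 subtracts it,
// and anything else (including a word missing from the dictionary) counts as 0.
function wordScore(entry) {
  if (!entry) return 0;
  if (entry.polarity === 1) return entry.intensity;
  if (entry.polarity === 2) return -entry.intensity;
  return 0;
}

// Sum the signed scores of an already segmented sentence.
function emotionValue(words, dict = dictionary) {
  return words.reduce((sum, word) => sum + wordScore(dict.get(word)), 0);
}

console.log(emotionValue(["今天", "很", "难过"])); // -5
console.log(emotionValue(["今天", "很", "开心"])); // 7
```

Note that "很" ("very") is not in the toy dictionary and simply contributes 0, which is exactly how missing words end up making a sentence impossible to judge.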
4. Applying to Weibo#
Fetching Weibo content is simple in principle. Sina Weibo's Weibo Show can be accessed without logging in, so Node.js can fetch that page and parse it directly to get the Weibo content.
Then calculate the emotion value of the Weibo content and output the negative emotion Weibo as an RSS feed.
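As a rough sketch of the filtering and RSS output step, assuming each Weibo has already been fetched and scored: the post shape (`text`, `link`, `date`, `emotion`), the function names, the example URLs, and the "emotion below 0 means negative" threshold are all hypothetical and not taken from Weibo2RSS.

```javascript
// Escape the characters that are special in XML text nodes.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Keep only posts with a negative emotion value and render them as RSS 2.0.
// Assumed post shape: { text: string, link: string, date: Date, emotion: number }.
function negativeRss(posts, channelTitle, channelLink) {
  const items = posts
    .filter((post) => post.emotion < 0)
    .map((post) => [
      "    <item>",
      `      <title>${escapeXml(post.text)}</title>`,
      `      <link>${escapeXml(post.link)}</link>`,
      `      <guid>${escapeXml(post.link)}</guid>`,
      `      <pubDate>${post.date.toUTCString()}</pubDate>`,
      "    </item>",
    ].join("\n"));

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "  <channel>",
    `    <title>${escapeXml(channelTitle)}</title>`,
    `    <link>${escapeXml(channelLink)}</link>`,
    `    <description>${escapeXml(channelTitle)}</description>`,
    ...items,
    "  </channel>",
    "</rss>",
  ].join("\n");
}

// Example with made-up posts: only the first one ends up in the feed.
const feed = negativeRss(
  [
    { text: "今天很难过", link: "https://example.com/1", date: new Date(0), emotion: -5 },
    { text: "今天很开心", link: "https://example.com/2", date: new Date(0), emotion: 7 },
  ],
  "Negative emotion Weibo",
  "https://example.com/"
);
console.log(feed);
```

Using the post link as the `guid` gives each item a stable identity, so a feed reader can tell when a new item appears.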
5. Monitoring#
Once the output is an RSS feed, monitoring becomes easy, and IFTTT works best for this. When it detects new content in the RSS feed, it can trigger actions such as sending a mobile notification, sending an email notification, or posting a Weibo.
That's it. The biggest problem is actually that I'm missing a girlfriend.