A team of Saudi Arabian researchers says it is developing software that can spot “fake news” before humans do. The key, according to a newly published study, is a hybrid approach that measures both Twitter users’ reputations and the tone of their tweets.
Most online tools designed to measure credibility rely solely on user reputation to identify users who are more likely to spread misinformation. The team from King Saud University, however, also incorporated specific text features of tweets into the equation. Initial results showed the method to be accurate 90 percent of the time when trying to identify non-credible sources through Twitter authentication.
“This system is capable of profiling individuals who constantly post malicious information on the internet,” said Muhammad Al-Qurishi, one of the principal researchers.
The team’s approach weights Twitter features at three levels: the tweet level, the user level and the hybrid level.
- Tweet level: The tweet is divided into text features, including the content posted, the hashtags used and the sentiment features, all of which are used to identify positive and negative sentiment.
- User level: This consists of the user’s profile, friends and other associations.
- Hybrid level: This involves the aggregation of the tweet-based features with the sentiment score.
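The three levels can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the feature names, the `User` fields, the tiny word lists used for sentiment, and the way the levels are merged are hypothetical and are not taken from the study.

```python
import re
from dataclasses import dataclass

# Hypothetical toy word lists; the study's actual sentiment method is not described here.
POSITIVE = {"good", "great", "safe", "thanks", "peace"}
NEGATIVE = {"bad", "fake", "attack", "danger", "lie"}


@dataclass
class User:
    followers: int  # assumed stand-in for the user's reputation signals
    friends: int


def sentiment_score(text: str) -> int:
    """Count positive words minus negative words (a deliberately crude score)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def tweet_features(text: str) -> dict:
    """Tweet level: surface features of the text itself."""
    return {
        "chars": len(text),
        "hashtags": len(re.findall(r"#\w+", text)),
        "mentions": len(re.findall(r"@\w+", text)),
        "question_marks": text.count("?"),
        "exclamation_marks": text.count("!"),
    }


def user_features(user: User) -> dict:
    """User level: profile and association counts."""
    return {"followers": user.followers, "friends": user.friends}


def hybrid_features(text: str, user: User) -> dict:
    """Hybrid level (assumed): tweet features plus sentiment score plus user features."""
    merged = tweet_features(text)
    merged["sentiment"] = sentiment_score(text)
    merged.update(user_features(user))
    return merged


if __name__ == "__main__":
    example = hybrid_features("Great news, thanks @alice! #peace", User(120, 80))
    print(example)
```

A flat dictionary per tweet keeps the sketch easy to inspect; a real system would more likely build numeric vectors for a classifier.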
The information is then run through a feature-ranking algorithm.
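The article does not say which feature-ranking algorithm the team used. As a stand-in, the sketch below ranks features by how far apart their averages are for credible and non-credible tweets, scaled by the feature's overall spread. The function name `rank_features` and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def rank_features(rows, labels):
    """Rank features by class separation.

    rows:   list of dicts mapping feature name -> number
    labels: list of bools, True for credible, False for non-credible
    Both classes must appear at least once in labels.
    """
    scores = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        credible = [v for v, y in zip(values, labels) if y]
        non_credible = [v for v, y in zip(values, labels) if not y]
        spread = pstdev(values)
        # A constant feature cannot separate the classes, so it scores zero.
        gap = abs(mean(credible) - mean(non_credible))
        scores[name] = gap / spread if spread else 0.0
    # Highest-scoring (most discriminating) features come first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up toy data, not taken from the study.
    rows = [
        {"hashtags": 0, "noise": 5},
        {"hashtags": 1, "noise": 4},
        {"hashtags": 4, "noise": 5},
        {"hashtags": 5, "noise": 4},
    ]
    labels = [True, True, False, False]
    for name, score in rank_features(rows, labels):
        print(f"{name}: {score:.2f}")
```

This is only one simple way to rank features; it looks at each feature in isolation and ignores interactions between them.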
A few of the study’s interesting findings include:
- Non-credible tweets tend to:
  - have fewer characters per tweet than credible tweets
  - have at least one question mark
  - have more hashtags than credible tweets
  - be authored by users who have fewer followers than credible sources
- By contrast, credible tweets tend to:
  - have more user mentions than non-credible tweets
  - be more positive than non-credible tweets
  - have no exclamation marks (almost 99%)
Measuring user reputation is important because the phenomenon of inspiration is widespread, especially on social networks. The researchers believe that when people discuss a topic related to a sensitive event, they are subject to influences that affect what they post. The reputation-based technique helps to filter neglected information before the assessment process starts.
To demonstrate the system’s applicability, the researchers ran an experiment on tweets about a campaign against Houthi rebels in Yemen, using over one million tweets from approximately 500,000 Twitter accounts. Based on the results, the model determined credibility with approximately 90 percent accuracy.
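For readers unfamiliar with the term, accuracy is commonly computed as the share of predictions that match the known labels. The article does not describe how the team measured it, so the sketch below shows only the usual definition, with made-up labels rather than study data.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the known labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


if __name__ == "__main__":
    # Made-up labels for ten tweets: True = credible, False = non-credible.
    actual = [True, True, False, False, True, False, True, True, False, False]
    predicted = [True, True, False, False, True, False, True, True, False, True]
    print(accuracy(predicted, actual))
```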
From this study, the researchers found that it is possible to take both user associations and tweet content into consideration when determining credibility on Twitter. While the system still relies on human input during setup, it is a useful way to automate such assessments.
What’s next for this credibility assessment system? Now that it has successfully identified malicious messages and fake news, the next step is to deploy the system worldwide to help curb the spread of misinformation and, hopefully, remove the perpetrators.
For more information on Twitter authentication, visit IEEE Xplore.