UTAR Institutional Repository

A novel approach to detect fake news using data from google search specifically on recent and popular topics

Pang, Huey Jing (2021) A novel approach to detect fake news using data from google search specifically on recent and popular topics. Final Year Project, UTAR.



    Since the beginning of the coronavirus pandemic, the internet has become an important source of health information for the public worldwide. However, there have been widespread concerns that the novel coronavirus triggered a surge in information seeking, accompanied by broad dissemination of false or misleading health information across social media. It therefore cannot be assumed that all information published online is clean and trustworthy. Social media platforms are meant for sharing information, and the speed at which fake news spreads is unpredictable. Hence, a novel approach is used to detect fake news using data scraped from Google Search, specifically on recent and popular topics. The criteria identified during the research mirror the steps a person takes to distinguish real news from fake news in daily life. By applying these criteria, namely checking the source, the date dispersion of articles, and the accuracy of the search results, the research model can act as a news checker for current issues that helps the public filter out fake news published across the internet. The data used in this research are scraped from Google, a search engine that gives the public access to information from around the world. The research focuses specifically on recent and popular topics, for example news about COVID-19, which has recently threatened the world. Accordingly, different search queries related to recent and popular topics are used to scrape results from the Google search engine. The motivation behind this paper is to evaluate the criteria that could help in classifying fake news spreading across the internet.
With the help of ensemble learning and three criteria, namely the number of articles from trustworthy websites, the average date difference between articles, and the average similarity score between queries and article titles, the model achieves an accuracy of 73% on the testing data and 32% on data with noise.
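
The three criteria named above can be illustrated with a minimal Python sketch. It is not the project's implementation: the `SearchResult` layout, the `TRUSTED_DOMAINS` allow-list, the use of mean pairwise gaps in days as the date difference, and `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration, since the abstract does not specify them. Scraping and the ensemble classifier are left out.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher
from itertools import combinations
from urllib.parse import urlparse

# Hypothetical allow-list; the abstract does not say which sites count as trustworthy.
TRUSTED_DOMAINS = {"who.int", "reuters.com", "bbc.com"}


@dataclass
class SearchResult:
    """One search hit (assumed layout, not the project's actual schema)."""
    title: str
    url: str
    published: date


def is_trusted(url: str) -> bool:
    """True if the URL's host is a trusted domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


def trusted_count(results: list[SearchResult]) -> int:
    """Criterion 1: number of articles from trusted websites."""
    return sum(is_trusted(r.url) for r in results)


def mean_date_gap_days(results: list[SearchResult]) -> float:
    """Criterion 2 (assumed definition): mean absolute gap in days over all article pairs."""
    gaps = [abs((a.published - b.published).days) for a, b in combinations(results, 2)]
    return sum(gaps) / len(gaps) if gaps else 0.0


def mean_title_similarity(query: str, results: list[SearchResult]) -> float:
    """Criterion 3 (assumed measure): mean character-level similarity of query and titles."""
    if not results:
        return 0.0
    scores = [
        SequenceMatcher(None, query.lower(), r.title.lower()).ratio()
        for r in results
    ]
    return sum(scores) / len(scores)


def extract_features(query: str, results: list[SearchResult]) -> dict[str, float]:
    """Bundle the three criteria into one feature record for a downstream classifier."""
    return {
        "trusted_count": float(trusted_count(results)),
        "mean_date_gap_days": mean_date_gap_days(results),
        "mean_title_similarity": mean_title_similarity(query, results),
    }
```

A feature record like the one returned by `extract_features` could then be passed to any classifier; which ensemble method the project uses is not stated in the abstract.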

    Item Type: Final Year Project / Dissertation / Thesis (Final Year Project)
    Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
    T Technology > T Technology (General)
    Divisions: Faculty of Information and Communication Technology > Bachelor of Computer Science (Honours)
    Depositing User: ML Main Library
    Date Deposited: 09 Mar 2022 20:59
    Last Modified: 09 Mar 2022 20:59
    URI: http://eprints.utar.edu.my/id/eprint/4271
