
Sensemaking Tool

Team members

Amarjyot Kaur Narula (ISTD), Joey Yeo Kailing (ISTD), Philia Neo Tong Wee (ESD), Tan Jing Kang (ESD), Tan Zi Ning (ESD), Wu Rong (ESD), Xie Han Keong (ISTD)

Instructors:

Kenny Choo, Keegan Kang

01
Project Description

Description

Analysing content from local and international news outlets and social media platforms through sensemaking is important for generating insights in the homeland security landscape. These sources form part of a repertoire of data that helps MHA officers make informed decisions to keep Singapore safe and secure. The current sensemaking process is time-consuming and tedious because it is done entirely by hand. This manual process makes poor use of man-hours and monetary resources, and leaves opportunities to utilise the data untapped.

Mission

To design a sensemaking tool in the form of an integrated news feed and analytics dashboard. Our tool will allow MHA officers to browse online materials more efficiently and perform analysis and generate insights more effectively, saving time and increasing the quality of insights generated.

02
Architecture

Overall Architecture

The first diagram illustrates our overall architecture and data flow: raw news articles are scraped and passed through our Natural Language Processing (NLP) tasks, and the outputs are stored in the database. A frontend user interface then displays the results.
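The data flow described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the `Article` class, the function names, the `nlp_results` table and the use of SQLite are all assumptions made for the example, and `run_nlp_tasks` is a placeholder that only takes the first sentence of the text.

```python
import json
import sqlite3
from dataclasses import dataclass


@dataclass
class Article:
    """A scraped raw news article (hypothetical shape)."""
    url: str
    text: str


def run_nlp_tasks(article: Article) -> dict:
    """Placeholder for the NLP stage; returns a naive first-sentence summary."""
    first_sentence = article.text.split(".")[0].strip()
    return {"summary": first_sentence, "n_chars": len(article.text)}


def store_results(conn: sqlite3.Connection, article: Article, results: dict) -> None:
    """Persist the NLP outputs so the frontend can read them later."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nlp_results (url TEXT PRIMARY KEY, results TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO nlp_results (url, results) VALUES (?, ?)",
        (article.url, json.dumps(results)),
    )
    conn.commit()


def fetch_for_frontend(conn: sqlite3.Connection, url: str) -> dict:
    """Return the stored outputs for one article, or an empty dict if absent."""
    row = conn.execute(
        "SELECT results FROM nlp_results WHERE url = ?", (url,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    article = Article("https://example.com/a", "Flood hits town. More rain expected.")
    store_results(conn, article, run_nlp_tasks(article))
    print(fetch_for_frontend(conn, article.url))
```

Keeping the NLP stage and the frontend decoupled through the database means the articles can be processed in batches ahead of time, so the interface only has to read stored results.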

NLP Tasks Flow

The second diagram illustrates the data flow for our NLP tasks, from raw news articles to the desired outputs, which are meant to partially assist the officers in sensemaking.

03
Backend

Frameworks

NLP Tasks

Each NLP task has a desired output (🎯) and a selected algorithm (✅), listed below. We explored multiple algorithms for each task and, based on the evaluations conducted, settled on the most suitable ones.

  1. Clustering
🎯 Obtaining news articles that belong to the same event
✅ Latent Dirichlet Allocation (LDA)

  2. Summariser
🎯 Single document (Long and Short) Summary for every news article
✅ Bidirectional and Auto-Regressive Transformers (BART) & TextRank
🎯 Multiple document Summary for every event
✅ SummPip

  3. Entity Extractor
🎯 Extracting entities (e.g., number of people killed, weapons used) for every news article
✅ Information Extractor (IE) pipeline -- Bidirectional Encoder Representations from Transformers (BERT) fine-tuned on the Stanford Question Answering Dataset (SQuAD 2.0)

  4. Topic Classification
🎯 Categorising news articles into labels (e.g., terror attack, disease)
✅ Single Binary Classifier
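
To make the extractive side of the summariser (task 2 above) more concrete, here is a minimal sketch of the general TextRank idea in plain Python. It is not the team's implementation: the function names, the naive sentence splitter, the damping factor of 0.85 and the iteration count are assumptions for the example. Sentences are nodes in a graph, edges are weighted by word overlap, and a PageRank-style iteration scores each sentence.

```python
import math
import re


def split_sentences(text: str) -> list:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def similarity(a: str, b: str) -> float:
    """Word overlap normalised by the log of each sentence's vocabulary size."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank_summary(text: str, n_sentences: int = 2,
                     damping: float = 0.85, iterations: int = 50) -> list:
    """Return the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= n_sentences:
        return sentences

    # Weighted, undirected sentence graph (no self-loops).
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]

    # PageRank-style power iteration over the sentence graph.
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n) if out_sums[j] > 0
            )
            for i in range(n)
        ]

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = (
        "The fire destroyed three buildings in the city centre. "
        "Firefighters fought the fire in the city centre for six hours. "
        "Officials said the fire started in one of the buildings. "
        "My cat enjoys sleeping."
    )
    for sentence in textrank_summary(sample):
        print(sentence)
```

Because the method only reorders and selects existing sentences, it cannot introduce facts that are absent from the article, which is one reason an extractive method can be a useful companion to an abstractive model such as BART.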

04
Frontend

Designing to Simplify the Users' Journey


User Interface Features

✨ Homepage has visualisation components
- Meant to assist in exploring the data with ease
- Visualisations provide a high-level view of the data so that analysts can familiarise themselves with the big picture

✨ Clean pages displaying articles in each topic and in each event
- Provide more granular details, such as extracted entities, that help analysts understand each individual event

✨ Article page that has convenient features such as copy-to-clipboard options
- Meant for easy sharing between colleagues or for note-taking

🎉 We once again wish to extend our gratitude to our industry mentors from HTX -- Sylvia Liaw, Dr Terence Tan, Lim Ming En, Martyn Wong and Ong Pang Wei. Thank you for your continuous support, which pushed us to do better.

Industry Partner

HTX (Home Team Science and Technology Agency)

TEAM MEMBERS

Amarjyot Kaur Narula, Information Systems Technology and Design
Joey Yeo Kailing, Information Systems Technology and Design
Philia Neo Tong Wee, Engineering Systems and Design
Tan Jing Kang, Engineering Systems and Design
Tan Zi Ning, Engineering Systems and Design
Wu Rong, Engineering Systems and Design
Xie Han Keong, Information Systems Technology and Design