Sunday, February 28, 2016

Data Analysis for Topics on Twitter - #Justice4Liang #rally220

February 20, 2016 was a historic moment for many people who care about fair justice for former NYPD officer Peter Liang. People in more than 10 major US cities rallied peacefully and made their voices heard. In this post, we show how the #Justice4Liang topic was discussed on Twitter, which has become a self-media platform that lets everyone act as a reporter.

This fun work consists of three major steps:
(1) Data acquisition: to obtain tweets about #Justice4Liang #rally220 from Twitter;
(2) Data preparation: to parse the data into the correct format;
(3) Exploratory data analysis.
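As a rough illustration of step (2), here is a minimal Python sketch. It assumes each collected tweet is stored as one JSON object per line with the field names of the Twitter REST API v1.1 (`created_at`, `text`, `source`, `user.location`); the post does not say how the data was actually stored, so the input layout, the sample record, and the function name `parse_tweet` are all hypothetical.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample record; the text and user are made up for illustration.
SAMPLE_LINES = [
    json.dumps({
        "created_at": "Mon Feb 22 00:46:34 +0000 2016",
        "text": "Peaceful rally today #Justice4Liang #rally220",
        "source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
        "user": {"location": "New York, NY"},
    })
]

def parse_tweet(line):
    """Flatten one raw tweet (a JSON string) into a small dict."""
    raw = json.loads(line)
    # v1.1 timestamps look like "Mon Feb 22 00:46:34 +0000 2016".
    created = datetime.strptime(raw["created_at"], "%a %b %d %H:%M:%S %z %Y")
    # Treat a missing user, a missing location, and an empty string as null.
    location = (raw.get("user") or {}).get("location") or None
    return {
        "created_at": created.astimezone(timezone.utc).strftime("%m/%d/%Y %H:%M:%S"),
        "text": raw.get("text", ""),
        "source": raw.get("source", ""),
        "location": location,
    }

records = [parse_tweet(line) for line in SAMPLE_LINES]
```

Normalizing the timestamp to UTC in a single format up front keeps the later steps (counting by day, filtering by date range) simple.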

There is potential for more insightful analytics. Today, we show only some exploratory results.

The following figure shows a snapshot of the data we have collected.



The data contains tweets about #Justice4Liang #rally220 from "02/22/2016 00:46:34" to "02/26/2016 07:21:16" (UTC). The following figure shows the tweet volume over time. We can see that the trend is going down. We can also keep monitoring to see whether the volume goes back up if any organization pushes the topic.
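A trend like this can be computed by bucketing tweets by day. The sketch below is one possible way to do it in Python, assuming timestamps are strings in the "MM/DD/YYYY HH:MM:SS" format quoted above; the function name `tweets_per_day` is hypothetical, and apart from the two endpoints of the collection window the sample timestamps are made up.

```python
from collections import Counter
from datetime import datetime

def tweets_per_day(timestamps):
    """Count tweets per calendar day (UTC), returned in chronological order."""
    days = Counter(
        datetime.strptime(ts, "%m/%d/%Y %H:%M:%S").date().isoformat()
        for ts in timestamps
    )
    # ISO dates sort chronologically as plain strings.
    return sorted(days.items())

# Illustrative input only; the middle two values are invented.
sample = [
    "02/22/2016 00:46:34",
    "02/22/2016 13:05:10",
    "02/23/2016 09:00:00",
    "02/26/2016 07:21:16",
]
daily_counts = tweets_per_day(sample)
```

Re-running the same count on newly collected tweets is enough to see whether the volume picks up again.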




What device or application do people use to publish tweets? The following figure answers this question. Tweets published through Facebook make up only a very small part.
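One way to get these counts, assuming the publishing client comes from the tweet's `source` field (which in the Twitter REST API v1.1 is an HTML anchor wrapping the client name), is to strip the markup and tally the names. This is only a sketch: the helper names `client_name` and `count_clients` are hypothetical, and the sample values are illustrative rather than taken from our data.

```python
import re
from collections import Counter

TAG = re.compile(r"<[^>]+>")

def client_name(source_html):
    """Strip HTML tags from a `source` value, leaving the client name."""
    return TAG.sub("", source_html).strip() or "unknown"

def count_clients(sources):
    """Return (client, count) pairs, most frequent first."""
    return Counter(client_name(s) for s in sources).most_common()

# Illustrative `source` values, not real counts from the collected tweets.
sample_sources = [
    '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
    '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
    '<a href="http://www.facebook.com/twitter" rel="nofollow">Facebook</a>',
]
client_counts = count_clients(sample_sources)
```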



We also produce a word cloud of the tweet locations. 9612 tweets, which account for 60.8% of the data, have a non-null location.
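The non-null share and the word frequencies behind a word cloud can be sketched as follows. This assumes the location is the free-text field from the user profile and that empty strings count as null; the function name `location_stats` and the sample values are hypothetical, and the word-cloud rendering itself is left out.

```python
import re
from collections import Counter

def location_stats(locations):
    """Return (non-null count, non-null share, word frequencies)."""
    non_null = [loc for loc in locations if loc and loc.strip()]
    share = len(non_null) / len(locations) if locations else 0.0
    # Lowercase and split on non-letters so "New York, NY" yields new, york, ny.
    words = Counter(
        word for loc in non_null for word in re.findall(r"[a-z]+", loc.lower())
    )
    return len(non_null), share, words

# Illustrative values only.
sample_locations = ["New York, NY", None, "Brooklyn, New York", "", "San Francisco"]
count, share, words = location_stats(sample_locations)
```

The resulting frequency table is the kind of input a word-cloud tool expects, with more frequent words drawn larger.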