Collecting Twitter Data: Converting Twitter JSON to CSV — UTF-8

Part I: Introduction | Part II: Getting Started | Part III: Using a Python Stream Listener | Part IV: Storing Tweets in MongoDB | Part V: Twitter JSON to CSV — Errors | Part VI: Twitter JSON to CSV — ASCII | Part VII: Twitter JSON to CSV — UTF-8 [current page] The main drawback […]

Collecting Twitter Data: Converting Twitter JSON to CSV — ASCII

Part I: Introduction | Part II: Getting Started | Part III: Using a Python Stream Listener | Part IV: Storing Tweets in MongoDB | Part V: Twitter JSON to CSV — Errors | Part VI: Twitter JSON to CSV — ASCII [current page] | Part VII: Twitter JSON to CSV — UTF-8 I outlined some […]

Collecting Twitter Data: Converting Twitter JSON to CSV — Possible Errors

Part I: Introduction | Part II: Getting Started | Part III: Using a Python Stream Listener | Part IV: Storing Tweets in MongoDB | Part V: Twitter JSON to CSV — Errors [current page] | Part VI: Twitter JSON to CSV — ASCII | Part VII: Twitter JSON to CSV — UTF-8 ASCII JSON-to-CSV | […]

Emoji iOS 9.1 Update — The Taco Emoji Analysis

Before I get too far I don’t actually analysis taco emojis. At least not yet. I, however, give you the tools to start parsing them from tweets, text or anything you can get into Python. This past month Apple released their iOS 9.1 and their latest OS X 10.11.1 El Capitan update. That updated included […]

Baseball Twitter Roller Coaster

Because Twitter is fun and so are graphs, I have tweet volume graphs from my Twitter scraper that collects tweets with the team-specific nicknames and Twitter handles. After a trade (or non-trade), the data can be collected and a graphical picture of the reaction can be produced. The graph represents the volume of sampled tweets […]

Twitter Sentiment — Penguins VS. Rangers Gm 4

Game 4 of the Penguins-Rangers series featured a brief overtime period that overshadowed the rest of the game as far as tweet volume goes. Rangers fans were more negative at the beginning of the game after the Penguins scored their first goal. Twitter volume picked up for both teams during the overtime period and Rangers […]

Twitter Sentiment — Penguins vs. Rangers Gm 3

Unfortunately, my Twitter scraper wasn’t looking for the most viral story of the Penguin’s loss to the Rangers. [I’m not linking to it, but it involves a columnist and the Penguins’ GM.] I was able to get general sentiment over the course of the game. There isn’t too much to analyze. There are more Rangers […]

Using New, Diverse Emojis for Analysis in Python

I haven’t been updating this site often since I’ve started to perform a similar job over at FanGraphs. All non-baseball stat work that I do will continued to be housed here. Over the past week, Apple has implemented new emojis with a focus on diversity in their iOS 8.3 and the OS X 10.10.3 update. […]

Collecting Twitter Data: Getting Started

Part I: Introduction | Part II: Getting Started [current page] | Part III: Using a Python Stream Listener | Part IV: Storing Tweets in MongoDB | Part V: Twitter JSON to CSV — Errors | Part VI: Twitter JSON to CSV — ASCII | Part VII: Twitter JSON to CSV — UTF-8 The R code […]

Collecting Twitter Data: Introduction

Part I: Introduction [current page] | Part II: Getting Started | Part III: Using a Python Stream Listener | Part IV: Storing Tweets in MongoDB | Part V: Twitter JSON to CSV — Errors | Part VI: Twitter JSON to CSV — ASCII | Part VII: Twitter JSON to CSV — UTF-8 Collecting Twitter data […]

2015 State of the Union Address — Text Analytics

I collected tweets about the 2015 State of the Union address [SOTU] in real time from 10am to 2am using the keywords [obama, state of the union, sotu, sotusocial, ernst]. The tweets were analyzed for sentiment, content, emoji, hashtags, and retweets. The graph below shows Twitter activity over the course of the night. The volume […]

2015 Steelers-Ravens Playoff Twitter Infographics

The Steelers-Ravens playoff game gave me a chance to test out a new analytics server and some of the tools I’ve been working on to make Twitter analysis easy using ad hoc Python scripts. So here goes: There were a lot of Steelers or Ravens colored emojis, black and gold hearts or buttons and the […]

The Most Popular Emoji Characters on Twitter

On Twitter, about 10% of general-topic tweets contain emoji characters, the tiny icons and emoticons, which are starting to get more attention when analyzing tweets, Facebook messages, or text messages. An emoji [] can capture an emotion or completely change the meaning of the written text. Before exploring how different emojis are used and what […]

Emoji, UTF-8, and Python

I have updated [better] code that allows for easy counting of emoji’s in string objects in Python, it can be found on my GitHub. I have a two counting classes in a mini-package loaded there. Emoji [], those ubiquitous emoticons that popped up when iPhone users found them in 2011 with iOS 5 are a […]

2014 ALWCG Twitter Graphs

The Royals and A’s had quite the entertaining 12-inning game Tuesday night. These are a few graphs I made from Twitter data. Yellow is Oakland; blue is Kansas City. The proportions of tweets between teams might be off, but I would venture to guess the Royals had much more social media activity than the A’s. […]

Twitter Retweet Decay

This uses the same data set I obtained from my NU Data Mining final project [summary]. Recently, @MLBcathedrals tweeted a photo I submitted to them: Seat at exact spot where #Twins Harmon Killebrew hit a 522′ HR. Now in Mall of America via @seandolinar: @ckamka pic.twitter.com/OluZOUVUkb” — Baseball Cathedrals (@MLBcathedrals) August 4, 2014 I got […]

Twitter Sentiment Analysis

This is a summary of a final project I did from my Introduction to Data Mining class at NU. The goal of the project was to find a business need and execute a data mining process. The general process I used is outlined here and the sentiment lexicon is found here. The lexicon is from […]