- Open Terminal in your desired working directory.
- Start R by typing R.
- Load the required libraries:
library(RCurl)
library(streamR)
library(ROAuth)
library(RJSONIO)
library(stringr)
- Set up authentication. Create an app at dev.twitter.com and paste its consumer key and consumer secret into the code below:
token <- "https://api.twitter.com/oauth/request_token"
access <- "https://api.twitter.com/oauth/access_token"
authorize <- "https://api.twitter.com/oauth/authorize"
consumerkey <- "YOUR CONSUMER KEY"
consumersecret <- "YOUR CONSUMER SECRET"
oauth <- OAuthFactory$new(consumerKey = consumerkey,
                          consumerSecret = consumersecret,
                          requestURL = token,
                          accessURL = access,
                          authURL = authorize)
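Rather than pasting the keys into the script, they could be read from environment variables before the OAuthFactory$new call. This is a minimal sketch, not part of the original steps: read_credential is a hypothetical helper, and the variable names TWITTER_CONSUMER_KEY and TWITTER_CONSUMER_SECRET are assumptions chosen for illustration.

```r
# Hypothetical helper: read a credential from an environment variable,
# stopping with a clear message if it is unset or empty.
read_credential <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Usage (the variable names are assumptions, not defined by Twitter or ROAuth):
# consumerkey    <- read_credential("TWITTER_CONSUMER_KEY")
# consumersecret <- read_credential("TWITTER_CONSUMER_SECRET")
```

This keeps the secret out of any script you share or commit.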
- Now, do the actual handshake with the API. Running the code below opens a browser window that shows a PIN; paste that PIN back into Terminal.
oauth$handshake(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))
- You can now save your authentication details, so that you can skip the above steps next time.
save(oauth, file = "oauth.Rdata")
- With authentication set up, we can start a data collection job from this working directory. First, load streamR and authenticate:
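The code for this step could look like the following minimal sketch. It assumes the credentials were saved as oauth.Rdata in the current working directory, as in the save step above; the credentials_file variable is introduced here only for illustration.

```r
library(streamR)

# Assumption: the credentials file is the one written by save() earlier
credentials_file <- "oauth.Rdata"

if (file.exists(credentials_file)) {
  # Restores the object under its saved name, "oauth"
  load(credentials_file)
} else {
  message("No saved credentials found; run the handshake steps first.")
}
```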
- The code below will initiate the data collection.
filterStream(
  file.name = "drinks_tweets.json",  # save the collected tweets to this .json file
  track = c("coffee", "tea"),        # collect tweets that include these keywords
  language = "en",                   # collect tweets in a specific language
  timeout = 10800,                   # in seconds (3 hours); use 0 for permanent collection
  oauth = oauth                      # use the "oauth" object as your credentials
)
- When the collection is done, the JSON file can be parsed into an R data frame:
drinks_tweets.df <- parseTweets("drinks_tweets.json", simplify = FALSE)
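Once the tweets are in a data frame, a quick sanity check is to count how many tweets mention each tracked keyword. The sketch below assumes the data frame has a text column, as the output of parseTweets does; count_keyword_mentions is a hypothetical helper written for this example, not part of streamR.

```r
# Hypothetical helper: count the tweets whose text contains each keyword.
# Matching is case-insensitive and substring-based, so "tea" also matches "steam".
count_keyword_mentions <- function(tweets, keywords) {
  text <- tolower(tweets$text)
  vapply(
    keywords,
    function(k) sum(grepl(tolower(k), text, fixed = TRUE)),
    integer(1)
  )
}

# Run on the parsed tweets if they are available in this session
if (exists("drinks_tweets.df")) {
  print(nrow(drinks_tweets.df))
  print(count_keyword_mentions(drinks_tweets.df, c("coffee", "tea")))
}
```

The result is a named integer vector with one count per keyword. The counts can sum to more than the number of tweets, since a single tweet may mention both keywords.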