How to Get Twitter Data using R (2024)

[This article was first published on R – Predictive Hacks, and kindly contributed to R-bloggers].

In a previous post, we showed how to get Twitter data using Python. In this tutorial, we will show you how to get Twitter data using R, specifically with the rtweet library. As explained in the previous post, you will need to create a developer account and obtain your consumer and access keys.

Install the rtweet Library and API Authorization

We can install the rtweet library either from CRAN or from GitHub as follows:

## install rtweet from CRAN
install.packages("rtweet")

## OR

## install remotes package if it's not already
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

## install dev version of rtweet from github
remotes::install_github("ropensci/rtweet")

Then we are ready to load the library:

## load rtweet package
library(rtweet)

We can authorize the API by entering our credentials:

## load rtweet
library(rtweet)

## load the tidyverse
library(tidyverse)

## store api keys (these are fake example values; replace with your own keys)
api_key <- "afYS4vbIlPAj096E60c4W1fiK"
api_secret_key <- "bI91kqnqFoNCrZFbsjAWHD4gJ91LQAhdCJXCj3yscfuULtNkuu"

## authenticate via web browser
token <- create_token(
  app = "rstatsjournalismresearch",
  consumer_key = api_key,
  consumer_secret = api_secret_key
)

get_token()
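
Hard-coding keys in a script makes them easy to leak, so one option is to read them from environment variables instead. The following is a minimal sketch in base R; the helper read_twitter_keys() and the variable names TWITTER_API_KEY and TWITTER_API_SECRET are our own assumptions, not part of rtweet.

```r
## Hypothetical helper (not part of rtweet): read API keys from environment
## variables. The variable names are assumptions chosen for this example.
read_twitter_keys <- function(key_var = "TWITTER_API_KEY",
                              secret_var = "TWITTER_API_SECRET") {
  keys <- c(api_key = Sys.getenv(key_var),
            api_secret_key = Sys.getenv(secret_var))
  ## Sys.getenv() returns "" for variables that are not set
  absent <- names(keys)[keys == ""]
  if (length(absent) > 0) {
    stop("Missing credentials: ", paste(absent, collapse = ", "))
  }
  keys
}

## Placeholder values, set only for this session to show the call
Sys.setenv(TWITTER_API_KEY = "my-key", TWITTER_API_SECRET = "my-secret")
keys <- read_twitter_keys()
names(keys)
```

The returned values can then be passed as consumer_key and consumer_secret in the create_token() call shown earlier.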

Search Tweets

Now we are ready to get our first data from Twitter, starting with 1,000 tweets containing the hashtag “#DataScience”, excluding retweets. Note that search_tweets() only returns data from the last 6-9 days.

rt <- search_tweets("#DataScience", n = 1000, include_rts = FALSE)
View(rt)

Usually, we want to get the screen name, the text, the number of likes and the number of re-tweets.

View(rt%>%select(screen_name, text, favorite_count, retweet_count)) 

The rt object has 90 columns, so it carries a lot of information about each tweet. Note that there is a limit of 18,000 tweets per 15 minutes. To retrieve more than that in a single call, set retryonratelimit = TRUE so that the function waits for the rate limit to reset and then continues:

## search for 250,000 tweets containing the word datascience
rt <- search_tweets(
  "datascience", n = 250000, retryonratelimit = TRUE
)
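
To get a feel for how long such a request takes, we can do a rough calculation from the limit quoted above. This is only a back-of-the-envelope sketch: it assumes every 15-minute window is filled completely and ignores network time.

```r
## Rough estimate based on the 18,000 tweets per 15 minutes limit
tweets_wanted     <- 250000
tweets_per_window <- 18000
window_minutes    <- 15

## number of rate-limit windows needed to collect everything
windows_needed <- ceiling(tweets_wanted / tweets_per_window)

## the first window starts immediately; each later one waits for a reset
min_wait_minutes <- (windows_needed - 1) * window_minutes

windows_needed    # 14
min_wait_minutes  # 195
```

So a call like this one should be expected to run for at least a little over three hours.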

We can also plot the tweets:

ts_plot(rt, "hour") 
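
ts_plot() plots the frequency of tweets per time interval. To make the idea of an hourly count concrete, here is a small base R sketch on made-up timestamps that stand in for rt$created_at; it illustrates the aggregation idea and is not the internal implementation of ts_plot().

```r
## Mock data: six made-up timestamps standing in for rt$created_at
created_at <- as.POSIXct(
  c("2022-03-01 10:05:00", "2022-03-01 10:40:00", "2022-03-01 11:15:00",
    "2022-03-01 11:20:00", "2022-03-01 11:59:00", "2022-03-01 13:01:00"),
  tz = "UTC"
)

## Truncate each timestamp to the start of its hour, then count per hour
hour_bin <- format(created_at, "%Y-%m-%d %H:00", tz = "UTC")
hourly_counts <- as.data.frame(table(hour = hour_bin),
                               stringsAsFactors = FALSE)
hourly_counts
```

Note that in this sketch hours without any tweets (12:00 here) simply do not appear in the result.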

Stream Tweets

The stream_tweets() function returns a random sample of approximately 1% of the live stream of all tweets. We can optionally filter the stream with a search-like query, a set of specific user ids, or the geo-coordinates of the tweets.

## stream tweets containing the token DataScience for 30 seconds
rt <- stream_tweets("DataScience", timeout = 30)
View(rt)

Note that the rt object consists of 90 columns.

Filter Tweets

When we search for tweets, we can also apply some filters. For example, let’s say that we want to specify the language or to exclude retweets and so on. Let’s search for tweets that contain the “StandWithUkraine” text by excluding retweets, quotes and replies.

ukraine <- search_tweets("StandWithUkraine -filter:retweets -filter:quote -filter:replies", n = 100)
View(ukraine)

Finally, let’s search for tweets in English that contain the word “Putin” and have at least 100 likes and at least 100 retweets.

putin <- search_tweets("Putin min_retweets:100 AND min_faves:100", lang = 'en')
View(putin)
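
Query strings like the ones in this section are easy to mistype, so it can help to assemble them programmatically. Below is a minimal sketch of such a helper. build_query() is hypothetical and not part of rtweet. It joins the operators with spaces, which we assume Twitter's search treats as a logical AND, whereas the query above writes AND explicitly.

```r
## Hypothetical helper (not part of rtweet): assemble a search query string
build_query <- function(term, exclude = character(0),
                        min_retweets = NULL, min_faves = NULL) {
  parts <- term
  if (length(exclude) > 0) {
    ## each excluded type becomes a "-filter:<type>" operator
    parts <- c(parts, paste0("-filter:", exclude))
  }
  if (!is.null(min_retweets)) {
    parts <- c(parts, paste0("min_retweets:", min_retweets))
  }
  if (!is.null(min_faves)) {
    parts <- c(parts, paste0("min_faves:", min_faves))
  }
  ## operators are separated by single spaces
  paste(parts, collapse = " ")
}

q_ukraine <- build_query("StandWithUkraine",
                         exclude = c("retweets", "quote", "replies"))
q_putin <- build_query("Putin", min_retweets = 100, min_faves = 100)

q_ukraine  # "StandWithUkraine -filter:retweets -filter:quote -filter:replies"
q_putin    # "Putin min_retweets:100 min_faves:100"
```

The resulting string can be passed as the first argument of search_tweets().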

Get Timelines

The get_timeline() function returns up to 3,200 statuses posted to the timelines of one or more specified Twitter users. Let’s get the timelines of the R-bloggers and Hadley Wickham accounts, taking the 500 most recent tweets of each.

rt<-get_timeline(user=c("Rbloggers", "hadleywickham"), n=500) 

Whose tweets are more popular? Who gets more likes and retweets? Let’s answer this question.

rt %>%
  group_by(screen_name) %>%
  summarise(AvgLikes = mean(favorite_count, na.rm = TRUE),
            AvgRetweets = mean(retweet_count, na.rm = TRUE))

As we can see, the guru Hadley Wickham gets many more likes and retweets than R-bloggers.

# A tibble: 2 x 3
  screen_name   AvgLikes AvgRetweets
1 hadleywickham     31.6      110.
2 Rbloggers         17.8        6.17
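
For readers without API access, the same per-account summary can be tried on a small mock data frame. The sketch below uses base R's aggregate() instead of dplyr so that it runs without extra packages; the account names and counts are made up.

```r
## Mock timeline data with made-up values
timeline <- data.frame(
  screen_name    = c("user_a", "user_a", "user_b", "user_b", "user_b"),
  favorite_count = c(10, 30, 4, NA, 8),
  retweet_count  = c(2, 6, 1, 3, NA),
  stringsAsFactors = FALSE
)

## Per-account averages; na.rm = TRUE skips missing values column by column
avg_by_user <- aggregate(
  timeline[c("favorite_count", "retweet_count")],
  by = list(screen_name = timeline$screen_name),
  FUN = mean, na.rm = TRUE
)
names(avg_by_user) <- c("screen_name", "AvgLikes", "AvgRetweets")
avg_by_user
```

Here user_a averages 20 likes and 4 retweets, while user_b averages 6 likes and 2 retweets.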

Get the Likes made by a User

We can get the n most recently liked tweets by a user. Let’s see what Hadley likes!

hadley <- get_favorites("hadleywickham", n = 1000) 

Which are his favorite accounts?

hadley%>%group_by(screen_name)%>%count()%>%arrange(desc(n))%>%head(5) 

And we get:

# A tibble: 5 x 2
# Groups:   screen_name [5]
  screen_name       n
1 djnavarro        17
2 ijeamaka_a       13
3 mjskay           12
4 sharlagelfand    12
5 vboykis          12

Which are his favorite hashtags?

hadley %>%
  unnest(hashtags) %>%
  group_by(tolower(hashtags)) %>%
  count() %>%
  arrange(desc(n)) %>%
  na.omit() %>%
  head(5)

And we get:

# A tibble: 5 x 2
# Groups:   tolower(hashtags) [5]
  `tolower(hashtags)`     n
1 rstats                 78
2 rtistry                22
3 genuary2022             9
4 genuary                 7
5 rayrender               6
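
The hashtags column is a list-column, which is why the pipeline above needs unnest() before counting. The following base R sketch shows the same idea on a made-up list that stands in for hadley$hashtags: flatten, lower-case, drop missing values, then count.

```r
## Mock stand-in for hadley$hashtags: one character vector per liked tweet,
## with NA for a tweet that has no hashtags (values are made up)
hashtags <- list(
  c("rstats", "Rtistry"),
  "RStats",
  NA_character_,
  c("rstats", "genuary")
)

## Flatten the list, normalise case, drop missing values, then count
tags <- tolower(unlist(hashtags))
tags <- tags[!is.na(tags)]
tag_counts <- sort(table(tags), decreasing = TRUE)
head(tag_counts, 5)
```

Lower-casing first means that "rstats" and "RStats" are counted as the same hashtag, which is what tolower(hashtags) achieves in the dplyr version.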

Search Users

We can search for users by keyword. For example:

## search for up to 1000 users using the keyword rstats
rstats <- search_users(q = "rstats", n = 1000)

## data frame where each observation (row) is a different user
rstats

## tweets data also retrieved. can access it via tweets_data()
tweets_data(rstats)

Get Friends and Followers

We can get the user ids of the friends and followers of a specific account. Let’s get the friends and followers of the Predictive Hacks account.

get_followers("predictivehacks")
get_friends("predictivehacks")

We can extract more info on the user ids using the function:

lookup_users(users, parse = TRUE, token = NULL) 

where users is a user id or screen name of the target user.

Get Trends

We can get the trends in a particular city or country. For example:

## Retrieve available trends
trends <- trends_available()
trends

## Store WOEID for Worldwide trends
worldwide <- trends$woeid[grep("world", trends$name, ignore.case = TRUE)[1]]

## Retrieve worldwide trends data
ww_trends <- get_trends(worldwide)

## Preview trends data
ww_trends

## Retrieve trends data using latitude, longitude near New York City
nyc_trends <- get_trends_closest(lat = 40.7, lng = -74.0)

## should be same result if lat/long supplied as first argument
nyc_trends <- get_trends_closest(c(40.7, -74.0))

## Preview trends data
nyc_trends

## Provide a city or location name using a regular expression string to
## have the function internals do the WOEID lookup/matching for you
(luk <- get_trends("london"))
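
The WOEID lookup in the code above is a plain pattern match on the place names. Here is a minimal sketch of that step on a mock table that stands in for the output of trends_available(). The helper find_woeid() is hypothetical, and the WOEID values are illustrative.

```r
## Mock stand-in for trends_available(); WOEID values are illustrative
trends <- data.frame(
  name  = c("Worldwide", "London", "New York"),
  woeid = c(1, 44418, 2459115),
  stringsAsFactors = FALSE
)

## Hypothetical helper: WOEID of the first place whose name matches `pattern`
find_woeid <- function(trends, pattern) {
  hits <- grep(pattern, trends$name, ignore.case = TRUE)
  if (length(hits) == 0) {
    return(NA_real_)  # no matching place
  }
  trends$woeid[hits[1]]
}

find_woeid(trends, "world")     # 1
find_woeid(trends, "london")    # 44418
find_woeid(trends, "atlantis")  # NA
```

Taking only the first match mirrors the [1] in the original lookup, so an ambiguous pattern silently picks the first place in the table.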

Let’s see what is trending in Russia and in Ukraine, respectively.

get_trends("Russia")
get_trends("Ukraine")

Final Thoughts

Every Data Science analysis starts with the data. In this tutorial, we showed how to get Twitter data in R. In the next posts, we will show you how to analyze this data by applying sentiment analysis, topic modelling, network analysis and so on. Stay tuned!



Author: Aracelis Kilback
