I used a variety of packages for my analysis and load all of them through the pacman package. The function p_load installs any package that is not yet installed and then loads it.
pacman::p_load(tidyverse, ggrepel, lubridate, ggpubr, quanteda, DT, plotly,
               quanteda.textplots, quanteda.textstats, quanteda.textmodels,
               grid, gridExtra, ldatuning, stm, tidytext, seededlda, caret, glmnet,
               tibble, kableExtra, caretEnsemble, ranger, rtweet)
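To make the install-then-load behaviour described above concrete, here is a minimal base R sketch of the same idea. The helper `load_or_install` is a hypothetical name of mine, not part of pacman, and unlike `p_load` it takes package names as a character vector rather than unquoted names.

```r
# Hypothetical helper (not from pacman): install a package if it is missing,
# then attach it. A simplified sketch of the behaviour, not pacman's own code.
load_or_install <- function(pkgs) {
  for (pkg in pkgs) {
    # requireNamespace() returns FALSE when the package is not installed
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
    # character.only = TRUE lets library() take the name from a variable
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Both packages ship with R, so nothing is downloaded here
load_or_install(c("stats", "utils"))
```

In practice `pacman::p_load` is the more convenient choice; the sketch only spells out what happens for each package.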
The main actor I chose to analyse is Generation Islam (GI for short). They can best be described as political activists who are mainly, but not only, occupied with online political commentary on topics affecting Muslims in Germany and worldwide. The group promotes the (re-)establishment of an Islamic caliphate as a solution to Muslim grievances worldwide. For comparison, I also looked at:
I retrieved the data in April 2022 using the rtweet package. Through the (free) Twitter Sandbox API, I gained access to the latest 3000 tweets for each account. Obtaining the data with rtweet is straightforward and can be done as follows.
account <- get_timeline(user = "twitter_handle", n = 3000)
Since I already have the data on hand, I will load the files into my environment and append them. You can get the data from the corresponding GitHub repository.
## Generation Islam
genislam <- readRDS("genislam.RDS")
## ZMD
zmd <- readRDS("zmd")
## All Actors
all_islam <- bind_rows(genislam,
                       zmd,
                       readRDS("alhm"),
                       readRDS("ditib"),
                       readRDS("igmggenclik"),
                       readRDS("islamratbrd"))
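The same append step can be written over a vector of file paths, which scales more easily if further accounts are added. The sketch below is self-contained and uses toy files written to a temporary directory; the file names `account_a.RDS` and `account_b.RDS` and their contents are made up for illustration and are not the files from the repository.

```r
library(dplyr)

# Toy stand-ins for the per-account files (hypothetical names and contents)
tmp <- tempdir()
saveRDS(data.frame(screen_name = "account_a", text = c("x", "y"),
                   stringsAsFactors = FALSE),
        file.path(tmp, "account_a.RDS"))
saveRDS(data.frame(screen_name = "account_b", text = "z",
                   stringsAsFactors = FALSE),
        file.path(tmp, "account_b.RDS"))

# Read every file into a list of data frames, then stack them row-wise
paths <- file.path(tmp, c("account_a.RDS", "account_b.RDS"))
all_accounts <- bind_rows(lapply(paths, readRDS))

nrow(all_accounts)
```

`bind_rows()` matches columns by name, so files with slightly different column sets are still stacked, with missing values filled in as `NA`.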
Let’s create a descriptive table that displays the number of tweets retrieved and the time span they cover:
all_islam %>%
  group_by(screen_name) %>%
  summarise(n = n(), `First Tweet` = min(created_at), `Last Tweet` = max(created_at))
## # A tibble: 6 × 4
## screen_name n `First Tweet` `Last Tweet`
## <chr> <int> <dttm> <dttm>
## 1 Alhambra_eV 1335 2017-10-26 17:16:27 2022-04-27 09:46:43
## 2 DITIBkoln 978 2012-02-28 12:29:26 2022-04-27 22:57:31
## 3 Islamratbrd 1246 2013-09-27 19:02:21 2022-04-27 09:52:15
## 4 der_zmd 1534 2016-05-16 16:02:12 2022-04-22 12:42:30
## 5 genislam1 3132 2019-07-22 16:33:18 2022-04-19 13:59:34
## 6 igmggenclik 3200 2016-09-24 08:23:58 2022-04-27 13:58:20
Here we can see that the number of tweets and the time span covered vary considerably across accounts. This matters because different political events occurred over these periods, and Generation Islam would likely have commented on some that fall outside the retrieved window. The data therefore carry some selection bias, since they do not reach far enough back. Assuming that Generation Islam is ideologically consistent in how it interprets the world and frames political events, I expect the overarching narratives to remain similar across contexts. This, together with the fact that certain topics are likely to recur because some political conflicts remain unresolved, partly mitigates the selection bias.
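One way to gauge how much the differing time spans matter is to restrict all accounts to the period they have in common and compare results on that subset. This is not a step taken in this analysis; it is only a sketch of the idea on toy data, and the names `toy`, `spans`, and `common` are illustrative.

```r
library(dplyr)

# Toy data standing in for all_islam (two accounts with different spans)
toy <- data.frame(
  screen_name = c("a", "a", "a", "b", "b"),
  created_at = as.POSIXct(c("2019-01-01", "2021-06-01", "2022-04-01",
                            "2021-01-01", "2022-03-01"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# First and last tweet per account
spans <- toy %>%
  group_by(screen_name) %>%
  summarise(first = min(created_at), last = max(created_at))

# The shared window runs from the latest first tweet to the earliest last tweet
window_start <- max(spans$first)
window_end <- min(spans$last)

# Keep only tweets that fall inside the shared window
common <- toy %>%
  filter(created_at >= window_start, created_at <= window_end)
```

The trade-off is that the shared window is set by the account with the shortest history, so tweets from longer-running accounts are discarded.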
Let us look at the tweet activity of Generation Islam over the retrieved time span:
Sys.setlocale("LC_TIME", "English") # Set the time locale to English so that month labels are in English
## [1] "English_United States.1252"
ts <- genislam
ts$tweet <- 1L
ts <- ts %>%
  mutate(weekly = str_c(
    formatC(isoweek(created_at), format = "f", digits = 0, width = 2, flag = "0"),
    "/",
    str_sub(isoyear(created_at), 3, 4))) %>%
  group_by(screen_name) %>%
  arrange(created_at, .by_group = TRUE) %>%
  mutate("cumulative" = cumsum(tweet))
ts <- ts %>%
  group_by(screen_name, weekly) %>%
  mutate("week_cumulative" = cumsum(tweet))
# Adding all tweets together
cumulative_tweet <- ggplot(ts) +
  geom_line(aes(x=as.Date(created_at), y=cumulative), color = "darkorange") +
  xlab("Date") +
  ylab("Tweets (cumulative)") +
  scale_x_date(date_labels = "%b %y", date_breaks = "2 month") +
  theme_minimal() +
  labs(title = "A") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
# Adding up all tweets per week
week_tweet <- ggplot(ts) +
  geom_point(aes(x=as.Date(created_at), y=week_cumulative), alpha = 0.1, color = "deeppink4") +
  geom_smooth(aes(x=as.Date(created_at), y=week_cumulative), se = FALSE, color = "deeppink4",
              method = "gam", formula = y ~ s(x)) +
  xlab("Date") +
  ylab("Tweets (weekly)") +
  scale_x_date(date_labels = "%b %y", date_breaks = "2 month") +
  theme_minimal() +
  labs(title = "B") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
# Displaying both plots together
ggarrange(cumulative_tweet, week_tweet, ncol = 1, nrow = 2, align = "v")
Examining this figure, we see that Generation Islam's weekly activity fluctuates. The highest weekly activity occurred in May 2021, which coincides with the 2021 Israel-Palestine crisis. This suggests a connection between political events and Generation Islam's increased engagement during that period: they react quickly and intensify their communication when Muslim populations around the world experience significant grievances.
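The peak week can also be read off numerically rather than from the plot. The sketch below uses toy timestamps and computes one total per week, which differs from the running within-week count (`week_cumulative`) plotted in panel B; the names `toy` and `weekly_totals` are illustrative.

```r
library(dplyr)
library(lubridate)

# Toy timestamps standing in for genislam$created_at
toy <- data.frame(
  created_at = as.POSIXct(c("2021-05-10 10:00:00", "2021-05-11 12:00:00",
                            "2021-05-12 09:00:00", "2021-06-01 08:00:00"),
                          tz = "UTC")
)

# Round each date down to the Monday of its week, count tweets per week,
# and sort so that the busiest week comes first
weekly_totals <- toy %>%
  mutate(week_start = floor_date(as.Date(created_at), unit = "week",
                                 week_start = 1)) %>%
  count(week_start, name = "tweets") %>%
  arrange(desc(tweets))

# First row is the week with the most tweets
weekly_totals[1, ]
```

Applied to the real data, the first row would show which week the May 2021 peak falls in and how many tweets it contains.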
In the case of the IGMG, the Twitter data of the youth organization was used as they tweet in Turkish far less than their parent organization. Certainly, youth organizations communicate differently than their parent organizations, but by including them we can capture the ideological setting of the organization as a whole without including a language that I am not able to analyze.