Packages and Data

I used a variety of packages for my analysis and load them all through the pacman package. Its function p_load installs any packages that are not yet installed and then loads all of them.

pacman::p_load(tidyverse, ggrepel, lubridate, ggpubr, quanteda, DT, plotly,
               quanteda.textplots, quanteda.textstats, quanteda.textmodels,
               grid, gridExtra, ldatuning, stm, tidytext, seededlda,
               caret, glmnet, tibble, kableExtra, caretEnsemble, ranger, rtweet)
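If you prefer not to depend on pacman, the same idea can be sketched in base R. The helper name `load_or_install` is my own and hypothetical; it only approximates what p_load does (install if missing, then attach) and is not pacman's actual implementation.

```r
# Hypothetical helper (not part of pacman): install a package only if it is
# missing, then attach it to the search path.
load_or_install <- function(pkgs) {
  for (pkg in pkgs) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Example with packages that ship with R, so nothing is downloaded
load_or_install(c("stats", "utils"))
```

In practice you would pass the package names from the p_load call above as a character vector.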

The main actor I chose to analyse is Generation Islam (GI for short). They can best be described as political activists who are mainly, though not exclusively, engaged in online political commentary on topics affecting Muslims in Germany and worldwide. The group promotes the (re-)establishment of an Islamic caliphate as a solution to Muslim grievances worldwide. For comparison, I also looked at five other accounts, which appear by their Twitter handles in the summary table below: der_zmd, Alhambra_eV, DITIBkoln, Islamratbrd, and igmggenclik.

I retrieved the data in April 2022 using the rtweet package. Through the (free) Twitter Sandbox API, I gained access to the most recent tweets of each account, requesting 3000 per account. Obtaining the data with rtweet is straightforward and can be done as follows.

account <- get_timeline(user = "twitter_handle", n=3000)

Since I already have the data lying around, I will load them into my environment and append them. You can get the data from the corresponding GitHub repository.

## Generation Islam
genislam <- readRDS("genislam.RDS")

## ZMD
zmd <- readRDS("zmd")

## All Actors
all_islam <- bind_rows(genislam,
                       zmd,
                       readRDS("alhm"),
                       readRDS("ditib"),
                       readRDS("igmggenclik"),
                       readRDS("islamratbrd"))

Let’s create a descriptive table that displays the number of tweets accessed and the corresponding time span for each account:

all_islam %>%
  group_by(screen_name) %>%
  summarise(n = n(), `First Tweet` = min(created_at), `Last Tweet` = max(created_at))
## # A tibble: 6 × 4
##   screen_name     n `First Tweet`       `Last Tweet`       
##   <chr>       <int> <dttm>              <dttm>             
## 1 Alhambra_eV  1335 2017-10-26 17:16:27 2022-04-27 09:46:43
## 2 DITIBkoln     978 2012-02-28 12:29:26 2022-04-27 22:57:31
## 3 Islamratbrd  1246 2013-09-27 19:02:21 2022-04-27 09:52:15
## 4 der_zmd      1534 2016-05-16 16:02:12 2022-04-22 12:42:30
## 5 genislam1    3132 2019-07-22 16:33:18 2022-04-19 13:59:34
## 6 igmggenclik  3200 2016-09-24 08:23:58 2022-04-27 13:58:20

Here we can see that the number of tweets and the time span covered vary considerably between accounts. This matters because different political events occurred throughout these periods, and Generation Islam very likely commented on some of them outside the window I can observe: their data only reach back to July 2019. The sample is therefore subject to some selection bias. Assuming that Generation Islam are ideologically consistent in how they interpret the world and frame political events, I expect the overarching narratives to remain similar across contexts. This, together with the fact that certain topics are likely to recur because some political conflicts remain unresolved, partly mitigates the selection bias.
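One way to gauge how much the differing time spans matter would be to restrict all accounts to the period they have in common. This is not something I do in the analysis here; the following is only a minimal base R sketch on made-up toy data, where `toy` stands in for all_islam and the dates are invented.

```r
# Toy stand-in for all_islam (account names and dates are made up)
toy <- data.frame(
  screen_name = c("a", "a", "a", "b", "b", "b"),
  created_at = as.POSIXct(
    c("2019-01-01", "2020-06-01", "2022-04-01",
      "2020-01-01", "2021-03-01", "2022-04-20"),
    tz = "UTC"
  )
)

# Work on seconds since the epoch to keep the comparison simple
secs <- as.numeric(toy$created_at)

# The common window starts at the latest first tweet
# and ends at the earliest last tweet across accounts
window_start <- max(tapply(secs, toy$screen_name, min))
window_end <- min(tapply(secs, toy$screen_name, max))

# Keep only tweets that fall inside the common window
common <- toy[secs >= window_start & secs <= window_end, ]
common
```

Comparing results on the full data and on such a common window would show whether the findings depend on the uneven coverage.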

Tweet Activity of Generation Islam

Let us look at the tweet activity of Generation Islam over the retrieved time span:

Sys.setlocale("LC_TIME", "English") # Set the time locale to English for month labels (locale name as used on Windows)
## [1] "English_United States.1252"
ts <- genislam
ts$tweet <- 1L

ts <- ts %>%
  mutate(weekly = str_c(
    formatC(isoweek(created_at), format = "f", digits = 0, width = 2, flag = "0"), 
    "/", 
    str_sub(isoyear(created_at), 3, 4))) %>%
  group_by(screen_name) %>%
  arrange(created_at, .by_group = TRUE) %>%
  mutate("cumulative" = cumsum(tweet))

ts <- ts %>%
  group_by(screen_name, weekly) %>%
  mutate("week_cumulative" = cumsum(tweet))

# Adding all tweets together 
cumulative_tweet <- ggplot(ts) +
  geom_line(aes(x=as.Date(created_at), y=cumulative), color = "darkorange") +
  xlab("Date") +
  ylab("Tweets (cumulative)") +
  scale_x_date(date_labels = "%b %y", date_breaks = "2 month") +
  theme_minimal() +
  labs(title = "A") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

# Adding up all tweets per week
week_tweet <- ggplot(ts) +
  geom_point(aes(x=as.Date(created_at), y=week_cumulative), alpha = 0.1, color = "deeppink4") +
  geom_smooth(aes(x=as.Date(created_at), y=week_cumulative), se = FALSE, color = "deeppink4", 
              method = "gam", formula = y ~ s(x)) +
  xlab("Date") +
  ylab("Tweets (weekly)") +
  scale_x_date(date_labels = "%b %y", date_breaks = "2 month") +
  theme_minimal() +
  labs(title = "B") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

# Displaying both plots together
ggarrange(cumulative_tweet, week_tweet, ncol = 1, nrow = 2, align = "v")

Examining this figure, we observe that the weekly activity of Generation Islam fluctuates. Notably, weekly activity peaked in May 2021, which coincides with the 2021 Israel-Palestine crisis. This suggests a connection between political events and the increased engagement of Generation Islam during that period. It appears that they react quickly and intensify their communication when significant grievances affect Muslim populations around the world.
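To read a peak like this off the data rather than the plot, one can count tweets per calendar week and pick the maximum. The following is a minimal base R sketch on made-up dates; `toy_dates` is a hypothetical stand-in for as.Date(genislam$created_at).

```r
# Made-up tweet dates standing in for as.Date(genislam$created_at)
toy_dates <- as.Date(c("2021-05-10", "2021-05-11", "2021-05-14",
                       "2021-06-01", "2021-06-20"))

# Bin dates into weeks starting on Monday; each level is a week's start date
week_start <- cut(toy_dates, breaks = "week")

# Count tweets per week (weeks without tweets get a count of 0)
weekly_counts <- table(week_start)

# The week with the most tweets, labelled by its Monday
peak_week <- names(which.max(weekly_counts))
peak_week
```

Note that this counts tweets per week directly, whereas the week_cumulative variable above is a running count within each week.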

Topics and Discourses

Hashtags

Hashtags can reveal which Twitter discourses actors want to enter. Let us look at how Generation Islam uses hashtags and what that says about their communication, starting with their 50 most used hashtags:

# Creating a document-feature matrix that keeps only hashtags
tweet_dfm <- tokens(genislam$text, remove_punct = TRUE) %>%
  dfm() %>%
  dfm_select(pattern = "#*")

# Extract the 50 most used hashtags
toptag <- names(topfeatures(tweet_dfm, 50))

# Display the 50 most used hashtags in five columns of ten
toptag_table <- bind_cols("1-10" = toptag[1:10],
                          "11-20" = toptag[11:20],
                          "21-30" = toptag[21:30],
                          "31-40" = toptag[31:40],
                          "41-50" = toptag[41:50])

knitr::kable(toptag_table)

| 1-10 | 11-20 | 21-30 | 31-40 | 41-50 |
|:--|:--|:--|:--|:--|
| #stopmacron | #syria | #kopftuch | #uyghurs | #islamfeindlichkeit |
| #afghanistan | #gaza | #demohamburg | #uiguren | #savesilwan |
| #hanau | #gazaunderattack | #rassismus | #kabul | #islamhass |
| #palestine | #idlib | #freepalestine | #kopftuchverbot | #islamophobie |
| #china | #ukraine | #coronavirus | #breaking | #palestineunderattack |
| #ramadan | #palestinians | #corona | #cdu | #boycottfrance |
| #india | #palestinewillbefree | #jerusalem | #einestages | #alaqsaunderattack |
| #islam | #taliban | #israel | #palästina | #hijab |
| #madeinchina | #afd | #savesheikhjarrah | #bds | #skandalunion |
| #islamophobia | #uyghur | #france | #delhi | #xinjiang |

# Let's build a co-occurrence matrix of all hashtags as the basis for a network
tag_fcm <- fcm(tweet_dfm)

# Keep only the 50 most used hashtags and display the network
toptag_fcm <- fcm_select(tag_fcm, pattern = toptag)
set.seed(123)
textplot_network(toptag_fcm, min_freq = 0.1, edge_alpha = 0.8, edge_size = 5,
                 edge_color = "grey60")

We can distinguish four clusters in the network: the Israel-Palestine conflict (hashtags adjacent to #palestine), contexts of political violence towards or involving Muslims (e.g. #syria, #afghanistan, #india), China and the Uyghurs (hashtags adjacent to #china and #uyghurs), and Islamophobia, racism, and the right wing (hashtags adjacent to #islamophobie). The last cluster is especially interesting because it references issues, actors, and political actions. The issues include headscarf bans (#kopftuchverbot), racism (#rassismus, and #hanau, which refers to the far-right terrorist attack in Hanau, Germany), and Islamophobia (#islamhass and #islamophobia). These issues are linked to actors such as France and the French president, Emmanuel Macron; the Christian Democratic Union (CDU), a conservative, center-right party that has led the federal government of Germany for most of the post-Second World War period; and Alternative for Germany (AfD), a German far- and radical-right party. Political actions and demands, namely #boycottfrance and #stopmacron, are also tied to racism and Islamophobia and show how Generation Islam relates certain actors to these issues.
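The clusters rest on how often two hashtags appear in the same tweet. To make that idea concrete, here is a minimal base R sketch on made-up tweets. It illustrates document-level co-occurrence counting in general; I do not claim that it reproduces the exact counts quanteda's fcm returns, and `toy_tweets` is invented for illustration.

```r
# Made-up tweets for illustration only
toy_tweets <- c("#gaza #palestine",
                "#palestine #freepalestine #gaza",
                "#china #uyghurs")

# Extract the hashtags of each tweet
tags <- regmatches(toy_tweets, gregexpr("#[[:alnum:]_]+", toy_tweets))
all_tags <- sort(unique(unlist(tags)))

# Tweet-by-hashtag incidence matrix: 1 if the tweet contains the hashtag
incidence <- t(sapply(tags, function(x) as.integer(all_tags %in% x)))
colnames(incidence) <- all_tags

# Number of tweets in which each pair of hashtags occurs together
cooc <- crossprod(incidence)
diag(cooc) <- 0

cooc["#gaza", "#palestine"]
```

Pairs with high counts end up connected by strong edges in a network plot, while pairs that never share a tweet, such as #china and #gaza in this toy example, stay unconnected.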


  1. In the case of the IGMG, I used the Twitter data of its youth organization because it tweets in Turkish far less often than the parent organization. Youth organizations certainly communicate differently from their parent organizations, but including them lets us capture the ideological setting of the organization as a whole without including a language that I am not able to analyze.