Temporal analysis can reveal important trends and patterns in the corpus over time, showing how certain themes or topics have gained or lost prominence.
Top 10 most frequent keywords
The graphs presented here show the yearly frequency of the top ten most frequent keywords in each corpus, as derived from the Dublin Core Subject metadata. This visualisation offers insights into the dynamics of keyword prevalence, highlighting three main trends:
- Variability: The frequency of some keywords remains relatively stable across years, while others show marked fluctuations. This variability may reflect changes in the focus of media coverage or shifts in public interest.
- Dominance: Certain keywords appear consistently more often than others, underlining their continued relevance or prominence within the corpus. This dominance suggests that these issues are central to the discourse captured in the dataset.
- Temporal shifts: Notable spikes or dips in the frequency of certain keywords in particular years may signal broader socio-political or cultural changes that influence discourse. These temporal shifts provide valuable clues to understanding the context and evolution of the topics discussed.
It is important to note that the attribution of these keywords was done manually and the selection is not exhaustive. Therefore, the analysis presented here should be seen as an initial overview that deserves further, more detailed examination to fully appreciate the complexities and subtleties of the dataset.
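The frequency computation behind these graphs can be sketched in miniature. Assuming each article reduces to a (year, subject) pair extracted from its Dublin Core metadata (the records below are invented for illustration, not drawn from the IWAC dataset), one counter over subjects selects the top keywords and a second over (year, keyword) pairs yields the yearly series that gets plotted:

```python
from collections import Counter

# Invented (year, subject) pairs standing in for the Dublin Core
# Subject/Date metadata extracted from each article.
records = [
    (2015, "Ramadan"), (2015, "Ramadan"), (2015, "Aïd al-Adha"),
    (2016, "Ramadan"), (2016, "Terrorism and radicalization"),
    (2017, "Terrorism and radicalization"), (2017, "Terrorism and radicalization"),
]

# Overall counts decide which keywords make the top-N cut ...
top_keywords = [kw for kw, _ in Counter(s for _, s in records).most_common(2)]

# ... and a second Counter over (year, keyword) gives the yearly frequencies.
yearly = Counter((year, subj) for year, subj in records if subj in top_keywords)

print(sorted(top_keywords))
print(yearly[(2017, "Terrorism and radicalization")])  # → 2
```

The same two-step shape (global top-N selection, then per-year grouping) is what the pandas code below performs at corpus scale.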
A comparative analysis of keyword frequency trends in Benin and Burkina Faso reveals both convergences and divergences that reflect the media landscape of each country. In both countries, Islamic holidays, in particular Aïd al-Adha, Ramadan and Aïd el-Fitr, emerge consistently in the media discourse, suggesting sustained coverage of these observances. The observed variability in keyword frequencies in both datasets may signal shifts in societal interests, media narratives, or the impact of external events in the region.
However, the data also highlight different emphases within each country's media focus. Burkina Faso's corpus specifically highlights the figure of Oumarou Kanazoé and themes of terrorism, with a marked escalation in the frequency of the keyword "Terrorism and radicalization" in recent years. This increase suggests that such events have had a significant impact on public discourse. Interestingly for Benin, despite growing concerns about the infiltration of jihadist movements in the Gulf of Guinea region, this issue is not prevalent in the Beninese data.
Furthermore, the presence of terms such as Association des Élèves et Étudiants Musulmans au Burkina and Islamic faith-based education in the Burkina Faso graph underlines a strong interest in Islamic education and Islamic organisations. This aspect is less pronounced in Benin's discourse, suggesting possible differences in the social and educational structures relating to Islamic communities in the two countries.
Such differences and patterns are not only indicative of different national narratives, but also of the different degrees of emphasis given to certain social, political and religious issues in the media of each country.
Python code
To ensure the transparency and reproducibility of our analysis, the following Python code snippet demonstrates how we interacted with the IWAC API to retrieve the necessary data and perform the keyword frequency analysis.
```python
import requests
import pandas as pd
import plotly.graph_objs as go
from tqdm.auto import tqdm
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed

# Function to fetch all pages of a single item set
def fetch_data(api_url, item_set_id):
    page = 1
    items = []
    while True:
        response = requests.get(f"{api_url}/items", params={"item_set_id": item_set_id, "page": page})
        data = response.json()
        if data:
            items.extend(data)
            page += 1
        else:
            break
    return items

# Function to fetch and process data for all item sets in a country
def fetch_and_process_data(api_url, item_sets):
    all_items = []
    # Use ThreadPoolExecutor to parallelise requests
    with ThreadPoolExecutor(max_workers=5) as executor:
        future_to_id = {executor.submit(fetch_data, api_url, set_id): set_id for set_id in item_sets}
        for future in as_completed(future_to_id):
            all_items.extend(future.result())
    # Process items to extract subjects and date
    processed_data = []
    for item in all_items:
        subjects = [sub['display_title'] for sub in item.get('dcterms:subject', []) if sub.get('display_title')]
        date = item.get('dcterms:date', [{}])[0].get('@value')
        for subject in subjects:
            processed_data.append({
                'Subject': subject,
                'Date': pd.to_datetime(date, errors='coerce')
            })
    return pd.DataFrame(processed_data)

# Function to create an interactive keyword graph for each country
def create_interactive_keyword_graph(df, country, output_filename):
    top_keywords = Counter(df['Subject']).most_common(10)
    top_keywords = [keyword for keyword, count in top_keywords]
    df_top_keywords = df[df['Subject'].isin(top_keywords)]
    df_grouped = df_top_keywords.groupby(
        [df_top_keywords['Date'].dt.year, 'Subject']).size().reset_index(name='Frequency')
    fig = go.Figure()
    for keyword in top_keywords:
        df_keyword = df_grouped[df_grouped['Subject'] == keyword]
        fig.add_trace(go.Scatter(
            # The grouped 'Date' column holds plain year numbers; convert them
            # back to datetimes so the "date"-type axis renders them correctly
            x=pd.to_datetime(df_keyword['Date'].astype(int).astype(str), format='%Y'),
            y=df_keyword['Frequency'],
            mode='lines+markers',
            name=keyword
        ))
    fig.update_layout(
        title=f"Annual Frequency of Top 10 Keywords in {country}",
        xaxis=dict(title="Year", rangeslider=dict(visible=True), type="date"),
        yaxis=dict(title="Frequency")
    )
    fig.write_html(f"{output_filename}_{country}.html", full_html=True, include_plotlyjs='cdn')

# Example usage
api_url = "https://iwac.frederickmadore.com/api"
country_item_sets = {
    "Bénin": ["2187", "2188", "2189"],
    "Burkina Faso": ["2200", "2215", "2214", "2207", "2201"]
}

# Process and create graphs for each country
for country, item_sets in country_item_sets.items():
    df = fetch_and_process_data(api_url, item_sets)
    create_interactive_keyword_graph(df, country, "top_keywords_graph")
    tqdm.write(f"Interactive graph has been created for {country}.")
```
Multiple keyword comparison
These graphs show the annual frequencies of selected keywords (topics, Islamic associations and Muslim leaders), allowing for comparative analysis.
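Unlike the top-ten analysis, this comparison filters each article's subject entries against a fixed set of keyword IDs before counting. A minimal sketch of that filtering step, using invented item records shaped like the Omeka S JSON the script below consumes (the IDs and titles here are placeholders, not real IWAC identifiers):

```python
# Invented items mimicking the 'dcterms:subject' structure returned by
# the API; resource IDs and titles are placeholders for illustration.
items = [
    {"dcterms:subject": [
        {"value_resource_id": 898, "display_title": "Ramadan"},
        {"value_resource_id": 500, "display_title": "Elections"},
    ]},
    {"dcterms:subject": [
        {"value_resource_id": 861, "display_title": "Aïd el-Fitr"},
    ]},
]

selected_ids = {898, 861}  # the keywords chosen for comparison

# Keep only subjects whose resource ID is in the selected set
matches = [
    sub["display_title"]
    for item in items
    for sub in item.get("dcterms:subject", [])
    if sub.get("value_resource_id") in selected_ids
]
print(matches)  # → ['Ramadan', 'Aïd el-Fitr']
```

Matching on numeric resource IDs rather than display titles avoids missing records when the same keyword is labelled inconsistently.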
Python code
```python
import requests
import pandas as pd
import plotly.graph_objs as go
from tqdm.auto import tqdm
from concurrent.futures import ThreadPoolExecutor, as_completed

# Fetch all pages of a single item set
def fetch_data(api_url, item_set_id):
    page = 1
    items = []
    while True:
        response = requests.get(f"{api_url}/items", params={"item_set_id": item_set_id, "page": page})
        data = response.json()
        if data:
            items.extend(data)
            page += 1
        else:
            break
    return items

# Fetch the display title of a single keyword record
def fetch_title_for_id(api_url, keyword_id):
    response = requests.get(f"{api_url}/items/{keyword_id}")
    data = response.json()
    return data.get('dcterms:title', [{}])[0].get('@value', 'Unknown Title')

def fetch_and_process_data(api_url, item_sets, selected_keyword_ids):
    all_items = []
    with ThreadPoolExecutor(max_workers=5) as executor:
        future_to_id = {executor.submit(fetch_data, api_url, set_id): set_id for set_id in item_sets}
        for future in tqdm(as_completed(future_to_id), total=len(item_sets), desc="Fetching item sets"):
            all_items.extend(future.result())
    processed_data = []
    selected_keyword_ids_set = set(map(int, selected_keyword_ids))  # Integer set for faster lookup
    for item in tqdm(all_items, desc="Processing items"):
        subjects = item.get('dcterms:subject', [])
        date = item.get('dcterms:date', [{}])[0].get('@value')
        for subject in subjects:
            if subject.get('value_resource_id') in selected_keyword_ids_set:
                processed_data.append({
                    'Subject': subject['display_title'],
                    'Date': pd.to_datetime(date, errors='coerce'),
                    'ID': subject['value_resource_id']
                })
    return pd.DataFrame(processed_data)

def create_interactive_keyword_graph(api_url, df, selected_keyword_ids, output_filename):
    if df.empty:
        print("No data available for the selected keyword IDs.")
        return
    # Fetch titles for the keywords
    keyword_titles = {str(kw_id): fetch_title_for_id(api_url, kw_id)
                      for kw_id in tqdm(selected_keyword_ids, desc="Fetching titles")}
    df_grouped = df.groupby([df['Date'].dt.year, 'Subject', 'ID']).size().reset_index(name='Frequency')
    fig = go.Figure()
    for keyword_id in selected_keyword_ids:
        if keyword_id in df['ID'].astype(str).unique():
            subject_title = keyword_titles[keyword_id]
            df_keyword = df_grouped[df_grouped['ID'] == int(keyword_id)]
            fig.add_trace(go.Scatter(
                # Convert the grouped year numbers back to datetimes for the "date"-type axis
                x=pd.to_datetime(df_keyword['Date'].astype(int).astype(str), format='%Y'),
                y=df_keyword['Frequency'],
                mode='lines+markers',
                name=subject_title  # Use the fetched title as the trace name
            ))
        else:
            print(f"No data found for ID {keyword_id}. Skipping this ID.")
    fig.update_layout(
        title="Annual Frequency of Selected Muslim Leaders in Burkina Faso",
        xaxis=dict(title="Year", rangeslider=dict(visible=True), type="date"),
        yaxis=dict(title="Frequency"),
        legend_title="Keyword Title"
    )
    fig.write_html(f"{output_filename}.html", full_html=True, include_plotlyjs='cdn')
    print(f"Interactive graph has been created. File saved as '{output_filename}.html'")

# Example usage
api_url = "https://iwac.frederickmadore.com/api"
all_item_sets = ["2200", "2215", "2214", "2207", "2201"]
selected_keyword_ids = ["898", "861", "944", "960", "947", "855", "1102", "1053", "912"]

df = fetch_and_process_data(api_url, all_item_sets, selected_keyword_ids)
create_interactive_keyword_graph(api_url, df, selected_keyword_ids, "selected_keywords_graph")
```