New approaches to the study of Muslim societies in West Africa
The African continent remains underrepresented in the field of Digital Humanities (DH). Yet DH has much to offer African history and the study of Muslim societies in West Africa, opening up alternative forms of knowledge production for both public scholarship and research.
In the medium term, the Collection will bring new perspectives to the scholarly conversation about Islam in West Africa by opening up avenues of inquiry through the lens of DH. Most digital history projects have tended to publish digitised primary sources without making explicit historical interpretations and arguments. Therefore, this project will not only digitise and catalogue a collection of sources on different countries, but will also explore computational methods to engage with printed sources to develop new arguments about the history of Islam and Muslims in the region.
The textual analysis of the dataset will examine how press coverage of Islam and Muslims in national newspapers has changed over the past 60 years. What are the main topics covered in Islamic publications, and how have they changed over time? The results will provide new insights into the development of Muslim communities and the place of Islam in the wider society.
The Collection will also offer various visualisations of the material that bring out the intertwined histories of Muslim communities in the region. A network diagram, for example, will show how geographically dispersed Islamic groups and actors travel and collaborate across West Africa.
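One common way to prepare such a diagram is to link any two actors that are mentioned in the same article and to weight each link by the number of shared articles. The sketch below is illustrative only: the sample records, the placeholder actor names and the `cooccurrence_edges` helper are hypothetical and are not taken from the Collection or from the project's own code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: each inner list holds the actors tagged in one article.
articles = [
    ["Association A", "Association B", "Imam X"],
    ["Association A", "Association B"],
    ["Association B", "Imam X"],
    ["Association C"],
]


def cooccurrence_edges(records):
    """Count how often each pair of actors is tagged in the same article."""
    edges = Counter()
    for actors in records:
        # Sort and de-duplicate so (A, B) and (B, A) count as the same edge.
        for pair in combinations(sorted(set(actors)), 2):
            edges[pair] += 1
    return edges


edges = cooccurrence_edges(articles)
for (source, target), weight in edges.most_common():
    print(f"{source} -- {target}: {weight}")
```

The resulting weighted edge list can then be handed to a graph library or visualisation tool to draw the diagram. Actors that never appear alongside anyone else, such as "Association C" here, produce no edges and would need to be added as isolated nodes if they should be displayed.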
Distant reading is a methodological approach in the social sciences and DH that uses computational techniques to analyse large textual datasets. It originated in literary studies, where it was introduced by Franco Moretti, and contrasts with "close reading", which involves in-depth analysis of individual texts. Distant reading allows researchers to identify patterns, themes and phenomena that may not be readily observable using traditional qualitative methods. Algorithms and statistical methods such as topic modelling, sentiment analysis and network analysis are commonly used in this approach.
Distant reading has its limitations, however. The method's reliance on algorithms and computational models can inadvertently obscure nuances of meaning, context and tone that a close reading might capture. In addition, the quality and representativeness of the data can affect the reliability of the findings. As a result, researchers often use distant reading in conjunction with other methods to provide a more comprehensive analysis.
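As a very simple illustration of distant reading, the sketch below tracks how the relative frequency of a term changes from decade to decade. It is a minimal example under stated assumptions, not the project's method: the four sample texts are made up, and the `term_counts_by_decade` and `relative_frequency` helpers are hypothetical names introduced here.

```python
from collections import Counter, defaultdict

# Hypothetical sample: (publication date, lemmatised text) pairs.
articles = [
    ("1975-03-12", "pèlerinage mecque départ pèlerin"),
    ("1978-11-02", "mosquée construction quartier"),
    ("2004-01-20", "pèlerinage organisation pèlerin pèlerin"),
    ("2009-09-21", "ramadan fête mosquée"),
]


def term_counts_by_decade(records):
    """Group whitespace-separated tokens by the decade of publication."""
    counts = defaultdict(Counter)
    for date, text in records:
        decade = int(date[:4]) // 10 * 10  # e.g. 1975 -> 1970
        counts[decade].update(text.split())
    return counts


def relative_frequency(counts, term):
    """Share of all tokens in each decade that are equal to `term`."""
    return {
        decade: counter[term] / sum(counter.values())
        for decade, counter in sorted(counts.items())
    }


counts = term_counts_by_decade(articles)
freq = relative_frequency(counts, "pèlerin")
for decade, share in freq.items():
    print(f"{decade}s: {share:.3f}")
```

Dividing by the total number of tokens per decade matters because the number of articles varies over time; raw counts would mostly reflect how much material survives from each period.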
To showcase the possibilities that DH offers for analysing the data in the Islam West Africa Collection, the following visualisations have been generated using two corpora:
- 970 press clippings from three Beninese newspapers (Daho-Express, Ehuzu, La Nation), published between 1970 and 2022, totalling 497,281 words;
- 2,287 press clippings from five Burkinabè newspapers (Carrefour africain, L'Observateur, L'Observateur Paalga, Sidwaya, Le Pays), published between 1962 and 2019, totalling 1,613,954 words.
Each corpus is held in a DataFrame in which each row represents a single article. The columns are as follows:
- The "Date" column contains the publication dates in the YYYY-MM-DD format.
- The "Titre" column holds the titles of the articles.
- The "Journal" column specifies the newspaper in which the article was published.
- The "Keywords" column includes keywords that describe the article's content, separated by vertical bars and spaces (' | ').
- The "Location" column lists geographical identifiers such as country, region, or city names, also separated by vertical bars and spaces (' | ').
- The "Contenu" column contains the full text of each article.
- The "Lemmatized_Contenu" column features the lemmatized version of the article text, with stop words removed.
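The column layout described above can be illustrated with a short sketch. It assumes the DataFrame is a pandas DataFrame (the library is not named in the text), and the two rows are made-up placeholders rather than records from the corpus.

```python
import pandas as pd

# Two made-up rows that follow the column layout described above.
df = pd.DataFrame(
    {
        "Date": ["1985-06-14", "2012-08-19"],
        "Titre": ["Titre fictif 1", "Titre fictif 2"],
        "Journal": ["Sidwaya", "La Nation"],
        "Keywords": ["Hadj | Mosquée", "Ramadan | Mosquée | Zakat"],
        "Location": ["Burkina Faso | Ouagadougou", "Bénin | Cotonou"],
        "Contenu": ["Texte intégral fictif.", "Autre texte fictif."],
        "Lemmatized_Contenu": ["texte intégral fictif", "texte fictif"],
    }
)

# Parse the YYYY-MM-DD strings into real dates and derive the year.
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")
df["Year"] = df["Date"].dt.year

# Turn the ' | '-separated cells into Python lists. A plain str.split is used
# so that the vertical bar is treated literally, not as a regex alternation.
for column in ["Keywords", "Location"]:
    df[column] = df[column].map(lambda cell: cell.split(" | "))

# One row per (article, keyword) pair, then count keywords per newspaper.
keyword_rows = df.explode("Keywords")
keyword_counts = keyword_rows.groupby(["Journal", "Keywords"]).size()
print(keyword_counts)
```

Splitting the multi-valued columns into lists first, and only then expanding them with `explode`, keeps the original one-row-per-article table intact for analyses that need the full text.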
Jupyter notebooks containing the full Python code are available on GitHub.