By : Muhammad Haziq bin Abd Rahim. Special thanks to Dr Ahmad Sanusi Azmi from USIM and Mr Zafree Zaharidan (International Mathematical Olympiad Malaysia) for reviewing this paperwork.

### Abstract

Tradition holds that Imam al Bukhari, the most prolific scholar of hadith in the entirety of Islamic history, was able to memorize up to six hundred thousand hadiths. While this account is regarded as a marvel of memorization, it is also a target of critical scrutiny. Polemicists often claim that the figure is unrealistically high and mathematically impossible to achieve, and they use it to poke holes in the narrative of a trustworthy hadith tradition. After all, they argue, if Bukhari judged fewer than 1 percent of 600 thousand hadiths to be sahih, then 99 percent of the total corpus must have been forged or otherwise unacceptable. The responses made so far to this seemingly paradoxical issue mainly defend the possibility of the figure, but none has tried to show mathematically how it is reached. In this paper, I present a solution to the problem of Bukhari’s memorization of 600 thousand hadiths using combinatorics. A combinatorial assessment of isnads easily yields realistic figures of memorized hadiths in the hundreds of thousands, and I maintain that this is indeed the only explanation of the problem.

### Background of Critique

The memorization of hadiths has been a main element in the development of ulum al hadith since its advent after the death of the Prophet. The credibility of scholars of hadith was often assessed by their ability to memorise a large corpus of hadiths (Siddiqi, 1993). The number of hadiths memorised by an individual scholar is said to range from tens of thousands to literally millions. Al Bukhari claimed that he made his collection of traditions out of six hundred thousand. According to Ibn Hanbal’s statement, over 7,000,000 traditions were sound, of which 6,000,000 were memorised by Abu Zur’ah (Azami, 1978).

At first glance, the figures do seem too high, especially when put into context. The number of verses in the entire Quran is roughly a hundred times smaller, that is, about two orders of magnitude lower. More perplexingly, Bukhari’s book contains only 7397 hadiths with repetition, and only 2602 hadiths without repetition (Khan, 1997). That is less than 0.5% of the 600 thousand hadiths from which Bukhari made his selection.

Critics of Islam have problematized this issue. Arguably the most notable is a young Egyptian YouTuber and blogger named Sherif Gaber (Gaber, 2017), whose video titled ‘The Lies of Al-Bukhary in Islam’ has drawn millions of views. His exposition, albeit faulty in all possible ways, introduced doubt about the possibility of Bukhari’s hadith memorization to the general public; it is no longer merely an academic topic discussed in ivory towers.

Critics of Islam have scrutinised this issue from two angles: 1) by doubting that the figure is true or realistically achievable in the first place, thus undermining Bukhari himself, supposedly the paragon of trustworthiness, by implying that he deliberately inflated the number of hadiths he memorised; and 2) by claiming that because an overwhelmingly large share of the corpus, up to 99 percent, is not accepted as reliable, the entire science of hadith is in serious doubt.

Point 1) is argued on a number of grounds, but the only ones worth discussing are those that are mathematical in nature, as follows:

- It is argued that even at a generously high rate of 14 hadiths examined daily, and ignoring his other commitments such as travelling and meeting other narrators, it would take Bukhari 110 years to examine all the hadiths, before even considering the work of sifting, filtering and finally writing. Bukhari, however, took only 16 years to finish his Sahih.
- It is simply far too unrealistic for a single person to memorise up to 38,000,000 words, which amounts to memorising some 240 volumes of books. Critics arrive at this figure by assuming an average number of words per hadith and multiplying it by 600,000. The accuracy of the estimate is questionable, but for the sake of discussion it is accepted as is in this paper.
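
The arithmetic behind these two contentions can be checked with a short sketch. It uses only the figures quoted above; the variable names are illustrative, and a 365-day working year with no rest days is an assumption. At 14 hadiths per day the division gives roughly 117 years, which is of the same order as the 110 years quoted by the critics.

```python
# Figures quoted by the critics (taken from the two points above).
total_hadiths = 600_000
hadiths_per_day = 14
total_words = 38_000_000
volumes = 240

# Assumption: examination happens every day of a 365-day year.
years_needed = total_hadiths / (hadiths_per_day * 365)
words_per_hadith = total_words / total_hadiths
words_per_volume = total_words / volumes

print(f"years at {hadiths_per_day}/day: {years_needed:.1f}")
print(f"implied words per hadith: {words_per_hadith:.1f}")
print(f"implied words per volume: {words_per_volume:,.0f}")
```

The implied average of about 63 words per hadith is simply what the critics' own totals entail; it is not an independently sourced figure.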

Point 2) assumes that all 600 thousand hadiths are separate, unique accounts of prophetic statements. Thus, if only 0.5 percent of all prophetic accounts are non-forgeries or reliable, then the social climate in which the early Muslims lived was simply too hostile for any dependable reliance on hadith.

The same point is made by Guillaume: “Bukhari’s biographer says that he selected his material from no less than 600,000 hadith. If we allow for repetitions which occur under different heads, he reduced this vast number of forgeries or dubious reports to less than 3000 hadith. In other words, less than one in every 200 traditions which circulated in his day could pass his test” (Guillaume, 1956, 91).

Next, we look at some responses made by Muslim apologists to counter this charge of unreliability.

### Responses to the Critique by Muslim Apologists

*Note: The word ‘apologist’ is not intended here with any negative connotation, as the word is often perceived. It comes from the Greek word ‘apologia’, “a well-reasoned reply; a ‘thought-out response’ to the accusations made” (Etymonline, 2017); thus apologists here refers to those who defend the faith.*

In addressing the critique, Muslim apologists have resorted to several lines of argument:

- They clarified that Sahih Bukhari is not an exhaustive collection of all sahih hadiths, nor did Imam Bukhari intend it to be. The 2602 hadiths are a mere fraction of the entire collection of sihahs (Talaat, 2018).
- They rebutted the charge of an unrealistic feat of memory by invoking scientific discussion of the memory capacity of the brain (Ezzeldiin, 2017).
- They contend that critics’ analyses often assume that all the hundreds of thousands of hadiths consist of separate matn (narration) and fail to account for another main element of ulum al hadith: the science of isnad (Ezzeldiin, 2017).

In short, the responses by Muslims seek to open the door to the possibility of such an extreme feat of memorisation, while admitting that it is indeed extraordinary. Some humans in recorded history really did have extraordinary achievements far ahead of the curve, so why deny Bukhari’s?

However, it is my personal assessment that these responses did not manage to demonstrate how the figures came to be. Indeed we are supposed to account for isnads, but how are they counted such that we can reliably reach hundreds of thousands, a hundred times the total number of hadiths found in the kutub sittah without repetition (Manna, 2020)?

In fact, scholars have also found these figures perplexing. Mustafa al Azami, in his book Studies in Early Hadith Literature (1978), says:

“The actual number of traditions preserved in the Sihah and the other collections is only a small fraction of the body of the traditions described above. This is a puzzling problem. Many scholars have been perplexed, and so have reached very strange conclusions”

Al Zarkashi (d. 794H) put forward two possible interpretations of the hundreds of thousands of hadiths memorised: the figure could refer to the multiplicity of hadiths with their different channels and chains, or to something more general that includes prophetic hadiths as well as the sayings of the companions and the salaf (al Zarkashi, 1998).

It is presumed that one matan with 10 different channels is counted as 10 separate hadiths. But even if every hadith in Sahih Bukhari had 10 separate channels, the total would still fall short by roughly a factor of 10. Moreover, an overwhelming portion of the hadith corpus consists of hadith Ahad (Hallaq, 1999). This explanation therefore does not seem sufficient.

### Bukhari’s Memorization Test

A key to understanding this issue lies in the famous account of a test that the scholars of Baghdad set for Bukhari when he arrived there. As told by Ibn Kathir (d. 774H) in Ikhtisar Ulum Al Hadith:

The first is the account we heard of al-Bukhari when he arrived in Baghdad. Where they placed the chain of one narration with a different text, and placed the text of a hadith with a different chain. Another example is a hadith which is related from Salim from Nafi’, and it being related from Nafi’ from Salim; this is from the second type. In this form there are approximately one hundred hadith or more, when they were read upon him he returned each hadith to its chain, and each chain to its text. They conceded his superiority.

From Suhaib Hassan:

The traditionists, in order to test their visitor, al-Bukhari, appointed ten men, each with ten ahadith. Now, each hadith (text) of these ten people was prefixed with the isnad of another. Imam al-Bukhari listened to each of the ten men as they narrated their ahadith and denied the correctness of every hadith. When they had finished narrating these ahadith, he addressed each person in turn and recounted to him each of his ahadith with its correct isnad. This trial earned him great honour among the scholars of ahadith (Hasan, 1995)

This particular story is used as an example of a category of hadith called Maqlub (reversed). According to Ibn Kathir:

A hadith is known as maqlub (changed, reversed) when its isnad is grafted to a different text or vice versa, or if a reporter happens to reverse the order of a sentence in the text

Al Bayquni (d. 1080 AH) divided maqlub hadith into two categories: maqlub al sanad (reversal in the chain of narration) and maqlub al matan (reversal in the narration).

Maqlub al sanad may happen in these situations:

- A change in the narrator.
- A change in the name of the narrator.
- A reversal in the sequence of the chain of narrators.

While Maqlub al matan may happen in these situations:

- When one part of the matan is placed before another part when it should be the other way around.
- When the sanad of one hadith is attached to a matan to which it does not belong.

A very important point in the story of Bukhari’s memorization trial is that all one hundred of the faulty maqlub hadiths, with their sanads and matans scrambled, and quite possibly with narrators rearranged within a chain, are still called and identified as hadiths. A maqlub hadith may be classified as weak (daif) or even forged (maudhu’), but it is still considered a hadith (Ibn Kathir).

This story also tells us about the nature of Bukhari’s memorization. Rather than brute-force memorisation akin to memorising the digits of pi, Bukhari’s memorisation of hadiths leans towards a contextual, logical kind. A hadith with a reversed chain of narrators cannot be right because one narrator was born after the other; two particular narrators cannot be in the same chain because they never met; and so on. This still requires amazing powers of memory and intense mastery of the biographies of the narrators as well as the narrations, but this line of thinking sheds a more rational light on the question of whether memorising six hundred thousand hadiths is possible.

Now we come to the question most fundamental to the issue of the extremely high number of hadiths in circulation and of those recorded as memorised:

**If different arrangements of matans and isnads and their components are indeed still considered hadiths, how many different arrangements are possible, and how many of these arrangements were accounted for and were in circulation?**

Analysing this problem calls for the mathematical tool of combinatorics to be utilised.

### Calculation of possible permutations of chains of hadiths

Hadith is a combinatorially sensitive discipline. Only one particular arrangement of a matan and its chain of narrators may be accepted (maqbul); the others are rejected (mardud). The order matters, and each narrator in a sanad chain should occur only once, without repetition.

Let us consider a general case of a single sanad chain hadith consisting of five narrators in a chain:

Figure 1: A sanad chain consisting of 5 narrators

How many different permutations are possible? A tedious way to find out is to list all the permutations and count them one by one. Fortunately, we can simply use the factorial, which counts permutations without repetition.

5! = 5 × 4 × 3 × 2 × 1 = 120 different permutations, of which only one is correct.

But we also need to consider the case of severance (al inqita’), that is, an incomplete isnad in which one, two or more links are severed.

Figure 2: A sanad chain with one narrator being severed

Consider the case of a single narrator in the chain being omitted. This means choosing 4 from a pool of 5, with no repetition and with order mattering. How many permutations are there?

P(5, 4) = 5!/(5 - 4)! = 120 permutations

Figure 3: A sanad chain with two narrators being severed

Consider the case of two narrators in the chain being omitted. This means choosing 3 from a pool of 5, with no repetition and with order mattering. How many permutations are there?

P(5, 3) = 5!/(5 - 3)! = 60 permutations

So far, for a single hadith chain with 5 narrators, we already have 120 + 120 + 60 = 300 permutations. However, some hadiths have multiple sanad chains (Rahmadi et al., 2022).
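
The three counts can be reproduced and cross-checked by explicit enumeration. The following is a minimal sketch; the labels `N1` to `N5` are placeholders rather than real narrators, and Python 3.8 or later is assumed for `math.perm`.

```python
from itertools import permutations
from math import factorial, perm

# Placeholder labels for a five-narrator chain (not real narrators).
chain = ["N1", "N2", "N3", "N4", "N5"]

full = factorial(len(chain))        # all 5 narrators, in any order
one_missing = perm(len(chain), 4)   # ordered selections of 4 out of 5
two_missing = perm(len(chain), 3)   # ordered selections of 3 out of 5

# Cross-check the closed-form counts by listing every arrangement.
assert full == len(list(permutations(chain)))
assert one_missing == len(list(permutations(chain, 4)))
assert two_missing == len(list(permutations(chain, 3)))

total = full + one_missing + two_missing
print(full, one_missing, two_missing, total)  # 120 120 60 300
```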

Figure 4: A hadith with multiple sanad chains

In addition to their individual permutations, every channel of transmission is counted as a separate hadith, as discussed by Azami (1978):

‘Abd Rahman bin Mahdi (d 198) says “I have thirteen traditions from al Mughirah transmitting from the Prophet, concerning ‘al-mash ‘ala al Khuffain”. It is obvious that al Mughirah is reporting a single action or habit of the Prophet. It does not matter how many times this action was repeated. It would be reported as a single action. As this single action is reported to ‘Abd al Rahman b. Mahdi from thirteen channels, he counts them as thirteen traditions”

Taking into account possible permutations from these multiple sanad chains, the number of permutations for one narration could very well reach a thousand or more.
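
As a rough illustration of how the count grows with several chains, the sketch below sums the per-chain arrangements, allowing up to two severed narrators per chain. The function name `arrangements` and the example of four chains with five narrators each are hypothetical, and the sketch deliberately ignores mix-ups between chains, which would raise the count further.

```python
from math import perm

def arrangements(chain_length: int, max_severed: int = 2) -> int:
    """Count ordered arrangements of one chain, allowing up to
    `max_severed` narrators to be omitted (order matters, no repeats)."""
    limit = min(max_severed, chain_length)
    return sum(perm(chain_length, chain_length - d) for d in range(limit + 1))

# Hypothetical hadith transmitted through four chains of five narrators each.
chain_lengths = [5, 5, 5, 5]
total = sum(arrangements(n) for n in chain_lengths)
print(total)  # 1200
```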

Recall as well that inqilab can also occur within the name of a narrator. For example, a narrator named Sa’ad bin Sinan may mistakenly be named Sinan bin Sa’ad. There are also cases of substituting a narrator from outside the sanad: for example, rather than narrating from Zuhri from Abi Salamah, the narrator mistakenly says Zuhri from Sa’id (Ahmad Ayoup, 2016). Because mistakes of this kind have no definite limit, in that the mistaken name could come from the entire set of existing narrators or be some other random name, the possible permutations cannot be bounded.

If we include maqlub al matan, that is, inversion in the matan, then the number of possible permutations may reach astronomical amounts. It is also worth pointing out that many hadiths in Sahih Bukhari have chains of more than five narrators; the longest is said to consist of nine narrators (AbdulWahid & Ahmad, 2008).

However, to be conservative, let us grant that the number of permutations for one chain of hadith narration is exactly 100. As shown above, this figure is at the much lower end. If Bukhari memorised only some 6000 unique, non-repeated narrations with single isnads, then the total easily amounts to 600 thousand permutations of hadiths.

Given the nature of permutations, we can easily understand the assertion that a large proportion of these 600 thousand are not sahih.
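
Under the stated assumptions (6000 unique narrations, 100 arrangements each, exactly one correct arrangement per narration), the proportion of rejected arrangements follows directly. The sketch below is only this arithmetic; the variable names are illustrative.

```python
unique_narrations = 6_000          # assumed count from the text
arrangements_per_narration = 100   # conservative figure granted in the text

total_arrangements = unique_narrations * arrangements_per_narration
correct_arrangements = unique_narrations  # one correct arrangement each
rejected_share = 1 - correct_arrangements / total_arrangements

print(total_arrangements)        # 600000
print(f"{rejected_share:.0%}")   # 99%
```

In this model the 99 percent figure is a consequence of counting arrangements, not a measure of how many distinct prophetic reports were forged.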

### Is Permutation Memorization?

Let us assume that we have a computer holding a complete database of hadiths, and let us imagine holding a trial akin to the one Bukhari went through in Baghdad.

We give the computer a narration with its accompanying sanad, and we ask it to verify and perhaps correct the sanad if required. The catch is, the sanad is one of the rejected maqlub permutations.

Figure 5: An example of a wrong sanad permutation

If the database system is a simple, naive one, perhaps without any machine learning, then the computer must first find the narration in its database, then go through hundreds of entries of its permutations, each already graded as either correct or incorrect, then determine that the given permutation is indeed incorrect, and finally output the correct one.

Under this model, each entry of a sanad permutation for that particular narration is considered a hadith, although the overwhelming majority are rejected ones. The database of the totality of hadith would therefore be extremely massive, reaching perhaps billions of entries.

Obviously, a better, smarter database system would need to be developed for a machine to handle that sort of trial: one that is much more context sensitive and can be trained to find patterns rather than brute-forcing through billions of data points.

From this point of view, we may consider the early muhaddithin as living, intelligent databases of hadith: by inferring patterns and by incorporating and contextualising external information such as the biographies of narrators, they were able to analyse and evaluate seemingly endless permutations of hadiths.
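
A minimal sketch of such a naive lookup is given below, assuming a single hypothetical narration with five placeholder narrators. The names `build_naive_table`, `verify` and `narration-1` are illustrative only; the sketch stores every ordering of the chain with a correct or incorrect grade and scans the table to recover the correct one.

```python
from itertools import permutations

# Hypothetical narration with placeholder narrator labels (not real data).
correct_chain = ("N1", "N2", "N3", "N4", "N5")

def build_naive_table(chain):
    """Store every ordering of the chain, graded correct (True) or not."""
    return {p: (p == chain) for p in permutations(chain)}

database = {"narration-1": build_naive_table(correct_chain)}

def verify(narration_id, claimed_chain):
    """Look up the claimed chain; return (is_correct, correct_chain)."""
    table = database[narration_id]
    is_correct = table.get(tuple(claimed_chain), False)
    # Scan the graded entries to find the single correct arrangement.
    correct = next(c for c, ok in table.items() if ok)
    return is_correct, correct

ok, fixed = verify("narration-1", ["N2", "N1", "N3", "N4", "N5"])
print(ok, fixed)  # False ('N1', 'N2', 'N3', 'N4', 'N5')
```

Even for this one narration the table holds 120 entries, which illustrates why storing every arrangement explicitly does not scale.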
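
One way to picture a more context-sensitive check is to store biographical data once and test any claimed chain against it, instead of storing every arrangement. The sketch below is an assumption-laden illustration: the lifespans are invented, the labels are placeholders, and the rule that a transmitter is always younger than his source is a simplification that real isnad criticism does not strictly follow.

```python
from itertools import permutations

# Hypothetical (birth, death) years for placeholder narrators, listed from
# the earliest source (N1) to the latest transmitter (N5). Not real data.
lifespans = {
    "N1": (10, 70),
    "N2": (40, 110),
    "N3": (90, 160),
    "N4": (140, 210),
    "N5": (190, 256),
}

def is_plausible(chain):
    """Reject a chain if a transmitter is older than his source
    or was born after the source had died (they could not have met)."""
    for source, transmitter in zip(chain, chain[1:]):
        source_birth, source_death = lifespans[source]
        transmitter_birth, _ = lifespans[transmitter]
        if transmitter_birth < source_birth:
            return False  # simplifying assumption: sources are older
        if transmitter_birth >= source_death:
            return False  # lifetimes do not overlap
    return True

plausible = [c for c in permutations(lifespans) if is_plausible(c)]
print(len(plausible), plausible[0])  # 1 ('N1', 'N2', 'N3', 'N4', 'N5')
```

With five biographical records the check rules out 119 of the 120 orderings without any of them being stored, which is the kind of contextual reasoning the Baghdad account suggests.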

### Possibility of Combinatorics As the Method Used By Bukhari to Calculate the Total Number of Hadiths Memorized

What is the likelihood that this very method of calculating permutations is the one Bukhari employed to arrive at the figure of 600 thousand hadiths memorised? Did he know of this method of calculation?

A cursory look at the history of combinatorics reveals that it has been developing since the 16th century BC, and that in the Arab world al-Khalil ibn Ahmad al-Farahidi, who lived 92 years before Bukhari, employed it in the study of language to find the possible arrangements of letters to form syllables (Alkiyumi, 2023). It is therefore quite plausible that the muhaddithin used combinatorics to find the possible arrangements of narrators in isnads, although evidence of its actual use in their writings and traditions is a matter for further research.

The numbers of hadiths that the muhaddithin claimed to have memorised, reaching hundreds of thousands or even millions, may not be purely theoretical. That is, they may not stem purely from a calculation of possible permutations, because for a hadith with 9 narrators in its chain the number of permutations (9! = 362,880) is already extremely large. The figures may be both theoretical and practical: over decades of learning and teaching hadiths, the muhaddithin found permutational mistakes of this kind happening all the time among their students, hadith practitioners and the general public, and they decided to account for them in the biological database residing in their brains and hearts.
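
To see how quickly the purely theoretical count grows with chain length, the sketch below tabulates the number of full orderings and the number of orderings when up to two narrators may be omitted, for chains of five to nine narrators. The cap of two omissions is an assumption carried over from the five-narrator example.

```python
from math import factorial, perm

for n in range(5, 10):
    full = factorial(n)
    # Full chain plus chains with one or two narrators omitted.
    with_severance = sum(perm(n, n - d) for d in range(3))
    print(f"{n} narrators: {full:>7,} full orderings, "
          f"{with_severance:>7,} with up to two omitted")
```

For nine narrators the full orderings alone number 362,880, so a single long chain would, in theory, account for more than half of the 600 thousand figure.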

### Conclusion and Further Research

In summary, the traditional account of the seemingly extremely high number of hadiths memorised by the muhaddithin, in particular Bukhari, may be explained by taking combinatorial considerations into account. It is shown that the famous story of the hadith trial undergone by Bukhari serves as a definite proof of the element of combinatorics in hadith memorisation. Extending this, it is shown that the number of permutations of hadith chains easily reaches hundreds of thousands, and that each is considered a hadith. Looking at the issue from the angle of combinatorics also helps to explain how there can be an overwhelmingly large proportion of weak or rejected hadiths compared to sahih or accepted ones. It is not because of a culture of mass hadith forgery, but rather a byproduct of mathematical analysis.

Perhaps an interesting avenue of research would be to develop an intelligent, dynamic hadith database that can pass Bukhari’s test, and to compare the structure of its data processing framework with the model of the human mind. That could potentially help shed more light on this perplexing question of the numbers of hadiths memorised by the muhaddithin.

### References

AbdulWahid, A., & Ahmad, N. (2008). *The longest Chain with al Bukhari*. Subulassalaam. https://subulassalaam.com/articles/article.cfm?article_id=67

Ahmad Ayoup, M. M. (2016). *IRSYAD AL-HADITH SIRI KE-89 : HADITH MAQLUB*. Pejabat Mufti Wilayah Persekutuan. https://muftiwp.gov.my/en/artikel/irsyad-al-hadith/1088-irsyad-al-hadith-siri-ke-89-hadith-maqlub

al Bayquni, U. (1999). *Al-Manzouma Al-Biquniyyah* (1st ed.). Dar Al-Mughni.

Alkiyumi, M. (2023, April 20). The creative linguistic achievements of Alkhalil bin Ahmed Al-Farahidi, and motives behind his creations: A case study. *Cogent Arts and Humanities*, *10*(1).

al Zarkashi, B. a.-D. (1998). *An-Nukat ‘ala Muqaddimah Ibn Salah* (1st ed.). Adwaa al Salaf.

*apologist | Etymology of apologist by etymonline*. (2017, September 28). Etymonline. Retrieved September 5, 2024, from https://www.etymonline.com/word/apologist

Azami, M. M. (1978). *Studies in Hadith Methodology and Literature*. American Trust Publications.

Ezzeldiin, S. (2017). *Impalement: The atheist sherif gaber & bukhari mistake*. youtube.com. https://www.youtube.com/watch?v=vH8LbIGVYaY&t=834s

Gaber, S. (2017). *The Lies of Al-Bukhary in Islam*. youtube.com. https://www.youtube.com/watch?v=qVCZ4FjYL0g&t=666s

Guillaume, A. (1956). *Islam*. Penguin Books.

Hallaq, W. (1999). The Authenticity of Prophetic Ḥadîth: A Pseudo-Problem. *Studia Islamica*, *89*, 75-90. https://doi.org/10.2307/1596086

Hasan, S. (1995). *An Introduction to the Science of Hadith* (1st ed.). Darussalam.

Ibn Kathir, A. a.-F. (n.d.). *Ikhtisar Ulum al Hadith*. Kalemah.

Khan, M. M. (1997). *Ṣaḥīḥ Al-Bukhārī: The Translation of the Meanings of Sahih Al-Bukhari : Arabic-English* (M. M. Khan, Trans.). Darussalam.

Manna, M. (2020). *Brief Information About The Kutub As-Sitta*. Saheehus-Seerah. authenticseerah.com

Rahmadi, Y., Dini, S. K., Achmad, F., & Atina, A. (2022). Exploring the relationship between hadith narrators in Book of Bukhari through SPADE algorithm. *MethodsX*, *9*. https://doi.org/10.1016/j.mex.2022.101850

Siddiqi, M. Z. (1993). *Ḥadīth Literature: Its Origin, Development and Special Features* (A. H. Murad, Ed.). Islamic Texts Society.

Talaat, H. (2018). *Sherif Gaber: the icon of Atheism and ignorance*. youtube.com. https://www.youtube.com/watch?v=G5AFecCclgQ