Theological Data Mining: July 2014

Last weekend I saw the Book of Mormon musical and it reminded me of a joke I heard on a Jewish radio show: "Why did God create Mormons?" ... "So that Christians could understand how Jews feel." The joke implies that the New Testament is analogous to the Book of Mormon, which most Christians reject. Ironically, modern Judaism has its own "new testament", the Talmud, which is also analogous in some ways. In fact, many of the differences between various religions can be attributed to differences in the holy books they consider authoritative (i.e., their "bible" canon).

So which books belong in the Bible?

I think that is the wrong question to ask, and I think it comes from the natural (but often irrational) human desire for certainty, facilitated by concrete, black-and-white category distinctions. The problem is that even if the writings themselves are divinely inspired, inerrant, and infallible, our ability to recognize and classify them as such is not. Thus, instead of regarding particular books as part of an authoritative canon, I think it's more useful to regard all of them as data sources and evaluate them accordingly.

Treating them like any other data sources, my answer to "Which books belong in the Bible?" is "as many as can practically fit". That could include the Tanach ("Old Testament"), Apocrypha, New Testament, Talmud, Gnostic writings, Qur'an, Book of Mormon, Bhagavad Gita, Tripitaka, and many others. It includes some that are very accurate and useful, some that are spurious and useless, and some that are largely unreliable yet contain a few useful data points. In other words, it's a lot like the data sources used by scientists (e.g., for things like weather prediction).

Data doesn't have to be perfect to be useful, especially for probabilistic beliefs. Even datasets that partially contradict each other can have value. For example, the New Testament, Talmud, and Qur'an all agree on some things and disagree on others. Thus, we can have a relatively high level of confidence in beliefs & doctrines on which they all agree, and lower confidence in beliefs where they contradict each other. Of course, that in no way implies that they are equally true or should be given equal weight. Not at all.
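To make the agreement idea concrete, here is a minimal sketch in Python. It assumes each source either affirms a claim, contradicts it, or is silent, and that each source carries a reliability weight. The function name, the source labels, and the weights are all hypothetical placeholders, not estimates for any real text.

```python
def agreement_confidence(votes, weights):
    """Weighted share of sources affirming a claim, among sources that address it.

    votes:   dict of source -> True (affirms), False (contradicts), None (silent)
    weights: dict of source -> non-negative reliability weight (hypothetical values)
    Returns a float in [0, 1], or None if no source addresses the claim.
    """
    affirming = sum(weights[s] for s, v in votes.items() if v is True)
    addressing = sum(weights[s] for s, v in votes.items() if v is not None)
    if addressing == 0:
        return None
    return affirming / addressing


# Placeholder weights: unequal on purpose, since agreement does not imply equal weight.
weights = {"source_a": 3.0, "source_b": 2.0, "source_c": 1.0}

unanimous = {"source_a": True, "source_b": True, "source_c": True}
disputed = {"source_a": True, "source_b": False, "source_c": True}

print(agreement_confidence(unanimous, weights))  # all sources agree
print(agreement_confidence(disputed, weights))   # lower: one weighted source dissents
```

A claim that every source affirms scores 1.0, while a disputed claim scores lower in proportion to the weight of the dissenting sources. The unequal weights reflect the point that agreement does not make the sources equally reliable.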

It's impossible to read every book ever written about God, many of which contain mostly noise. My solution, as with other types of data, is to start with those that are the most accurate (according to history & archaeology), ancient, widely accepted, and relevant, then add more. By my estimates, that usually means starting with the Torah and Nevi'im ("Law and Prophets" -- which apparently were also Jesus' primary written data sources). They are the most widely accepted and ancient, and they make claims of divine inspiration that can be scientifically tested. If there's room for more data, I then add the Ketuvim ("Writings"), Apocrypha, New Testament, and Mishnah. Then the Essene writings, Jewish Pseudepigrapha, early Jewish writers (Philo, Josephus, Targumim, etc.), Ante-Nicene Fathers, and Gemara. Beyond those, I think the data gets very noisy but still has value in some cases.
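The selection procedure described above can be sketched roughly as follows. This is only an illustration: the equal-weight average of the four criteria, the reading-time budget, and every score and name below are hypothetical assumptions, not the actual estimates behind the ordering given in this post.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    accuracy: float    # agreement with history & archaeology, 0..1 (hypothetical)
    antiquity: float   # older scores higher, 0..1 (hypothetical)
    acceptance: float  # how widely accepted, 0..1 (hypothetical)
    relevance: float   # 0..1 (hypothetical)
    cost: int          # reading time in hours (hypothetical)

    def score(self) -> float:
        # Assumption: the four criteria count equally.
        return (self.accuracy + self.antiquity + self.acceptance + self.relevance) / 4


def select_sources(sources, budget):
    """Take sources in descending score order while they still fit the budget."""
    chosen = []
    remaining = budget
    for src in sorted(sources, key=lambda s: s.score(), reverse=True):
        if src.cost <= remaining:
            chosen.append(src.name)
            remaining -= src.cost
    return chosen


# Placeholder entries standing in for real texts.
sources = [
    Source("text_3", 0.3, 0.4, 0.2, 0.5, cost=50),
    Source("text_1", 0.9, 0.9, 0.9, 0.8, cost=40),
    Source("text_2", 0.7, 0.6, 0.5, 0.8, cost=30),
]

print(select_sources(sources, budget=80))
```

With a larger budget the noisier, lower-scoring sources get included too, which matches the idea of starting with the strongest data and adding more as room allows.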

This methodology is much different from that of many Christians, Jews, and Muslims, who derive much of their theology from the more recent and less widely accepted books, then interpret (and sometimes translate!!) the Torah and Prophets through the lens of those. That method may still lead to correct theology, but I find it less justifiable from a scientific perspective, and it has a tendency toward circular reasoning.

Thinking of the "Bible" as a collection of data sources also illustrates how unreasonable and unscientific some of the objections to it are. For example, many arguments against belief in God focus on alleged errors and contradictions in the Bible, usually about very insignificant details. Others make a big deal about the rejection of certain non-canonical books and the fact that some canonical books weren't accepted until long after they were supposedly written. Assuming those objections are valid (which is debatable), they're basically equivalent to "a few data points aren't perfect, some of the data contains noise, and you threw out a few data points that maybe you should've kept". In other words, it's like practically every other dataset that scientists rely on.

Saturday, July 5, 2014

Data Sources: Which Books Belong in the Bible?