Saturday, July 5, 2014

Data Sources: Which Books Belong in the Bible?

Last weekend I saw the Book of Mormon musical and it reminded me of a joke I heard on a Jewish radio show: "Why did God create Mormons?" ... "So that Christians could understand how Jews feel." The joke implies that the New Testament is analogous to the Book of Mormon, which most Christians reject. Ironically, modern Judaism has its own "new testament", the Talmud, which also is analogous in some ways. In fact, many of the differences between various religions can be attributed to differences in the holy books they consider authoritative (i.e., their "bible" canon).

So which books belong in the Bible?

I think that is the wrong question to ask, and I think it comes from the natural (but often irrational) human desire for certainty, facilitated by concrete, black-and-white category distinctions. The problem is that even if the writings themselves are divinely inspired, inerrant, and infallible, our ability to recognize and classify them as such is not. Thus, instead of regarding particular books as part of an authoritative canon, I think it's more useful to regard all of them as data sources and treat them as such.

Treating them like any other data sources, my answer to "Which books belong in the Bible?" is "as many as can practically fit". That could include the Tanach ("Old Testament"), Apocrypha, New Testament, Talmud, Gnostic writings, Qur'an, Book of Mormon, Bhagavad Gita, Tripitaka, and many others. It includes some that are very accurate and useful, some that are spurious and useless, and some that are largely unreliable yet contain a few useful data points. In other words, it's a lot like the data sources used by scientists (e.g., for things like weather prediction).

Data doesn't have to be perfect to be useful, especially for probabilistic beliefs. Even datasets that partially contradict each other can have value. For example, the New Testament, Talmud, and Qur'an all agree on some things and disagree on others. Thus, we can have a relatively high level of confidence in beliefs & doctrines on which they all agree, and lower confidence in beliefs where they contradict each other. Of course, that in no way implies that they are equally true or should be given equal weight. Not at all.
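To make the agreement idea concrete, here is a minimal sketch in Python. Everything in it is a hypothetical placeholder: the source names, the claims, and the weights are invented for illustration and are not actual assessments of any book. It assumes each source either affirms or denies a claim and that every weight is positive.

```python
from collections import defaultdict

def agreement_confidence(sources, weights):
    """Return, per claim, the weighted share of sources that affirm it.

    Sources that say nothing about a claim are ignored for that claim.
    Assumes all weights are positive.
    """
    support = defaultdict(float)  # total weight of sources affirming each claim
    total = defaultdict(float)    # total weight of sources addressing each claim
    for name, claims in sources.items():
        w = weights[name]
        for claim, affirmed in claims.items():
            total[claim] += w
            if affirmed:
                support[claim] += w
    return {claim: support[claim] / total[claim] for claim in total}

# Hypothetical placeholders, not real data about any text.
sources = {
    "source_1": {"claim_A": True, "claim_B": True},
    "source_2": {"claim_A": True, "claim_B": False},
    "source_3": {"claim_A": True, "claim_B": False},
}
# Unequal weights: agreement matters, but sources need not count equally.
weights = {"source_1": 2.0, "source_2": 1.0, "source_3": 1.0}

confidence = agreement_confidence(sources, weights)
print(confidence)  # claim_A scores 1.0 (all agree), claim_B scores 0.5 (contested)
```

The claim on which every source agrees scores highest, while the contested claim scores lower. Because the weights differ, the sources are not treated as equally reliable.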

It's impossible to read every book ever written about God, many of which contain mostly noise. My solution, as with other types of data, is to start with those that are the most accurate (according to history & archaeology), ancient, widely accepted, and relevant, then add more. Using my estimations, that usually means starting with the Torah and Nevi'im ("Law and Prophets" -- which also apparently were Jesus' primary written data sources). They are the most widely accepted and ancient, and they make claims of divine inspiration that can be scientifically tested. If there's room for more data, I then add the Ketuvim ("Writings"), Apocrypha, New Testament, and Mishnah. Then the Essene writings, Jewish Pseudepigrapha, early Jewish writers (Philo, Josephus, Targumim, etc.), Ante-Nicene Fathers, and Gemara. Beyond those, I think the data gets very noisy but still has value in some cases.
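The "start with the best, then add more if there's room" approach can be sketched as a simple ranking. This is only a rough illustration under stated assumptions: the four criteria are the ones named above, but the equal-weight average, the `size` field, the `capacity` limit, and all of the numbers are hypothetical and do not represent actual scores for any book.

```python
from dataclasses import dataclass

@dataclass
class Source:
    name: str
    accuracy: float    # agreement with history & archaeology, 0 to 1 (hypothetical)
    antiquity: float   # how ancient, 0 to 1 (hypothetical)
    acceptance: float  # how widely accepted, 0 to 1 (hypothetical)
    relevance: float   # 0 to 1 (hypothetical)
    size: int          # reading cost in arbitrary units (hypothetical)

    def score(self) -> float:
        # Equal weighting of the four criteria is an assumption of this sketch.
        return (self.accuracy + self.antiquity + self.acceptance + self.relevance) / 4

def reading_order(sources, capacity):
    """Take sources from highest to lowest score while they still fit."""
    chosen = []
    remaining = capacity
    for s in sorted(sources, key=lambda s: s.score(), reverse=True):
        if s.size <= remaining:
            chosen.append(s.name)
            remaining -= s.size
    return chosen

# Placeholder sources with invented scores.
candidates = [
    Source("source_Z", 0.2, 0.1, 0.3, 0.4, size=3),
    Source("source_X", 0.9, 0.9, 0.9, 0.9, size=5),
    Source("source_Y", 0.7, 0.6, 0.5, 0.8, size=4),
]

order = reading_order(candidates, capacity=9)
print(order)  # highest-scoring sources first, until capacity runs out
```

With more capacity, the noisier sources are added too rather than excluded outright. That matches the idea that they can still contain a few useful data points.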

This methodology is much different from that of many Christians, Jews, and Muslims, who derive much of their theology from the more recent and less widely accepted books, then interpret (and sometimes translate!!) the Torah and Prophets through the lens of those. That method may still lead to correct theology, but I find it less justifiable from a scientific perspective, and it has a tendency toward circular reasoning.

Thinking of the "Bible" as a collection of data sources also illustrates how unreasonable and unscientific some of the objections to it are. For example, many arguments against belief in God focus on alleged errors and contradictions in the Bible, usually about very insignificant details. Others make a big deal about the rejection of certain non-canonical books and the fact that some canonical books weren't accepted until long after they were supposedly written. Assuming those objections are valid (which is debatable), they're basically equivalent to "a few data points aren't perfect, some of the data contains noise, and you threw out a few data points that maybe you should've kept". In other words, it's like practically every other dataset that scientists rely on.


  1. "... many arguments against Christianity focus on alleged errors and contradictions in the Christian bible, usually about very insignificant details in the gospels" - those arguments aren't necessarily against Christianity, but rather are presented as evidence against the reliability of scriptures as a data source. I don't know your stance on the bible's being the word of a deity, but it seems to me that an omnipotent, omniscient deity would not write (or inspire human authors to write) a history that contained such elements, even "insignificant" ones. Of course, if you accept the bible not as the inerrant word of god, but rather as the work of humans writing about their religion, then these errors and contradictions become not only plausible but a virtual necessity.

    1. Chuck, thanks for the comment. Some books of the Bible don't claim to be God's word, but I don't think it matters anyway. What matters is whether you're handling the data properly or not. Either it's reliable enough to move prior probabilities or it isn't. If a particular book in the Bible has minor errors, I'd give less weight to that particular book (and others by the same author), but I wouldn't throw the baby out with the bath water, just as with any other data source I work with, none of which are perfect. That's why I don't mind having lots of extra books in my "Bible", even some that I know to contain errors, because they still have some useful data points.

  2. Matt,

    In reading your latest entry, the analogy of the subjective map analysis kept coming to mind. Thanks in large part to your blog, I understand how accounts, parables and events in the Bible (whichever books you wish to examine) can be treated as data subject to probabilistic scrutiny. The same goes for a meteorological surface map, which contains a lot of explicit integers (temp, dew point, MSLP, change fields) and estimated numbers (wind barbs and direction to the nearest 5 kt and 5 or 10 deg respectively)--but unlike the Bible, no parables or commandments. :-)

    Compared to something as vast as the Bible, a surface map is ridiculously simple. Nonetheless, I can offer ten duplicates of a Great Plains surface chart to ten highly experienced, knowledgeable researchers and forecasters, ask them to analyze temp, pressure, dew point, and boundaries (fronts, drylines, outflows, etc.), and essentially guarantee you that the results will differ, sometimes in major ways. No two will look precisely the same, with all features in identical positions. Cold, hard numbers--and still only loosely similar subjective analyses! Yet if all or most of those analyses show a cold front or dryline in a particular region, minor placement differences aside, we can be quite confident that it is bona fide. Given that, it amazes me that there can be as much general agreement as there is regarding large parts of the Bible: many hundreds of pages of literature where underlying numbers and sequences are abundant but so are parables, riddles, gaps, (apparent) contradictions and translational differences.

    In short, one of the most hackneyed arguments God-deniers like to make about the Bible--that those uncertainties, missing pieces, apparent errors, and mysteries must somehow nullify the God behind it--could just as easily be used to throw out surface analyses as a valid means of diagnosing and understanding part of the atmosphere. I think not. I'll keep analyzing those surface and upper-air charts and examining the Bible, to understand better (if never perfectly) the atmosphere's diagnostic state and God's ways, respectively.

    Other than that, I don't have anything particularly useful to offer at this stage, except that I am deeply intrigued and intensely curious about where this course of thinking (books of Abrahamic religions as data) will lead. Thanks for the insights.

    1. Thanks very much Roger!! I totally agree about the inconsistent standards. And I love your surface analysis analogy. It fits very well in so many ways. I'll add that I'd much rather have ten slightly different analyses than ten that are identical in every way, especially if I'm making a probabilistic forecast.