Theological Data Mining: 2014

Sunday, October 26, 2014

Is God Supernatural?

Christian apologists often criticize atheists for presupposing a naturalistic worldview, thereby making the supernatural (and specifically, God) a virtual impossibility. I disagree with this argument for one of two reasons: A) because God is not necessarily supernatural, or B) atheists do believe in the supernatural. Whether it's A or B depends on the meaning of "supernatural".

I'm not interested in semantic arguments, but I think Merriam-Webster's definitions of "supernatural" helpfully describe ways that people think about God. In reverse order, I'll apply them to the question: "Is God Supernatural?"

Definition #3: "of, relating to, or being God"

God certainly is supernatural by this definition, but I don't think it's a useful one. It says nothing about the key issue, which is the nature of God and his interactions with the world.

Definition #2: "being so extraordinary or abnormal as to suggest powers which violate the laws of nature"

I think this is what many people, especially atheists and fundamentalist Christians, think of when they hear "supernatural". God "miraculously" intervenes in the world such that the laws of nature are violated. After all, God is omnipotent, so he's not bound by such laws. Naturalistic explanations of miracles are seen as denying God's power or avoiding the "plain meaning" of the Bible.

It's problematic if not impossible to find any evidence that the laws of nature have ever been violated. Even if it existed, the prior probability would be so low that the evidence would have to be extraordinarily strong. We don't have such evidence. And if we did, we'd simply modify our understanding of the natural laws. Then the supernatural would still be impossible.

Furthermore, I don't see anything in the Bible that necessarily would've violated natural laws. Wherever God's means are mentioned, it's always something that can be explained within the bounds of natural laws. The flood? Genesis says it rained. Parting of the Red Sea? Exodus says strong east wind. God speaking to prophets? They say visions and dreams. Where the means aren't mentioned (as is usually the case), there's no reason to assume violation of natural laws. Rather, in the absence of very strong evidence to the contrary, we should assume that natural laws weren't violated. To not do so would be to commit the base rate fallacy.

Using Definition #2 and considering the available evidence, I conclude that God (as described in the Bible) probably isn't supernatural. Can't rule it out, but the base rate and evidence suggest a low probability.

Definition #1: "of or relating to an order of existence beyond the visible observable universe"

I think this one is by far the most useful. According to this definition, God is indeed "supernatural". And so are other unobserved things like dark matter, strings (as in string theory), strangelets, preons, photinos, gravitons, life on other planets, other dimensions and universes, etc. These generally are not "gods of the gaps". They are reasonable hypotheses that explain what we can observe and aren't inconsistent with what we know about the laws of nature. I think the same is true of God.

Some things that once were "supernatural" (by Definition #1; e.g., viruses, atoms, distant planets) are no longer supernatural now that we have ways to observe them. Until we find a way to observe God or disprove the God hypothesis, we should carefully and scientifically consider the possibilities. We shouldn't allow our definition of "supernatural" to dictate our understanding of God or deny his existence. Rather, we should strive to have a view of God that best explains the data and is consistent with what we discover about the natural world.

“It is evident that an acquaintance with natural laws means no less than an acquaintance with the mind of God therein expressed.”
- James Prescott Joule, father of thermodynamics

Monday, October 13, 2014

Why So Many Scientists Are Atheists

According to Pew Research, 17% of scientists identify as atheist, compared to only 2% of the general population. 41% of scientists say they don't believe in God or a higher power, compared to only 4% of the general population. Why are so many scientists atheists?

Some say it's because science disproves God or that science and religion are somehow incompatible. Others point out that most were atheists before they ever became scientists, suggesting they pursue science because of their atheism. Both explanations assume a false dichotomy. The true answer probably is very complex, but two factors might explain a lot of it: demographics and personality.

Atheists are predominantly white men. According to Pew, 70% of American atheists are men and 90% are white or Asian, compared to 48% and 74% of the general population, respectively. Scientists [according to NSF] are 72% male and 87% white or Asian, almost identical to atheists. Thus, any random sample of people with the racial and gender makeup of scientists, for that reason alone, should have a higher percentage of atheists than the general population. Among sciences, chemistry and biology employ a relatively high percentage of women -- and, perhaps consequently, a relatively low percentage of atheists.

Personality may explain even more. The Myers-Briggs Type Indicator (MBTI) is probably the most common measure of personality. Though it has several major flaws, it has been used and studied enough to provide useful statistics.

The MBTI type typically associated with scientists is INTJ, whose description closely resembles the stereotypical scientist. Hence, it often is called the "Scientist" type. INTJs generally are analytical and opinionated, and they don't believe things without "cold hard facts" -- a common description of atheists as well. In a survey of 10,627 American atheists, 13.7% were classified as INTJ, compared to only 2.6% of the general population.

A few other personality types are typically associated with scientists. By far the most common of these is ISTJ. ISTJs tend to prefer more practical, applied science than their INTJ cousins. For example, ISTJ has been found to be the most common type among National Weather Service employees. It's also the most common type among atheists. 41.2% of atheists are ISTJ, but only 13.8% of the general population are ISTJ. Thus, a majority (54.9%) of atheists are either ISTJ or INTJ, compared to only 16.5% of the general population.

Personality type can explain the high male-to-female ratio among atheists [and perhaps scientists as well]. Male and female atheists have similar proportions of ISTJ and INTJ (40.8% and 11.7% for female atheists, 41.4% and 14.4% for male atheists). But in the general population these two types are approximately 70% male -- just like atheists and scientists!

Combining demographic and personality data, we can calculate the probability that a random person is an atheist, given basic demographics. For example, starting with a prior probability of 2.0% (the % of atheists in the general population), the probability that a random white male would be an atheist is 3.5%. If the random white male is an INTJ, that probability increases to 14.5%. If we consider random college graduates, it becomes 21.1%.
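The exact equations behind these numbers aren't given here, so the following is only a minimal sketch of one way to do this kind of calculation: a naive Bayes update in odds form, assuming the traits are conditionally independent. The function names are hypothetical, and the inputs are the Pew figures quoted above (which group "white or Asian" together). It covers only the first step (gender and race), because the inputs for the personality and education steps aren't fully specified.

```python
def likelihood_ratio(p_trait_given_atheist, p_trait_overall, p_atheist):
    """P(trait | atheist) / P(trait | not atheist).

    P(trait | not atheist) is backed out of the overall rate using the
    law of total probability.
    """
    p_trait_given_other = (
        p_trait_overall - p_atheist * p_trait_given_atheist
    ) / (1 - p_atheist)
    return p_trait_given_atheist / p_trait_given_other


def update(prior, ratios):
    """Multiply prior odds by each likelihood ratio, then convert back
    to a probability. Assumes the traits are conditionally independent."""
    odds = prior / (1 - prior)
    for ratio in ratios:
        odds *= ratio
    return odds / (1 + odds)


PRIOR = 0.02  # share of atheists in the general population (Pew, as quoted)

male = likelihood_ratio(0.70, 0.48, PRIOR)            # 70% of atheists vs. 48% overall
white_or_asian = likelihood_ratio(0.90, 0.74, PRIOR)  # 90% of atheists vs. 74% overall

p_atheist = update(PRIOR, [male, white_or_asian])
print(f"{p_atheist:.1%}")
```

With these inputs the result lands close to the 3.5% quoted above. Adding more traits just means adding more likelihood ratios to the list, with the same independence caveat.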

Extending the calculations to a random group of college graduates (and post-graduates, in parentheses) with the same race and gender makeup as scientists [72% male, 69% white, 18% Asian, etc.], the following percentages are expected to identify as atheist:

INTJ: 20.2% (23.7%)
ISTJ: 11.4% (13.7%)
INTP: 7.4% (9.0%)
ENTJ: 6.0% (7.3%)
INFJ: 5.5% (6.7%)
ISTP: 4.6% (5.6%)
ESTJ: 3.6% (4.4%)
ENTP: 2.8% (3.4%)
ESTP: 2.0% (2.5%)
INFP: 1.7% (2.1%)
ENFJ: 1.5% (1.9%)
ISFJ: 1.5% (1.8%)
ISFP: 0.6% (0.8%)
ENFP: 0.6% (0.7%)
ESFJ: 0.4% (0.5%)
ESFP: 0.3% (0.4%)

Thus, if scientists are predominantly INTJ and ISTJ, the 17% who are atheists is similar to what would be expected from a random sample of people with those personality types and similar basic demographics.

Of course, there are much deeper factors than what these crude statistics represent. Scientists are quite diverse in ways that aren't accounted for here. Not all are INTJ or ISTJ, including myself (an INFJ), and I couldn't find any statistics about that. Correlation doesn't imply causation, and these variables probably aren't completely independent as the equations assume. However, unlike the "science and religion are incompatible" explanation, this one at least has some science to support it.

Saturday, July 5, 2014

Data Sources: Which Books Belong in the Bible?

Last weekend I saw the Book of Mormon musical and it reminded me of a joke I heard on a Jewish radio show: "Why did God create Mormons?" ... "So that Christians could understand how Jews feel." The joke implies that the New Testament is analogous to the Book of Mormon, which most Christians reject. Ironically, modern Judaism has its own "new testament", the Talmud, which also is analogous in some ways. In fact, many of the differences between various religions can be attributed to differences in the holy books they consider authoritative (i.e., their "bible" canon).

So which books belong in the Bible?

I think that is the wrong question to ask, and I think it comes from the natural (but often irrational) human desire for certainty, facilitated by concrete, black-and-white category distinctions. The problem is that even if the writings themselves are divinely inspired, inerrant, and infallible, our ability to recognize and classify them as such is not. Thus, instead of regarding particular books as part of an authoritative canon, I think it's more useful to regard all of them as data sources and treat them as such.

Treating them like any other data sources, my answer to "Which books belong in the Bible?" is "as many as can practically fit". That could include the Tanach ("Old Testament"), Apocrypha, New Testament, Talmud, Gnostic writings, Qur'an, Book of Mormon, Bhagavad Gita, Tripitaka, and many others. It includes some that are very accurate and useful, some that are spurious and useless, and some that are largely unreliable yet contain a few useful data points. In other words, it's a lot like the data sources used by scientists (e.g., for things like weather prediction).

Data doesn't have to be perfect to be useful, especially for probabilistic beliefs. Even datasets that partially contradict each other can have value. For example, the New Testament, Talmud, and Qur'an all agree on some things and disagree on others. Thus, we can have a relatively high level of confidence in beliefs & doctrines on which they all agree, and lower confidence in beliefs where they contradict each other. Of course, that in no way implies that they are equally true or should be given equal weight. Not at all.

It's impossible to read every book ever written about God, many of which contain mostly noise. My solution, as with other types of data, is to start with those that are the most accurate (according to history & archaeology), ancient, widely accepted, and relevant, then add more. Using my estimations, that usually means starting with the Torah and Nevi'im ("Law and Prophets" -- which also apparently were Jesus' primary written data sources). They are the most widely accepted and ancient, and they make claims of divine inspiration that can be scientifically tested. If there's room for more data, I then add the Ketuvim ("Writings"), Apocrypha, New Testament, and Mishnah. Then the Essene writings, Jewish Pseudepigrapha, early Jewish writers (Philo, Josephus, Targumim, etc.), Ante-Nicene Fathers, and Gemara. Beyond those, I think the data gets very noisy but still has value in some cases.

This methodology is much different from that of many Christians, Jews, and Muslims, who derive much of their theology from the more recent and less widely accepted books, then interpret (and sometimes translate!!) the Torah and Prophets through the lens of those. That method may still lead to correct theology, but I find it less justifiable from a scientific perspective, and it has a tendency toward circular reasoning.

Thinking of the "Bible" as a collection of data sources also illustrates how unreasonable and unscientific some of the objections to it are. For example, many arguments against belief in God focus on alleged errors and contradictions in the Bible, usually about very insignificant details. Others make a big deal about the rejection of certain non-canonical books and the fact that some canonical books weren't accepted until long after they were supposedly written. Assuming those objections are valid (which is debatable), they're basically equivalent to "a few data points aren't perfect, some of the data contains noise, and you threw out a few data points that maybe you should've kept". In other words, it's like practically every other dataset that scientists rely on.

Saturday, May 17, 2014

Faith and the Overconfidence Effect

I once heard that if you're not 100% sure (about God), you're 100% lost. I couldn't disagree more. I would rephrase it like this: If you are 100% sure, you don't have faith.

Faith does not mean "belief without evidence", but it does imply uncertainty. A good definition of faith is "confident trust despite uncertainty". If you're 100% sure because you have absolute knowledge, you have no need for faith. If you're 100% sure but don't have absolute knowledge, you're self-deluded. If you're just a little less than 100% sure, you're probably under the influence of a pernicious cognitive bias: underestimation of uncertainty, also known as the Overconfidence Effect.

The Overconfidence Effect is a pervasive and well-documented human bias where the level of certainty in one's beliefs is usually much higher than the accuracy of those beliefs. It has been studied by asking people to answer questions (e.g., the spelling of difficult words) and then asking how sure they are that each answer is correct. Those studies found that when people were "100% sure", they were wrong approximately 20% of the time. When 99% sure, they were wrong 40% of the time, and when 90% sure, they were wrong approximately 50% of the time. That should put human certainty into perspective!

I think even 90% is unreasonable for theological beliefs, despite the high certainty that so many believers and atheists seem to have. 90% certainty implies 90% probability. Starting with the principle of indifference, a 90% probability that God exists (or doesn't exist) would require very strong evidence. Though I think there is solid evidence for the God of the Bible, I haven't seen enough for 90% certainty either way. Certainty that a particular religion or systematic theology is the "correct one" would require that plus a lot more. I haven't seen it yet, but that's no reason not to have faith in whichever of the available possibilities is most probable according to the evidence we do have.

I've heard many times, "If God wants us to believe in him, why didn't he give us more evidence?" I think the question totally misunderstands faith and the Bible's message about it. Faith does not mean believing God exists. The God of the Bible really didn't seem to care that people believed he existed. What mattered was whether they trusted his promises and lived in a way that reflected confidence (despite uncertainty) that he would be faithful to those promises. That's very different from the alternatives, such as the faith of atheism (i.e., living in a way that reflects confidence that there is no God and thus no divine promises to be fulfilled). If there were sufficient evidence (or philosophical arguments) to know without any doubt, such choices would mean very little.

Whether we admit it or not, we all have faith because we all make decisions amid uncertainty. Uncertainty isn't such a bad thing. It makes us more humble about our beliefs and more respectful of the beliefs of others, which makes us more open to the truth, in the (very likely) case that we aren't totally correct about everything we believe. Uncertainty also makes faith a lot more meaningful.

Sunday, April 27, 2014

Extraordinary Claims and the Principle of Indifference

You've probably heard the saying "Extraordinary claims require extraordinary evidence." It's the starting point for perhaps the most common argument by atheists: "The existence of God is an extraordinary claim that lacks extraordinary evidence." Seems logical, right? The only problem is, how can we determine whether the evidence (or the claim) really is extraordinary?

One common definition of extraordinary is "very unusual". But the claim that God exists isn't unusual. By that definition, "There is no God" would be a more extraordinary claim. But that standard doesn't always make sense. For example, someone could make the very unusual claim that I'm currently wearing three socks, but most people wouldn't require extraordinary evidence to be convinced. Another common meaning of "extraordinary" is "very remarkable or amazing". That one brings us right back to the original problem: How can we determine how remarkable or amazing a claim is? There are other definitions of "extraordinary" but all are similarly problematic.

A much more scientific way to formulate "Extraordinary claims require extraordinary evidence" is via Bayes' theorem. In Bayesian terms, an extraordinary claim is a hypothesis with a very low prior probability (e.g., “a coin flipped 5 times will land on tails every time”, which has a prior probability of around 3%). It follows that very strong evidence is required to move the probability high enough to believe the claim. Thus, it can be shown mathematically that extraordinary claims (defined this way) do in fact require extraordinary evidence. In the above example, that evidence could be a measurement that the coin's weight is very unbalanced or an observation that it has tails on both sides.
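To make the arithmetic concrete, here is a minimal sketch of Bayes' theorem in odds form. The function names and the 95% belief threshold are assumptions chosen for illustration. Only the coin-flip prior comes from the example above.

```python
def posterior(prior, bayes_factor):
    """Posterior probability after evidence with the given Bayes factor,
    i.e. P(evidence | claim true) / P(evidence | claim false)."""
    odds = prior / (1 - prior) * bayes_factor
    return odds / (1 + odds)


def required_bayes_factor(prior, target):
    """How strong the evidence must be to move `prior` up to `target`."""
    return (target / (1 - target)) / (prior / (1 - prior))


# Prior for "a coin flipped 5 times will land on tails every time".
prior_five_tails = 0.5 ** 5  # 1/32, about 3%

# Evidence strength needed to reach 95% belief (threshold is an assumption).
needed_low_prior = required_bayes_factor(prior_five_tails, 0.95)  # about 589
needed_even_prior = required_bayes_factor(0.5, 0.95)              # about 19

print(round(needed_low_prior), round(needed_even_prior))
```

The lower the prior, the larger the Bayes factor needed to reach the same level of belief, which is the mathematical content of "extraordinary claims require extraordinary evidence".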

Applying that framework to the God claim, the strength of evidence required depends on a priori assumptions about the prior probability that God exists. Theists who start with a relatively high prior probability require less evidence. Atheists who start with a low prior require more evidence. Arguments about the sufficiency of the evidence for God become circular on both sides. Thus, it's imperative that we have a good, objective way to determine the prior probability.

Because we don't have specific, definite probabilistic information about the God question, we must use an uninformative prior. The simplest and probably most common of these is the principle of indifference, which says the prior probabilities of all hypotheses are equal. In the binary case of “Does God exist?”, the prior is 50%. Starting with a 50% probability may seem crazy if the claim seems ridiculous, but it makes good sense mathematically. The evidence (or lack thereof) is probably what makes such claims seem ridiculous in the first place, and the other terms in Bayes' rule account for that. Also, if the claim seems ridiculous to most people, that fact alone is evidence that would reduce the probability.
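As a rough sketch of how that works, the code below assigns equal priors to every hypothesis and then updates them. The function names are hypothetical, and the likelihood values are placeholders with no empirical meaning, chosen only to show the mechanics.

```python
def indifference_prior(hypotheses):
    """Principle of indifference: every hypothesis gets the same prior."""
    return {h: 1 / len(hypotheses) for h in hypotheses}


def bayes_update(prior, likelihoods):
    """Multiply each prior by P(evidence | hypothesis), then normalize."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


prior = indifference_prior(["God exists", "God does not exist"])

# Placeholder likelihoods: P(evidence | hypothesis). Not real estimates.
likelihoods = {"God exists": 0.3, "God does not exist": 0.1}

result = bayes_update(prior, likelihoods)
print(result)
```

Because the priors are equal, they cancel out in the normalization, and the posterior depends only on the relative likelihoods of the evidence. Swapping the two placeholder likelihoods swaps the conclusion.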

Using the principle of indifference, presuppositions about the probability of God's existence are eliminated as determining factors. The estimate of the probability that God exists now depends entirely on the evidence. In this case, “Extraordinary claims require extraordinary evidence” is a meaningless argument. It doesn't matter how extraordinary the claim is because the evidence will tell us whether to believe it. We'll still argue about the evidence and how to assign probabilities to it, but that's a lot more useful than debating a theist's circular argument vs. an atheist's circular argument.

There are other ways to determine uninformative priors, including some that let us use the “extraordinary claims” standard. But when applied to the God claim, they generally require arbitrary assumptions that lead to self-fulfilling conclusions. That might be good enough for testing the claim that I'm wearing three socks right now, but whether or not to believe in God is a much more important question – one that I don't think should be decided (either way) by arbitrary assumptions made before examining the evidence.

Saturday, April 12, 2014

Free Will in the Bible: Overfitting + Confirmation Bias

A major theme of this blog is that we shouldn't force data to answer questions it doesn't actually answer. Overfitting and confirmation bias can have an insidious synergy. I believe the debate over free will in the Bible is one such example. Before I discuss it, I need to define it, because there are two types of free will that people often confuse:

Free will in the legal sense: freedom to make voluntary choices without coercion. In other words, freedom to choose what we want to choose.
Free will in the philosophical sense: the ability to make choices that aren't determined by prior causes. In other words, what we want to choose might be influenced by God, genetics, environment, etc. but isn't completely determined by them.

Another important term is "determinism", which is the idea that all events are caused by prior events or conditions.

Despite some caricatures I've heard, practically everyone agrees that we have free will in the legal sense, so when I say "free will" without a qualifier I'm referring to the philosophical sense. There are 4 main philosophical views of free will:

Hard Determinism: everything happens as a result of what happened before it. Free will is impossible because what we want to choose is determined by prior events & conditions.
Libertarianism: the universe is not deterministic. If it were, we wouldn't have free will. It is possible to make choices that are not determined by prior events & conditions.
Compatibilism: the universe is deterministic but we have free will. Free will only makes sense using the legal definition and it's pointless to talk about philosophical free will.
Hard Incompatibilism: whether the universe is deterministic or not, we wouldn't have free will either way.

People have debated free will for millennia and Bible-believers are no exception. According to Josephus, first century Jews were divided about it. The Essenes were hard determinists who believed that everything was determined by divine fate. The Sadducees were libertarians who denied divine fate and affirmed free will. The Pharisees' view was most similar to Compatibilism. They believed in divine fate for world events but also affirmed free will, particularly in spiritual matters, and their definition of it was more like the legal sense.

Many Christians today are either compatibilist (i.e., Calvinists) or libertarian (i.e., Arminians). Thanks to confirmation bias, it's not surprising that both believe the Bible clearly teaches their view. As readers of this blog might've guessed, I don't believe the Bible writers tried to settle this philosophical debate, so any such interpretation is overfitting. What the Bible does clearly teach is that at least some events are pre-ordained by God and that we make free choices (i.e., we have free will in the legal sense). Those teachings are consistent with all 4 views. Attempts at Bible interpretation on the topic of philosophical free will quickly abandon the original context and inevitably enter the realm of philosophy.

My biases make Hard Determinism (and Compatibilism, which I think is Hard Determinism that's afraid to admit it) very attractive to me. Weather is deterministic, and I like to think everything behaves similarly to weather -- maybe because it makes me feel like I have expertise in areas in which I really don't. I see a lot of beauty in deterministic systems, and Chaos Theory provides an excellent answer for why some things appear random or "free". Hard Determinism also is an attractive solution to the problem of evil. If God causes evil, it means evil has a purpose -- a greater good. God doesn't helplessly watch, wishing things were different. Hard Determinism allows for truly divine miracles that don't violate the fundamental laws of nature, demonstrating harmonious consistency in God's interaction with the world. Biological evolution also fits very nicely. And I can feel good when reading the many Bible passages that clearly imply determinism.

I do believe it's the view that is most consistent with the Bible (sorry, Arminian friends), but I must admit that my view is totally based on philosophy, science, and personal bias, not the Bible (sorry, Calvinist friends). What makes me doubt my view is not Libertarian proof texts in the Bible (I have answers for all of them, though not without confirmation bias). What really gives me doubt is quantum mechanics. The more I learn about it, the more I see Hard Incompatibilism and Libertarianism as interesting possibilities.

It's fun to talk about the free will debate as it relates to the God of the Bible. I think the debate would be a better one if we all could admit that it is in fact a philosophical (and perhaps scientific) debate -- one in which the writers of the Bible were not participating.

Sunday, March 30, 2014

The Religiosity of Bigfoot Believers

According to a Gallup survey of over 1700 random people, approximately 17% of the U.S. population believes that creatures such as Bigfoot and the Loch Ness Monster will eventually be discovered by science. Belief in Bigfoot is an interesting way to look at religion, because it is essentially independent of any religious teachings. Bigfoot isn't supernatural. People generally don't believe in Bigfoot because of their religion, and they don't choose their religion according to their belief in Bigfoot.

Compared to people who don't believe in Bigfoot (whom I'll call "non-believers"), Bigfoot believers tend to have slightly lower income, slightly less education, and slightly more liberal political ideology, but the differences are fairly small. As one would expect, people who believe in Bigfoot (blue bars) are much more likely than Bigfoot non-believers (tan bars) to believe in a wide variety of things, including some that are supernatural.

Thus, one might also expect that Bigfoot believers would be more likely to believe in God and to be more religious in general. That's partly true. 91% of Bigfoot believers believe in God, compared to 87% of Bigfoot non-believers. They also are slightly more likely to believe in Heaven, Hell, angels, demons, and that Jesus is the Son of God. However, according to a wide variety of metrics, Bigfoot believers are substantially less religiously devout than Bigfoot non-believers.

Despite being slightly more likely to believe that God exists and that Jesus is his son, Bigfoot believers are substantially less likely to identify as Bible believing, born again, evangelical, and fundamentalist than people who don't believe in Bigfoot. They attend religious services, religious education, and prayer meetings less often. They also pray and read religious texts substantially less often than people who don't believe in Bigfoot.

Some people say that religious people are religious because they are gullible and willing to believe things for which there is no compelling scientific evidence. Whether that's true or not may depend on whether Bigfoot is real.

Sunday, March 16, 2014

The Gospel According to a Map

Has the messiah come? Christians say yes, and they usually use the gospels to "prove" it, showing that Jesus fulfilled messianic prophecies. But many of those were about relatively minor details (e.g., where he'd be born) that could've been fabricated by the gospel writers. There were, however, much bigger messianic prophecies, and we can verify them without using holy books of any religion.

Back around 730 BC, God's people were divided: Israel in the north, Judah in the south. Israel was recently conquered and exiled by Assyria, and Judah was headed toward a similar fate via Babylon. Only a few people in Judah believed in God, even fewer in Israel, and practically nobody in the rest of the world. Other nations didn't care about Israel's God, because each one had its own gods. Israel and Judah were reviled and were essentially irrelevant in the world.

The prophet Isaiah offered hope to his people by telling them a king ("messiah") would come and establish a "kingdom" of unprecedented size and strength. Other nations would become followers of Israel's messiah and he would be a moral authority to them (Isa. 2:3). Through him, "the earth would be filled with the knowledge of Israel's God, as the waters cover the sea" (Isa. 11:9). Similar predictions were echoed by other prophets over the next couple centuries, but to no avail. Jerusalem was destroyed in 586 BC and its people exiled. The region was later conquered by Cyrus (Persians) in 539, Alexander the Great (Greeks) in 332, and finally by Pompey (Romans) in 63 BC. It probably seemed like the biblical prophecies would never be fulfilled.

Over 2000 years later, the world looks a lot different. Most of the popular gods of the ancient world (e.g., Baal, Dagon, El, Molech, Asherah, Osiris, Isis, Chemosh, Hadad, Artemis, Zeus, and Caesar) are no longer worshipped. Others are mostly confined to particular regions. But there is one glaring exception. According to polls by Pew Research, the majority of people in the world (55%, including 81% outside of China & India) are, at least nominally, followers of the God of Israel. The following map shows countries (in blue) where the majority of the adult population professes Judaism, Christianity, or Islam as their religion.

As someone who makes over 600,000 weather predictions every day, I know well that correct predictions aren't necessarily evidence of divine revelation. But I also know that consistently accurate predictions always are based on analysis of past data, accurate assessment of current conditions and/or recent trends, a correct understanding of how the universe works, or some combination thereof. These explain how meteorologists can make (somewhat) accurate predictions of future weather, futurists and science fiction writers can predict future inventions, and political analysts can (sometimes) predict the next president. But they don't explain the messianic prophecies.

There is nothing to suggest that the unprecedented events that were prophesied were logical inferences from the available data at the time. Quite the contrary! The data pointed much more toward Israel and Judah being destroyed like most of their neighbors and their God ending up like Baal, Chemosh, Asherah, and the others, as minor footnotes in history.

We don't need to take the Bible's word for it. These are well-attested historical facts, as is the fact that the prophetic books were written long before Jesus was born. It's possible that the messianic prophecies were extremely lucky guesses. Verification of them is not proof of messiahship or divine revelation, and parts of them have not yet been completely fulfilled. But if "evidence" means a body of facts that is more probable if the hypothesis is true than if it is not true, I consider it strong evidence.

Saturday, March 8, 2014

Interpreting the Hebrew Bible with Artificial Intelligence

I often hear the question "Do you interpret the Bible literally or figuratively?" The answer is "both" and "neither", mostly "neither". The Bible contains different genres of writing, which should be interpreted accordingly. They include history, prophecy, poetry/songs, stories, and wisdom literature, to name just a few. Identifying the genre is important, but it can be very subjective. It's also difficult without understanding the original language. Those problems can be solved with machine learning.

I developed an algorithm to interpret the Bible in its original language. I started by writing a Perl script that parses the BHS Hebrew text, removes vowel points, and identifies every word used at least 50 times in the Bible. I also removed stop words (i.e., common irrelevant words such as ani [I], at/atah [you], mah [what], etc.). Keep in mind that in Hebrew some articles & prepositions are prefixes rather than distinct words (e.g., "land" = aretz, "the land" = haaretz, "in the land" = bearetz). The final list included 560 Hebrew words. I calculated the relative frequency of each word (i.e., how often the word is used compared to the other 559 words), then standardized the values. The result was 560 numeric variables, each representing a sufficiently common and sufficiently relevant Hebrew word.
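The original script was written in Perl and isn't shown here. The following is a minimal Python sketch of the same steps, under some assumptions: chapters are already split into word lists, relative frequencies are computed per chapter, and vowel points are identified as Unicode nonspacing marks. All function names and the toy transliterated data are hypothetical, and prefix handling for articles and prepositions is left out.

```python
import unicodedata
from collections import Counter
from statistics import mean, pstdev


def strip_points(text):
    """Drop Hebrew vowel points and cantillation marks (Unicode category Mn)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


def vocabulary(chapters, min_count, stop_words):
    """Words used at least min_count times overall, minus stop words."""
    totals = Counter(word for words in chapters.values() for word in words)
    return sorted(
        w for w, n in totals.items() if n >= min_count and w not in stop_words
    )


def relative_frequencies(words, vocab):
    """Share of each vocabulary word among the vocabulary words in one chapter."""
    counts = Counter(w for w in words if w in vocab)
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in vocab]


def standardize(rows):
    """Z-score each column (word) across all rows (chapters)."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sds = [pstdev(col) or 1.0 for col in columns]  # avoid dividing by zero
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


# Toy data with transliterated words; the real input would be the BHS text.
chapters = {
    "chapter_a": ["melekh", "aretz", "ani", "melekh"],
    "chapter_b": ["aretz", "shir", "ani", "shir"],
}
vocab = vocabulary(chapters, min_count=2, stop_words={"ani"})
rows = [relative_frequencies(words, vocab) for words in chapters.values()]
features = standardize(rows)
```

In the real analysis, `min_count` would be 50 and the resulting matrix would have one row per chapter and 560 columns.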

560 variables is too many to easily work with, so I used Principal Component Analysis to reduce it to a few manageable variables, each of which was a linear combination of the standardized relative frequencies of all 560 words. To understand what the principal components mean, I plotted chapters of books of the Bible with obvious/known genres: History (e.g., 1 & 2 Chronicles), Prophecy (e.g., Isaiah), and Wisdom Literature (e.g., Proverbs). Each dot on the graph represents a chapter where the genre of the book (though not necessarily of the chapter) is known.
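To make "linear combination" concrete, here is a minimal sketch that finds only the first principal component of a tiny table. It uses power iteration on the covariance matrix, which is an assumption chosen to keep the example dependency-free and not necessarily how the components in the post were computed. The data values are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit-length loadings, per-row scores) for the first component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered columns.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # A score is the linear combination of a row's values with the loadings.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Hypothetical standardized frequencies: 4 chapters x 3 words.
data = [
    [1.2, 0.9, -1.1],
    [0.8, 1.1, -0.9],
    [-1.0, -0.8, 1.2],
    [-1.0, -1.2, 0.8],
]
loadings, scores = first_principal_component(data)
```

The first two rows land on one side of zero and the last two on the other, which is the kind of separation the plots rely on.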

The first two principal components do an excellent job of separating the books of different genres! The first (PC1) seems to indicate how historical vs. poetic a chapter is, with lower values being more historical. The lowest value (-14.8) is for 2 Chronicles 27, a very historical chapter detailing the reign of king Jotham. The highest value (4.5) is for Psalm 21, a very poetic song. PC2 measures another dimension that (at least in theory) is not related to how historical/poetic a book is. It does a great job of distinguishing between prophecy and wisdom literature. The big outlier among the prophetic books (red triangle on the left side of the blue cluster, at PC1=-9.3, PC2=0.9) happens to be Jeremiah 52, which is a very historical chapter despite being in a prophetic book.

PC1 and PC2 also were calculated for entire books and for chapters/books of unknown genres. Those can be plotted on the same graph to visualize how similar they are to the known genres. For example:

For a more quantitative genre classification, I built a Logistic Regression model using the first 6 principal components. The model estimates the probability that a writing belongs to one of the three broad genres, assuming those are the only three options. As an example, I applied it to each chapter of Genesis and plotted the output below:
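The "only three options" assumption corresponds to probabilities that sum to 1 across the genres. A minimal sketch of that idea follows, assuming a multinomial (softmax) form of logistic regression, which the post does not specify. The coefficients are invented for illustration and use 2 components rather than 6, so the output says nothing about the real model.

```python
import math

def genre_probabilities(pcs, weights, biases):
    """Softmax over one linear score per genre; the probabilities sum to 1."""
    scores = {g: biases[g] + sum(w * x for w, x in zip(weights[g], pcs))
              for g in weights}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {g: math.exp(s - top) for g, s in scores.items()}
    total = sum(exps.values())
    return {g: e / total for g, e in exps.items()}

# Hypothetical coefficients, not fitted values.
weights = {"history": [-0.8, 0.0], "prophecy": [0.3, 0.9], "wisdom": [0.3, -0.9]}
biases = {"history": 0.0, "prophecy": 0.0, "wisdom": 0.0}

# PC1 and PC2 values quoted in the post for Jeremiah 52.
probs = genre_probabilities([-9.3, 0.9], weights, biases)
```

With these made-up weights a strongly negative PC1 pushes nearly all of the probability toward "history".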

According to the model, chapters 1, 3, and 15 are by far the least historical, which might disappoint some who interpret Genesis 1 as a scientific or historical narrative. The biggest outlier, however, is chapter 15, which the model rated as very likely prophetic. Indeed, chapter 15 is about God's covenant with Abram and includes several prophecies about the future.

K-Means Clusters, Hebrew Bible

Classification into these broad genres is only the beginning. If other genres, writing styles, authors, topics, etc. can be identified, another model could easily be built to classify writings according to those, using the same principal components calculated here. If none of those things are known, Cluster analysis can be used to identify writings that have various features in common (see example on the right).

My plan (if I ever get enough free time) is to set up a web page where anyone can easily get the classification values for each chapter of each book. We may never get to a point where computers and algorithms can accurately interpret the Bible for us, but they certainly can be helpful.

Saturday, March 1, 2014

Confirmation Bias: The Bible on Gay Marriage

My favorite desert was the Sahara

When I was about 7 years old I loved deserts. I loved them so much that I wanted to make my own. I found a patch of dirt in the front yard, planted a cactus there, and called it my desert. I wanted to prove that it was a real desert. I knew deserts were hot, so I took a little key chain thermometer, put it in my desert, and left it there for a while in direct sunlight. The thermometer said 120 degrees (F), which would've been a new all-time record high for San Jose, California. I was so excited! I knew it felt more like 80 degrees, and I kinda knew that thermometers aren't supposed to be placed in direct sunlight, but it didn't matter. I had proof that my desert was a real desert!

As a child, I had already mastered confirmation bias. Confirmation bias is the tendency to seek out and accept evidence that confirms what we already believe or want to believe, and to reject evidence that contradicts those beliefs. It's so powerful that when participants in controlled experiments are given fabricated evidence specifically designed to disprove their beliefs (without being told it's fake), they often interpret it as actually supporting their beliefs. Bible interpretation is rife with confirmation bias. One example is the debate over same-sex marriage.

Many opponents of gay marriage point to commandments like Leviticus 18:22 & 20:13, which say a man should not "lie" with a man as one would with a woman, as proof that homosexuality is wrong and thus gay marriage should be illegal. Some mention the fact that God created humans male and female and suggest that God defines marriage as between one man and one woman.

Bible-believing proponents of gay marriage have found convenient ways around biblical commandments, such as pointing out commandments against wearing clothes of different fabrics and planting fields with different seeds (both in Leviticus 19) or about eating certain foods like shellfish. Others find support for gay marriage in the words of Jesus. Some even suggest the close relationship between David and Jonathan was more than just friendship.

All of these are examples of overfitting and confirmation bias, which often reinforce each other. People have strong opinions about homosexuality and gay marriage for cultural reasons. That was true before the Bible was written. Currently, the best predictor of one's opinion about gay marriage is age. According to recent polls, 70% of Americans aged 18-29 support it, compared to only 41% who are 65 and older. Even the majority of Republicans aged 18-29 support gay marriage, a higher percentage than Democrats aged 65+. At least half of young evangelical Christians also support same-sex marriage, similar to older non-religious people.

As a strong supporter of gay marriage, I'll inevitably be accused of confirmation bias here, which I can't deny. Nevertheless, I'll try my best to avoid it while answering some common questions.

What does the Bible say about gay marriage?
Nothing. Gay marriage wasn't an issue when/where the Bible was written, so it wasn't directly addressed.

What does the Bible say about homosexuality?
Specifically, nothing. The prohibitions in Leviticus are of a very specific act, one that should not be equated with homosexuality or even practicing homosexuality. Those prohibitions had a very practical purpose at the time (i.e., longevity in the new land), as stated in the same chapters. Whether they still apply is debatable, but it's a different debate. Other mentions of same-sex activity in the Bible involve rape or prostitution, which are similarly condemned for heterosexuals. Even if the Bible did teach that homosexuality is wrong, that has never been the standard for legislation. Idolatry and blasphemy, for example, are clearly wrong according to the Bible, but almost nobody sees the First Amendment as an attack on biblical morality.

What about fabrics, seeds, and shellfish?
In my opinion, this is a bad and unnecessary argument for same-sex marriage. There's a major difference between these and the commandments given in Leviticus 18 & 20. Leviticus 18 was directed at both Israelites and foreigners (Lev. 18:26), but the next chapter (the one that prohibits different fabrics and seeds) was directed only at Israelites (Lev. 19:1), as were the dietary laws. This is an important distinction to Jews, who generally believe that non-Jews are only required to follow the 7 Laws of Noah. Those include the prohibition of sexual immorality (Lev. 18), because it also applied to "foreigners", but not the ceremonial and dietary laws or those in Lev. 19. A similar belief was evident among early Christians (see Acts 15).

Does the Bible define marriage exclusively as one man and one woman?
No. Biblical marriage often was between one man and multiple women. Among the many examples are Abraham, Jacob, David, and probably even Moses. Rather than condemning it, the Bible gives provisions for how to treat multiple wives (e.g., do not neglect your first wife after you get another one; Ex. 21:10). Ironically, some Bible-believers argue that gay marriage is a slippery slope that could lead to legalization of polygamy, perhaps forgetting that the Bible allowed it.

Gay marriage is a controversial political and cultural issue. As much as we'd like the Bible to settle all of our political and cultural debates, its authors had a different purpose. That doesn't mean the Bible is irrelevant or that everything is morally neutral unless there's a specific commandment against it. But we need to be very careful to allow the text to speak for itself rather than using it to confirm our own beliefs. The Bible doesn't give direct commandments about everything, but it does give one (also in Leviticus) that summarizes all of them: "love your neighbor as yourself." That one sometimes is forgotten by people on both sides of the gay marriage debate.

Sunday, February 23, 2014

Verifying Torah Model Predictions

Bad models can match the data very well, but good models make accurate predictions. A model's accuracy can be tested by comparing its predictions to an independent set of observations. Some of the Bible's predictions are testable.

The main theme of the Torah is the covenant between God and Israel. Israel is given 10 Commandments and other statutes & ordinances, mostly civil and ceremonial laws. They were told that if they (as a nation) followed them, God would make them thrive in their new land. "Keep his statutes and commandments ... so that it may go well with you and your descendants and that you may enjoy longevity in the land..." (Deut 4:40). If they didn't follow them, the opposite would happen.

Contrary to popular belief, they were not moral rules that one had to follow in order to go to heaven. In fact, an afterlife is never mentioned. Most were practical moral teachings, health and safety regulations, and criminal laws conducive to a successful nation. If the promise of the Torah was true, we should expect that on a large scale (not necessarily each individual person), nations that follow the Torah will thrive.

Much of the rest of the Bible is about how things went well for Israel when they followed the Torah and went badly for them when they didn't. But the Bible is not an appropriate source of verification for predictions in the Bible. Instead, I'll turn to much more current and independent data.

Gallup recently measured subjective well-being (i.e., how people feel about their lives) with the Cantril Self-Anchoring Striving Scale. People were rated as thriving, struggling, or suffering, based on their answers to various questions. I combined Gallup's results with religion data from Pew Research. Each religion was analyzed according to its number of adherents in each country and that country's respective well-being statistics. For example, 62% of Canadians were considered "thriving", so 62% of the 350,000 Jews and 62% of the 710,000 Muslims in Canada were counted as "thriving". The totals for each religion were then divided by their respective number of adherents, so each person counted equally.
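The weighting described above can be sketched as follows. The Canada figures are the ones quoted in the post; the second country is hypothetical and exists only to show how two countries combine. Function and field names are illustrative.

```python
def thriving_share(countries, religion):
    """Adherent-weighted share of a religion's population counted as thriving."""
    adherents = sum(c["adherents"].get(religion, 0) for c in countries)
    thriving = sum(c["thriving"] * c["adherents"].get(religion, 0)
                   for c in countries)
    return thriving / adherents

countries = [
    # Figures quoted in the post.
    {"name": "Canada", "thriving": 0.62,
     "adherents": {"Jewish": 350_000, "Muslim": 710_000}},
    # Hypothetical second country, invented only to show the weighting.
    {"name": "Hypothetical B", "thriving": 0.20,
     "adherents": {"Jewish": 50_000, "Muslim": 2_130_000}},
]

jewish = thriving_share(countries, "Jewish")   # 0.5675 with these inputs
muslim = thriving_share(countries, "Muslim")   # 0.305 with these inputs
```

Every person inherits the national rate of their country, so a religion's total is driven by where its adherents live rather than by anything measured about them individually.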

This method assumes that the overall moral/religious values of each country are consistent with the proportions of each religion there, which is appropriate given the national (rather than individual) scope of the prediction.

For quantitative metrics, the Legatum Institute has done extensive studies on global wealth and well-being. The Legatum Prosperity Index uses a large number of variables and breaks the results down into several categories (see www.prosperity.com for details). I standardized the scores for each category.

The results are consistent with the Torah's prediction. The Jewish population had the highest scores in every category except Economy. Christians, who affirm (but somewhat disregard) the Torah, had high scores despite relatively weak economies (mostly in Africa and Latin America), and had a high percentage that were "thriving". Muslims, who accept most of the Torah but reject some of the specifics, had low scores overall but relatively high scores compared to their economies. Populations that reject the Torah (Other and Unaffiliated) had low "thriving" percentages and low scores relative to their economies, though Unaffiliated had high scores overall on the quantitative indicators.

We must be very careful not to read too much into these results. There are many possible explanations for them, and a causal relationship has not been established. Nothing has been proven. However, we can reasonably conclude that these results are more probable if the Torah promise is true than if it is false, which means it is positive evidence. If you start with a prior probability near 0% or 100%, evidence probably doesn't matter to you. But if you have a less extreme prior probability, I think this should move it a little in the positive direction.

Monday, February 17, 2014

Weighing Evidence

How good are you at weighing evidence? Here is a simple test:

Suppose random people are tested for a rare (1 in 50,000) disease and one tests positive. The test is 99% accurate, meaning it gives the correct result 99 times out of 100. To be extra sure, he is tested again with the same 99% accurate test. Its result also is positive. Does he have the disease? How sure are you?

If you're thinking “yes” or that it's very likely, you are interpreting the data irrationally. But you're not alone. Psychologists have found that people generally give too much weight to specific, individuating information and not enough weight to general, less-specific information that we perceive as less relevant. The phenomenon is known as the base rate fallacy or base rate neglect. In the above example, treating the two tests as independent, there's actually only about a 16% chance that he has the disease, but most people are fooled by the 99% accurate tests.
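The arithmetic behind that answer can be checked directly. This is a minimal sketch assuming that "99% accurate" applies equally to sick and healthy people and that the two test results are independent; neither assumption is spelled out in the puzzle.

```python
def posterior_after_positives(prevalence, accuracy, positives):
    """P(disease | every test positive), assuming independent tests."""
    p_if_sick = accuracy ** positives            # every result is a true positive
    p_if_healthy = (1 - accuracy) ** positives   # every result is a false positive
    numerator = p_if_sick * prevalence
    return numerator / (numerator + p_if_healthy * (1 - prevalence))

one_test = posterior_after_positives(1 / 50_000, 0.99, 1)
two_tests = posterior_after_positives(1 / 50_000, 0.99, 2)
```

After one positive result the probability is still around 0.2%. After two it rises to roughly one in six, because the 1-in-50,000 base rate outweighs two 99-to-1 likelihood ratios.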

Base rate neglect is common in theology because there is a universe full of data with apparently little relevance, along with religious books that contain very specific individuating data. Though the Bible says God is revealed through nature, evidence from nature is often neglected in theological discussions.

Many theologically conservative Christians focus entirely on the Bible to answer theological questions. They start with their interpretation of the Bible, then interpret the rest of the universe according to what they believe the Bible says. That method is prone to overfitting, which can make it an insidious form of base rate neglect, even if the Bible is 100% true. On the other hand, many theological liberals start with their interpretation of the universe (i.e., worldview), then force the Bible to conform to it and/or reject the parts that don't. That's the extreme opposite of base rate neglect, which is equally irrational. Both use a heavy dose of circular reasoning.

Ironically, many atheists are on the theological conservatives' side here. They start with an interpretation of the Bible (usually a very conservative fundamentalist one that neglects data from outside the Bible), then compare it to their understanding of the universe and conclude that the Bible is morally objectionable and/or contradicted by science. This approach leads to straw man arguments against religion.

My last post discussed the problem of matching beliefs to irrelevant data. This post says we don't give enough weight to data that seems irrelevant. It's not a contradiction, but it's a fine line -- one that's easy to cross in both directions. I cross it all the time. It's a reason why tools like Bayes' rule are so useful. It's also a reason for all of us to be humble about what we believe and don't believe. Our beliefs may seem to perfectly match the most relevant data, but some of them are probably wrong.

Saturday, February 8, 2014

Overfitting the Bible (the Creation/Evolution Debate)

A few years ago I built a machine learning model to predict changes (up, down, or neutral) in Apple's stock price up to 3 hours in advance. When testing it against a past dataset, the model was correct 80% of the time. Cautiously optimistic that I would soon be a billionaire, I tested it with real-time data. The result: it was correct 50% of the time, no better than flipping a coin. So what went wrong?

My model was overfitting the data. Overfitting is when a model models noise (i.e., irrelevant data) rather than the signal (i.e., the true relationship between the input data and the output). Overfit models can superficially match the data very closely but usually don't make accurate predictions.
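Here is a toy demonstration of that gap, not the stock model itself. It assumes labels that are pure coin flips and a nearest-neighbor "model" that simply memorizes the past data. The seed and sample sizes are arbitrary.

```python
import random

def nearest_label(x, training):
    """1-nearest-neighbor: copy the label of the closest training point."""
    return min(training, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(points, training):
    """Fraction of points whose predicted label matches the true label."""
    hits = sum(nearest_label(x, training) == y for x, y in points)
    return hits / len(points)

rng = random.Random(0)

def make_point():
    # Pure noise: the label is a coin flip unrelated to the feature.
    return (rng.random(), rng.choice(["up", "down"]))

past = [make_point() for _ in range(200)]
live = [make_point() for _ in range(1000)]

past_accuracy = accuracy(past, past)   # each point is its own nearest neighbor
live_accuracy = accuracy(live, past)   # close to a coin flip
```

The model looks perfect on the data it memorized and performs at about chance on new data, because there was never any signal to learn.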

The above example of overfitting relates the maximum wind gust in Norman, Oklahoma to the number of points given up by the Oklahoma Sooners football team during 2013. The red curve fits the data points (red dots) very well but is not an accurate interpretation of them (keep in mind that some of these games were played hundreds of miles away from Norman). Applying it to the 2014 Sugar Bowl gives a 20 point error. [Wind data courtesy of the Oklahoma Mesonet]

Overfitting is common in Bible-based theology. The Bible contains a lot of data, but the issues people debated 3000 years ago were not necessarily the same as those debated today. For example, Tuesday night I watched a debate between Bill Nye and Ken Ham. Ham used the Bible to argue against the theory of Evolution and for a 6,000 year age of the Earth. But was the book of Genesis written to teach biology and geology to scientists 3000 years later? Is the data relevant or was Mr. Ham overfitting it?

Ken Ham never presented an alternative to evolution because the Bible doesn't contain one. Genesis says God created animals and humans but doesn't explain how. It says the first human ultimately came from the soil, which sounds vaguely like evolution to me, but how soil became man is not explained. The entire process is covered in one sentence with no details. Even less is written about how animals were created. The author clearly wasn't trying to answer the same questions that evolution answers. Thus, using Genesis to deny evolution is overfitting the data. Likewise, using the vast evidence for evolution to deny the truthfulness of the Bible also is overfitting it.

The age of the Earth isn't mentioned in Genesis either. Ham infers it from genealogies and an assumption that the first human was created 5 days after the Earth was created. But does the Bible actually teach that?

Much of the argument is about the word "day". The Hebrew word "yom" usually is translated into English as "day", but it also is translated as age, period, time, lifetime, years, always, forever, eternity, and several other words. Hebrew has far fewer words than English, so they tend to have a broader meaning, very dependent on the context. So what does "yom" mean in the context of Genesis 1?

I think it probably means "day", but there's a catch. According to most sources, a "day" was understood by ancient near-eastern people as a cycle of lightness and darkness that the sun happened to follow. It was not, as we think of it now, an abstract measure of time equivalent to 24 hours or one rotation of Earth. Genesis 1 mentions three "days" that have "evening and morning", but before the sun was created on day 4. With the sun not yet existing, there's no reason to assume that a "day" had anything to do with the sun or that it was 24 hours long. There's also nothing to suggest that the meaning of "day" changes halfway through the chapter. I'm assuming here that Genesis is a literal description of creation. If it's anything else, there's even less reason to believe it's 24-hour days.

The Young Earth model closely fits a superficial, anachronistic, English reading of the text. In other words, it fits noise. The true test of a model is how well it makes predictions. Like my Apple stock price model and wind gust football model, the Young Earth model doesn't make accurate predictions. As Bill Nye pointed out, what we observe in nature is very different from what the Young Earth model would predict. But as with evolution, using evidence for an old earth as evidence against the Bible is to make the same mistake as Ken Ham.

Many Bible believers use the Bible to answer questions that the Bible doesn't actually address. Many non-believers find in the Bible absurd and immoral teachings that it doesn't actually teach. Both feel confident because their models fit the data, but both are overfitting it. Always remember that fitting the data is far different from accurately interpreting it.

Saturday, February 1, 2014

Defining God via Principal Component Analysis

Discussing God can be difficult because "God" can mean very different things to different people. It would be nice to define God in a simple way that covers all possibilities, and even better if we could do so mathematically.

I'll start with a few examples. This list is very incomplete but represents data points that have a large variance:

New Age: God is an impersonal life force, the incorporeal formless cosmic order personified within all people and matter.
Fundamentalist: God is a personal, supernatural being who has directly interacted with humanity. Many of his characteristics and desires were known to ancient people and described literally and in detail by ancient texts. He created humans supernaturally and not through evolution.
Impersonal First Cause: Whatever caused the Big Bang is God.
Highly Evolved: God is a result of evolution via natural processes, but for a much longer period of time than humans, reaching a technically finite but practically infinite level of knowledge and ability.
Natural Eternal: God is a personal being who has always existed but does not (or cannot) violate the known laws of nature. He created humans via a natural process of evolution.
Supernatural Omnipotent: God is an omnipotent personal being who is not bound by the known laws of nature. He only interacts with humanity on rare occasions or in indirect ways. He created humans by supernaturally guiding a natural process of evolution.

Now I'll get more quantitative. I rated the example definitions of God on a 0-10 scale for a variety of attributes:

Specificity: how specifically God is defined. 0 = God can be described only in vague generalities at best. 10 = God can be described in very specific detail.
Impact: how much God affects us. 0 = It makes no practical difference whether God exists or not. 10 = God interacts with us and can directly affect many aspects of our lives.
Ability: how powerful and knowledgeable God is. 0 = no ability or knowledge. 10 = omnipotent and omniscient.
Supernaturality: how supernatural God is. 0 = completely bound by natural laws. 10 = often violates known natural laws.
Knowability: how well we can know God. 0 = completely mysterious and distant. 10 = We can have a close personal relationship with God.

It's possible to represent all of these attributes with a single variable by applying Principal Component Analysis (PCA). PCA is a statistical method for reducing the dimensionality of a dataset (i.e., reducing the number of variables needed to represent most of the variance in the data). It combines multiple variables into new variables that are linear combinations of the original variables.

The first principal component is the new (combined) variable that explains the most variance in the data. In this case, it is equal to 0.38*Specificity + 0.46*Impact + 0.48*Ability + 0.42*Supernaturality + 0.49*Knowability. Some variables have lower coefficients (e.g., Specificity), not because they're less important but because they're not as strongly correlated with the other variables. To convert the first principal component to a 0-10 scale, divide by 2.23. I'll call the resulting value the "God Index".
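As a sketch, the God Index calculation looks like this in Python. The loadings and the 2.23 divisor come from the paragraph above; the example ratings are hypothetical, and the formula is assumed to apply to the raw 0-10 ratings.

```python
# PC1 loadings quoted in the post.
LOADINGS = {
    "Specificity": 0.38,
    "Impact": 0.46,
    "Ability": 0.48,
    "Supernaturality": 0.42,
    "Knowability": 0.49,
}
SCALE = 2.23  # equals the sum of the loadings, so all-10 ratings map to 10

def god_index(ratings):
    """Weighted sum of the 0-10 attribute ratings, rescaled to 0-10."""
    pc1 = sum(LOADINGS[name] * ratings[name] for name in LOADINGS)
    return pc1 / SCALE

# Hypothetical ratings, not the ones used for the example definitions.
example = {"Specificity": 8, "Impact": 7, "Ability": 10,
           "Supernaturality": 6, "Knowability": 7}
index = god_index(example)
```

Dividing by 2.23 works because that is the sum of the five coefficients, so a definition rated 10 on every attribute scores exactly 10 and one rated 0 on everything scores 0.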

To demonstrate the meaning and usefulness of the God Index, I defined a new variable, also on a 0-10 scale:

Prior Believability: how easy it is to believe in God without a lot of evidence (analogous to the Bayesian prior probability). 0 = It's impossible that God exists. 1 = God seems very implausible. Belief in him is possible but would require extraordinary evidence. 9 = It is obvious that God exists and difficult to believe otherwise. 10 = It's a known fact that God exists.

Here is a plot of Prior Believability as a function of God Index:

The New Age God I defined is an outlier, but keep in mind that a God Index of 2.7 does not in any way imply New Age. It only means a God with somewhat low values of the attributes I defined, of which my New Age God is merely one example. With the outlier removed, I fit a logistic curve (the dotted line) to the data. The formula is (where G = God Index):

The formula can be used to derive a probability density function that allows us to apply Bayes' theorem with a continuous prior probability that covers all possible definitions of God.

Because Prior Believability is analogous to Bayesian prior probability, the above equation (and graph) helps explain the futility of most debates between atheists and theists. Atheists usually define God with a very high God Index (in the 9-10 range), while most theist arguments apply to a God in the 1-3 range (even though they may be arguing for a specific Christian God with a value around 8). It follows mathematically that their thresholds for what would constitute sufficient evidence for a God are vastly different!

Personally, I'm not very interested in a God in the 0-3 God Index range, because such a God would be practically irrelevant. What interests me most is the God of First Century Judaism. The reason is simple: all three major monotheistic religions (Judaism, Christianity, and Islam) are rooted in First Century Judaism, and all three affirm that the true God lies within it.

There was a wide range of beliefs among First Century Jews, but all affirmed the Torah (first 5 books of the Tanach/"Old Testament") as foundational, so that's where I'll begin. Before considering specific evidence, I'd estimate that the God of the Torah is probably in the 5-9 God Index range, perhaps with a peak (of the probability distribution) around 7. Applying my 5% rule (i.e., don't assign probabilities outside the 0.05-0.95 [5%-95%] range without bulletproof reasons), I get the following probability distribution (which should change as new evidence is considered):

Of course, there are other dimensions of God definitions that are not captured by the God Index, and I'll eventually get to those; but I think the God Index adequately represents the most essential differences between most concepts of God. It's imperfect but useful. It's a good alternative to single definitions that almost nobody can agree on, and it works very well with Bayes' theorem. Instead of investigating single concepts of God, now we can investigate all of them at the same time.

Saturday, January 25, 2014

The Essential Equation of Theology

Most theological questions cannot be answered by direct observation. Some might be unknowable. Many require understanding an ancient language and culture that nobody fully understands. Nearly all involve uncertainty. They are matters of belief rather than knowledge. In other words, most theological questions require probabilistic answers. Unfortunately, the human brain is not very good at processing probability and uncertainty.

Fortunately, there is a practical solution: Bayes' theorem. It is a mathematically valid method to calculate probability when there is uncertainty in the data. The formula is:

P(H|E) = P(E|H)*P(H) / (P(E|H)*P(H) + P(E|¬H)*(1 - P(H)))

Where:

P(H|E) is the probability that hypothesis H is true, given evidence E
P(E|H) is the probability that E would be observed if H is true
P(H) is the prior probability that H is true, without considering E
P(E|¬H) is the probability that E would be observed if H is not true

Bayes' theorem has important implications for theology. It suggests we should adjust our beliefs whenever we learn new evidence. It also implies that the way many of us interpret the data is wrong. Instead of asking "What (if anything) does the evidence prove?", which does not account for uncertainty and can lead to erroneous conclusions, Bayes' theorem implies that we should instead ask 3 questions:

P(H): What is the probability that our belief is true without considering the new evidence?
P(E|H): What is the probability that the new evidence would be what it is if our belief is true?
P(E|¬H): What is the probability that the new evidence would be what it is if our belief is not true?

The probability that the belief is true is adjusted whenever new evidence is considered. It increases if the answer to #2 is larger than the answer to #3 and decreases if #3 is larger than #2. If #2 and #3 are the same, the data (not really "evidence" in that case) does not move the original probability (#1).
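The direction of the adjustment is easiest to see in odds form, where the posterior odds equal the prior odds times the ratio of the answers to #2 and #3. Below is a minimal sketch with illustrative numbers only, assuming a prior strictly between 0 and 1 and a nonzero P(E|¬H).

```python
def update_odds(prior, p_e_given_h, p_e_given_not_h):
    """Posterior probability via odds: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_e_given_h / p_e_given_not_h
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Illustrative numbers only.
raised = update_odds(0.30, 0.8, 0.4)     # ratio 2: belief goes up
lowered = update_odds(0.30, 0.4, 0.8)    # ratio 0.5: belief goes down
unchanged = update_odds(0.30, 0.6, 0.6)  # ratio 1: no movement
```

Only the ratio matters: evidence that is twice as likely under the belief doubles the odds, whatever the two probabilities are individually.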

Let's apply Bayes' theorem to a basic theological belief: "God exists". For this example, I'll define "God" only as a personal being who created the universe.

The first evidence to consider is that all relevant observations indicate that our universe had a beginning (which is the scientific consensus). To answer the 3 Bayesian questions:

P(H): A prior probability of 0% or 100% would be circular and would neglect uncertainty. 50% seems too high when no specific evidence has been considered yet, so I'll use 10%. It's somewhat arbitrary, but if enough evidence is considered, what we use for P(H) shouldn't matter.
P(E|H): If God is the creator of the universe, the probability is very high that the observations would indicate the universe had a beginning. I don't trust my mind enough to use probabilities above 95%, so I'll call it 95%.
P(E|¬H): If there is no God, this is a more difficult question with a high level of uncertainty. I can't go too high because it would seem to defy the First Law of Thermodynamics. However, I've heard some interesting theories that don't seem entirely implausible. I'll go with 25%.

Plugging into the equation, P(H|E) = 0.95*0.10/(0.95*0.10 + 0.25*(1-0.10)) = 0.297, the probability that God exists becomes approximately 29.7%.

Now let's consider negative evidence: the current lack of any direct observations of a God. The prior probability is now the previous result: 29.7%. If there is a God, a lack of any direct observations of him may or may not be probable, depending on what kind of God it is. I'll say 50%. If God does not exist, a lack of direct observation is almost certain. I'll again use my 95% rule. The result, P(H|E) = 0.50*0.297/(0.50*0.297 + 0.95*(1-0.297)) = 0.182, is an updated probability of 18.2%.

Finally, let's consider neutral evidence: religious writings contain apparent errors and contradictions. If God exists, it's still highly probable that religious writings would contain apparent errors and contradictions, whether real or perceived. P(E|H) = 90%. The same would be true if there is no God. P(E|¬H) = 90%. The result, P(H|E) = 0.90*0.182/(0.90*0.182 + 0.90*(1-0.182)) = 0.182, 18.2%, no change.

This process should be repeated until all data is considered.
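The three updates above can be chained in a short loop, with each posterior becoming the next prior. This sketch uses exactly the numbers chosen in this post; the function and variable names are illustrative.

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """P(H|E) from the prior and the two likelihoods."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))

# (label, P(E|H), P(E|not H)) as chosen in the post.
evidence = [
    ("universe had a beginning", 0.95, 0.25),
    ("no direct observation of God", 0.50, 0.95),
    ("apparent errors in religious writings", 0.90, 0.90),
]

belief = 0.10  # starting prior from the post
history = []
for label, p_h, p_not_h in evidence:
    belief = bayes_update(belief, p_h, p_not_h)
    history.append(round(belief, 3))
# history is [0.297, 0.182, 0.182]
```

Adding another piece of evidence means appending one more row, which keeps the debate focused on the two likelihoods for that row.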

There is much more (and in my opinion, much better) evidence to consider, but my point here is the thought process, not the numbers. We can disagree about what the numbers should be, but if that's what we're debating, we've come a long way. It would mean we're asking the right questions and analyzing the data in a way that properly accounts for uncertainty.

Sunday, January 19, 2014

Introduction

Why are some people so sure there is a God and others so sure there isn't? How can a particular interpretation of the Bible seem so obvious and logical to one person but so obviously wrong to another?

Seeing Patterns in Noise:
The Virgin Mary on grilled cheese

Humans have an amazing ability to recognize patterns in data. We have an equally amazing ability to recognize patterns in meaningless, random noise. We also are very good at making generalizations from insufficient data, jumping to facile "black or white" conclusions, and filtering out data that conflicts with what we already believe or want to believe. These represent only a few of the many cognitive biases that impair our ability to interpret data.

We are especially bad at interpreting data in the context of theology. If you need proof, look carefully at both sides of almost any “Atheism vs. Christianity” debate on the Internet. The problem is not lack of data. Quite the opposite! There is so much relevant data that it's impossible for anyone to adequately grasp it. Thus, we tend to focus on a narrow subset of data and let our cognitive biases take care of the rest.

I am no less prone to these errors than anyone else. But I do know some tools from my line of work that are helpful in minimizing our biases and extracting useful information from large, complex datasets. This blog will apply data mining concepts (along with some psychology, meteorology, and personal opinion) to challenge people of all faiths (or lack thereof) to look at theological “data” in a new way.

Thanks for reading!