Sunday, February 23, 2014

Verifying Torah Model Predictions

Bad models can match the data very well, but good models make accurate predictions. A model's accuracy can be tested by comparing its predictions to an independent set of observations. Some of the Bible's predictions are testable.

The main theme of the Torah is the covenant between God and Israel. Israel was given the Ten Commandments and other statutes and ordinances, mostly civil and ceremonial laws. The Israelites were told that if they (as a nation) followed these laws, God would make them thrive in their new land. "Keep his statutes and commandments ... so that it may go well with you and your descendants and that you may enjoy longevity in the land..." (Deut 4:40). If they didn't follow them, the opposite would happen.

Contrary to popular belief, these laws were not moral rules that one had to follow in order to go to heaven. In fact, an afterlife is never mentioned. Most were practical moral teachings, health and safety regulations, and criminal laws conducive to a successful nation. If the promise of the Torah is true, we should expect that on a large scale (nationally, not necessarily for each individual person), nations that follow the Torah will thrive.

Much of the rest of the Bible is about how things went well for Israel when they followed the Torah and went badly for them when they didn't. But the Bible is not an appropriate source of verification for predictions in the Bible. Instead, I'll turn to much more current and independent data.

Gallup recently measured subjective well-being (i.e., how people feel about their lives) with the Cantril Self-Anchoring Striving Scale. People were rated as thriving, struggling, or suffering, based on their answers to various questions. I combined Gallup's results with religion data from Pew Research. Each religion was analyzed according to its number of adherents in each country and that country's respective well-being statistics. For example, 62% of Canadians were considered "thriving", so 62% of the 350,000 Jews and 62% of the 710,000 Muslims in Canada were counted as "thriving". The totals for each religion were then divided by their respective number of adherents, so each person counted equally.

This method assumes that the overall moral/religious values of each country are consistent with the proportions of each religion there, which is appropriate given the national (rather than individual) scope of the prediction.
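
The weighting described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the actual spreadsheet used for the post: the function name `thriving_share` and the second country ("Country B") are made up for the example, while the Canadian figures are the ones quoted above.

```python
def thriving_share(countries):
    """Adherent-weighted share of a religion's members counted as "thriving".

    countries: list of (adherents_in_country, national_thriving_fraction).
    Each adherent inherits the national rate of the country he or she lives in.
    """
    total_adherents = sum(adherents for adherents, _ in countries)
    thriving = sum(adherents * fraction for adherents, fraction in countries)
    return thriving / total_adherents


# Canada's numbers come from the post; "Country B" is a hypothetical placeholder.
jewish_population = [
    (350_000, 0.62),  # Canada: 62% of Canadians rated "thriving"
    (100_000, 0.40),  # hypothetical Country B
]

share = thriving_share(jewish_population)
print(f"weighted thriving share: {share:.3f}")
```

Because the totals are divided by the number of adherents, a country with many adherents pulls the result toward its own national rate, which is what "each person counted equally" means in practice.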


For quantitative metrics, the Legatum Institute has done extensive studies on global wealth and well-being. The Legatum Prosperity Index uses a large number of variables and breaks the results down into several categories (see the Legatum Institute's own documentation for details). I standardized the scores for each category.

The results are consistent with the Torah's prediction. The Jewish population had the highest scores in every category except Economy. Christians, who affirm (but somewhat disregard) the Torah, had high scores despite relatively weak economies (mostly in Africa and Latin America), and had a high percentage that were "thriving". Muslims, who accept most of the Torah but reject some of the specifics, had low scores overall but relatively high scores compared to their economies. Populations that reject the Torah (Other and Unaffiliated) had low "thriving" percentages and low scores relative to their economies, though Unaffiliated had high scores overall on the quantitative indicators.

We must be very careful not to read too much into these results. There are many possible explanations for them, and a causal relationship has not been established. Nothing has been proven. However, we can reasonably conclude that these results are more probable if the Torah promise is true than if it is false, which means they are positive evidence. If you start with a prior probability near 0% or 100%, evidence probably doesn't matter to you. But if you have a less extreme prior probability, I think this should move it a little in the positive direction.
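
To make the last point concrete, here is a minimal sketch of a Bayesian update in Python. The likelihood ratio of 1.5 is purely hypothetical (I have not estimated one for these results), and the function name `update` is just for illustration. The only point is the shape of the behavior: priors at 0 or 1 never move, and moderate priors shift a little.

```python
def update(prior, likelihood_ratio):
    """Posterior P(H | E) from prior P(H) and LR = P(E | H) / P(E | not H)."""
    numerator = prior * likelihood_ratio
    return numerator / (numerator + (1.0 - prior))


HYPOTHETICAL_LR = 1.5  # assumed for illustration only, not derived from the data

for prior in (0.0, 0.01, 0.3, 0.5, 0.99, 1.0):
    posterior = update(prior, HYPOTHETICAL_LR)
    print(f"prior {prior:.2f} -> posterior {posterior:.3f}")
```

Any likelihood ratio above 1 moves a non-extreme prior upward; how far it moves depends on how much more probable the results are under one hypothesis than the other, which is exactly the part that has not been established here.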

Monday, February 17, 2014

Weighing Evidence

How good are you at weighing evidence? Here is a simple test:

Suppose random people are tested for a rare (1 in 50,000) disease and one tests positive. The test is 99% accurate, meaning it gives the correct result 99 times out of 100. To be extra sure, he is tested again with the same 99% accurate test. Its result also is positive. Does he have the disease? How sure are you?

If you're thinking “yes” or that it's very likely, you are interpreting the data irrationally. But you're not alone. Psychologists have found that people generally give too much weight to specific, individuating information and not enough weight to general, less-specific information that we perceive as less relevant. The phenomenon is known as the base rate fallacy or base rate neglect. In the above example, assuming the two test results are independent, there's actually only about a 16% chance that he has the disease, but most people are fooled by the 99% accurate tests.
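
The arithmetic can be checked with a short Python sketch. It assumes that "99% accurate" means both a 99% true-positive rate and a 99% true-negative rate, and that the two tests err independently of each other; neither assumption is guaranteed for a real medical test. The function name is only for illustration.

```python
def posterior_after_positives(prior, accuracy, positives):
    """Apply Bayes' rule once per positive result, treating results as independent."""
    p = prior
    for _ in range(positives):
        true_positive = accuracy * p                   # sick and test says sick
        false_positive = (1.0 - accuracy) * (1.0 - p)  # healthy but test says sick
        p = true_positive / (true_positive + false_positive)
    return p


base_rate = 1 / 50_000
one = posterior_after_positives(base_rate, 0.99, 1)
two = posterior_after_positives(base_rate, 0.99, 2)
print(f"after one positive:  {one:.4f}")
print(f"after two positives: {two:.4f}")
```

After one positive test the probability is still only about 0.2%, and after two it is roughly one in six. The base rate of 1 in 50,000 is so low that even two 99% accurate tests cannot overcome it.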

Base rate neglect is common in theology because there is a universe full of data with apparently little relevance, along with religious books that contain very specific individuating data. Though the Bible says God is revealed through nature, evidence from nature is often neglected in theological discussions.

Many theologically conservative Christians focus entirely on the Bible to answer theological questions. They start with their interpretation of the Bible, then interpret the rest of the universe according to what they believe the Bible says. That method is prone to overfitting, which can make it an insidious form of base rate neglect, even if the Bible is 100% true. On the other hand, many theological liberals start with their interpretation of the universe (i.e., worldview), then force the Bible to conform to it and/or reject the parts that don't. That's the extreme opposite of base rate neglect, which is equally irrational. Both use a heavy dose of circular reasoning.

Ironically, many atheists are on the theological conservatives' side here. They start with an interpretation of the Bible (usually a very conservative fundamentalist one that neglects data from outside the Bible), then compare it to their understanding of the universe and conclude that the Bible is morally objectionable and/or contradicted by science. This approach leads to straw man arguments against religion.

My last post discussed the problem of matching beliefs to irrelevant data. This post says we don't give enough weight to data that seems irrelevant. It's not a contradiction, but it's a fine line -- one that's easy to cross in both directions. I cross it all the time. It's a reason why tools like Bayes' rule are so useful. It's also a reason for all of us to be humble about what we believe and don't believe. Our beliefs may seem to perfectly match the most relevant data, but some of them are probably wrong.

Saturday, February 8, 2014

Overfitting the Bible (the Creation/Evolution Debate)

A few years ago I built a machine learning model to predict changes (up, down, or neutral) in Apple's stock price up to 3 hours in advance. When testing it against a past dataset, the model was correct 80% of the time. Cautiously optimistic that I would soon be a billionaire, I tested it with real-time data. The result: it was correct 50% of the time, no better than flipping a coin. So what went wrong?

My model was overfitting the data. Overfitting occurs when a model fits noise (i.e., irrelevant data) rather than the signal (i.e., the true relationship between the input data and the output). Overfit models can superficially match the data very closely but usually don't make accurate predictions.

[Figure: an example of overfitting that relates the maximum wind gust in Norman, Oklahoma to the number of points given up by the Oklahoma Sooners football team during 2013. The red curve fits the data points (red dots) very well but is not an accurate interpretation of them (keep in mind that some of these games were played hundreds of miles away from Norman). Applying it to the 2014 Sugar Bowl gives a 20-point error. Wind data courtesy of the Oklahoma Mesonet.]

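
The same effect can be reproduced with made-up numbers. The sketch below is not my stock model or the wind gust data. It generates pure noise (the "points" have no relationship to the input at all), then compares a curve forced through every training point (Lagrange interpolation, chosen here only because it needs no libraries) with a flat line at the training average.

```python
import random


def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique polynomial that passes through every (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for k, xk in enumerate(xs):
            if k != i:
                weight *= (x - xk) / (xi - xk)
        total += yi * weight
    return total


def mse(predict, xs, ys):
    """Mean squared error of a prediction function over a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)
train_x = [float(i) for i in range(8)]
train_y = [rng.gauss(20.0, 5.0) for _ in train_x]  # noise only, unrelated to x
test_x = [i + 0.5 for i in range(7)]               # new inputs between the old ones
test_y = [rng.gauss(20.0, 5.0) for _ in test_x]

mean_y = sum(train_y) / len(train_y)


def wiggly(x):
    """Overfit model: a curve that hits every training point exactly."""
    return lagrange_predict(train_x, train_y, x)


def flat(x):
    """Simple model: always predict the training average."""
    return mean_y


print("train MSE  wiggly:", mse(wiggly, train_x, train_y), " flat:", mse(flat, train_x, train_y))
print("test MSE   wiggly:", mse(wiggly, test_x, test_y), " flat:", mse(flat, test_x, test_y))
```

The wiggly curve has zero error on the data it was built from, which looks impressive, but on new data it typically does worse than the boring flat line, because everything it "learned" was noise.
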
Overfitting is common in Bible-based theology. The Bible contains a lot of data, but the issues people debated 3000 years ago were not necessarily the same as those debated today. For example, Tuesday night I watched a debate between Bill Nye and Ken Ham. Ham used the Bible to argue against the theory of Evolution and for a 6,000 year age of the Earth. But was the book of Genesis written to teach biology and geology to scientists 3000 years later? Is the data relevant or was Mr. Ham overfitting it?

Ken Ham never presented an alternative to evolution because the Bible doesn't contain one. Genesis says God created animals and humans but doesn't explain how. It says the first human ultimately came from the soil, which sounds vaguely like evolution to me, but how soil became man is not explained. The entire process is covered in one sentence with no details. Even less is written about how animals were created. The author clearly wasn't trying to answer the same questions that evolution answers. Thus, using Genesis to deny evolution is overfitting the data. Likewise, using the vast evidence for evolution to deny the truthfulness of the Bible also is overfitting it.

The age of the Earth isn't mentioned in Genesis either. Ham infers it from genealogies and an assumption that the first human was created 5 days after the Earth was created. But does the Bible actually teach that?

Much of the argument is about the word "day". The Hebrew word "yom" usually is translated into English as "day", but it also is translated as age, period, time, lifetime, years, always, forever, eternity, and several other words. Hebrew has far fewer words than English, so they tend to have a broader meaning, very dependent on the context. So what does "yom" mean in the context of Genesis 1?

I think it probably means "day", but there's a catch. According to most sources, a "day" was understood by ancient near-eastern people as a cycle of light and darkness that the sun happened to follow. It was not, as we think of it now, an abstract measure of time equivalent to 24 hours or one rotation of Earth. Genesis 1 describes three "days", each with "evening and morning", before the sun is created on day 4. With the sun not yet existing, there's no reason to assume that a "day" had anything to do with the sun or that it was 24 hours long. There's also nothing to suggest that the meaning of "day" changes halfway through the chapter. I'm assuming here that Genesis is a literal description of creation. If it's anything else, there's even less reason to believe it's 24-hour days.

The Young Earth model closely fits a superficial, anachronistic, English reading of the text. In other words, it fits noise. The true test of a model is how well it makes predictions. Like my Apple stock price model and wind gust football model, the Young Earth model doesn't make accurate predictions. As Bill Nye pointed out, what we observe in nature is very different from what the Young Earth model would predict. But as with evolution, using evidence for an old earth as evidence against the Bible is to make the same mistake as Ken Ham.

Many Bible believers use the Bible to answer questions that the Bible doesn't actually address. Many non-believers find in the Bible absurd and immoral teachings that it doesn't actually teach. Both feel confident because their models fit the data, but both are overfitting it. Always remember that fitting the data is far different from accurately interpreting it.

Saturday, February 1, 2014

Defining God via Principal Component Analysis

Discussing God can be difficult because "God" can mean very different things to different people. It would be nice to define God in a simple way that covers all possibilities, and even better if we could do so mathematically.

I'll start with a few examples. This list is very incomplete but represents data points that have a large variance:
  • New Age: God is an impersonal life force, the incorporeal formless cosmic order personified within all people and matter.
  • Fundamentalist: God is a personal, supernatural being who has directly interacted with humanity. Many of his characteristics and desires were known to ancient people and described literally and in detail by ancient texts. He created humans supernaturally and not through evolution.
  • Impersonal First Cause: Whatever caused the Big Bang is God.
  • Highly Evolved: God is a result of evolution via natural processes, but for a much longer period of time than humans, reaching a technically finite but practically infinite level of knowledge and ability.
  • Natural Eternal: God is a personal being who has always existed but does not (or cannot) violate the known laws of nature. He created humans via a natural process of evolution.
  • Supernatural Omnipotent: God is an omnipotent personal being who is not bound by the known laws of nature. He only interacts with humanity on rare occasions or in indirect ways. He created humans by supernaturally guiding a natural process of evolution.

Now I'll get more quantitative. I rated the example definitions of God on a 0-10 scale for a variety of attributes:
  • Specificity: how specifically God is defined. 0 = God can be described only in vague generalities at best. 10 = God can be described in very specific detail.
  • Impact: how much God affects us. 0 = It makes no practical difference whether God exists or not. 10 = God interacts with us and can directly affect many aspects of our lives.
  • Ability: how powerful and knowledgeable God is. 0 = no ability or knowledge. 10 = omnipotent and omniscient.
  • Supernaturality: how supernatural God is. 0 = completely bound by natural laws. 10 = often violates known natural laws.
  • Knowability: how well we can know God. 0 = completely mysterious and distant. 10 = We can have a close personal relationship with God.

It's possible to represent all of these attributes with a single variable by applying Principal Component Analysis (PCA). PCA is a statistical method for reducing the dimensionality of a dataset (i.e., reducing the number of variables needed to represent most of the variance in the data). It combines multiple variables into new variables that are linear combinations of the original variables.
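
For readers who want to see the mechanics, here is a rough Python sketch of how a first principal component can be found. It uses power iteration on the covariance matrix, which is one of several ways to do it and not necessarily how any particular statistics package does it. The toy data is invented so the answer is easy to verify by eye, and the function name is only for illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Unit-length direction of greatest variance, via power iteration on the covariance matrix."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data at all
        v = [x / norm for x in w]
    return v


# Toy data: the second attribute is always twice the first,
# so a single combined variable captures all of the variance.
toy_ratings = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
weights = first_principal_component(toy_ratings)
print("weights:", weights)
```

The returned weights are the coefficients of the linear combination. In the toy data they point along the (1, 2) direction, scaled to unit length.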

The first principal component is the new (combined) variable that explains the most variance in the data. In this case, it is equal to 0.38*Specificity + 0.46*Impact + 0.48*Ability + 0.42*Supernaturality + 0.49*Knowability. Some variables have lower coefficients (e.g., Specificity), not because they're less important but because they're not as strongly correlated with the other variables. To convert the first principal component to a 0-10 scale, divide by 2.23. I'll call the resulting value the "God Index".
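
As a small sketch, the God Index calculation looks like this in Python. The coefficients and the 2.23 divisor are the ones given above (2.23 is simply the sum of the coefficients, so a God rated 10 on every attribute scores exactly 10). The example ratings at the bottom are hypothetical and are not the ratings I assigned to any of the definitions listed earlier.

```python
WEIGHTS = {
    "specificity": 0.38,
    "impact": 0.46,
    "ability": 0.48,
    "supernaturality": 0.42,
    "knowability": 0.49,
}
SCALE = 2.23  # sum of the weights; rescales the component to a 0-10 range


def god_index(ratings):
    """Combine five 0-10 attribute ratings into a single 0-10 God Index."""
    first_component = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return first_component / SCALE


# Hypothetical ratings, for illustration only.
example = {
    "specificity": 2,
    "impact": 3,
    "ability": 5,
    "supernaturality": 2,
    "knowability": 3,
}
print(f"God Index: {god_index(example):.1f}")
```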

To demonstrate the meaning and usefulness of the God Index, I defined a new variable, also on a 0-10 scale:
  • Prior Believability: how easy it is to believe in God without a lot of evidence (analogous to the Bayesian prior probability). 0 = It's impossible that God exists. 10 = It's a known fact that God exists. 1 = God seems very implausible. Belief in him is possible but would require extraordinary evidence. 9 = It is obvious that God exists and difficult to believe otherwise.

Here is a plot of Prior Believability as a function of God Index:

The New Age God I defined is an outlier, but keep in mind that a God Index of 2.7 does not in any way imply New Age. It only means a God with somewhat low values of the attributes I defined, of which my New Age God is merely one example. With the outlier removed, I fit a logistic curve (the dotted line) to the data. The formula is (where G = God Index):

The formula can be used to derive a probability density function that allows us to apply Bayes' theorem with a continuous prior probability that covers all possible definitions of God.

Because Prior Believability is analogous to Bayesian prior probability, the above equation (and graph) helps explain the futility of most debates between atheists and theists. Atheists usually define God with a very high God Index (in the 9-10 range), while most theist arguments apply to a God in the 1-3 range (even though they may be arguing for a specific Christian God with a value around 8). It follows mathematically that their thresholds for what would constitute sufficient evidence for a God are vastly different!

Personally, I'm not very interested in a God in the 0-3 God Index range, because such a God would be practically irrelevant. What interests me most is the God of First Century Judaism. The reason is simple: all three major monotheistic religions (Judaism, Christianity, and Islam) are rooted in First Century Judaism, and all three affirm that the true God lies within it.

There was a wide range of beliefs among First Century Jews, but all affirmed the Torah (first 5 books of the Tanach/"Old Testament") as foundational, so that's where I'll begin. Before considering specific evidence, I'd estimate that the God of the Torah is probably in the 5-9 God Index range, perhaps with a peak (of the probability distribution) around 7. Applying my 5% rule (i.e., don't assign probabilities outside the 0.05-0.95 [5%-95%] range without bulletproof reasons), I get the following probability distribution (which should change as new evidence is considered):

Of course, there are other dimensions of God definitions that are not captured by the God Index, and I'll eventually get to those; but I think the God Index adequately represents the most essential differences between most concepts of God. It's imperfect but useful. It's a good alternative to single definitions that almost nobody can agree on, and it works very well with Bayes' theorem. Instead of investigating single concepts of God, now we can investigate all of them at the same time.