Saturday, February 1, 2014

Defining God via Principal Component Analysis

Discussing God can be difficult because "God" can mean very different things to different people. It would be nice to define God in a simple way that covers all possibilities, and even better if we could do so mathematically.

I'll start with a few examples. This list is far from complete, but the examples differ widely from one another (in statistical terms, they are data points with a large variance):
  • New Age: God is an impersonal life force, the incorporeal formless cosmic order personified within all people and matter.
  • Fundamentalist: God is a personal, supernatural being who has directly interacted with humanity. Many of his characteristics and desires were known to ancient people and described literally and in detail by ancient texts. He created humans supernaturally and not through evolution.
  • Impersonal First Cause: Whatever caused the Big Bang is God.
  • Highly Evolved: God is a result of evolution via natural processes, but for a much longer period of time than humans, reaching a technically finite but practically infinite level of knowledge and ability.
  • Natural Eternal: God is a personal being who has always existed but does not (or cannot) violate the known laws of nature. He created humans via a natural process of evolution.
  • Supernatural Omnipotent: God is an omnipotent personal being who is not bound by the known laws of nature. He only interacts with humanity on rare occasions or in indirect ways. He created humans by supernaturally guiding a natural process of evolution.

Now I'll get more quantitative. I rated the example definitions of God on a 0-10 scale for a variety of attributes:
  • Specificity: how specifically God is defined. 0 = God can be described only in vague generalities at best. 10 = God can be described in very specific detail.
  • Impact: how much God affects us. 0 = It makes no practical difference whether God exists or not. 10 = God interacts with us and can directly affect many aspects of our lives.
  • Ability: how powerful and knowledgeable God is. 0 = no ability or knowledge. 10 = omnipotent and omniscient.
  • Supernaturality: how supernatural God is. 0 = completely bound by natural laws. 10 = often violates known natural laws.
  • Knowability: how well we can know God. 0 = completely mysterious and distant. 10 = We can have a close personal relationship with God.

It's possible to summarize all of these attributes with a single variable by applying Principal Component Analysis (PCA). PCA is a statistical method for reducing the dimensionality of a dataset (i.e., reducing the number of variables needed to represent most of the variance in the data). It does this by combining the original variables into new variables, each of which is a linear combination of the originals.
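As a rough illustration of what PCA does, here is a minimal sketch that finds the first principal component (the direction of greatest variance) by power iteration on the covariance matrix. The ratings in HYPOTHETICAL_RATINGS are made-up placeholders, not the ratings used in this post, and the function name and the choice to center the data are assumptions of the sketch.

```python
import math

# Columns: Specificity, Impact, Ability, Supernaturality, Knowability.
# These numbers are invented for illustration only; they are NOT the post's ratings.
HYPOTHETICAL_RATINGS = [
    [1, 2, 3, 1, 2],
    [9, 9, 10, 9, 9],
    [2, 0, 5, 3, 0],
    [6, 6, 8, 1, 5],
    [7, 7, 9, 0, 7],
    [7, 6, 10, 8, 6],
]


def first_principal_component(rows, iterations=500):
    """Return the unit-length direction of greatest variance in `rows`."""
    n, d = len(rows), len(rows[0])
    # Center each column on its mean (an assumption; the post does not say
    # how its data were preprocessed).
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiplying by the covariance matrix and
    # renormalizing converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


pc1 = first_principal_component(HYPOTHETICAL_RATINGS)
print([round(x, 2) for x in pc1])
```

The returned list holds one coefficient per attribute. In practice a statistics package would do this in one call; the hand-rolled version is only meant to show the idea.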

The first principal component is the new (combined) variable that explains the most variance in the data. In this case, it is equal to 0.38*Specificity + 0.46*Impact + 0.48*Ability + 0.42*Supernaturality + 0.49*Knowability. Some variables have lower coefficients (e.g., Specificity), not because they're less important but because they're less strongly correlated with the other variables. To convert the first principal component to a 0-10 scale, divide by 2.23, the sum of the five coefficients: a God rated 10 on every attribute has a first principal component of 22.3, which maps to exactly 10. I'll call the resulting value the "God Index".
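The God Index calculation can be sketched in a few lines. The coefficients and the 2.23 divisor come from the paragraph above; the function name, the dictionary layout, and the example ratings are hypothetical and chosen only for illustration.

```python
# Coefficients of the first principal component, as quoted in the text.
COEFFICIENTS = {
    "specificity": 0.38,
    "impact": 0.46,
    "ability": 0.48,
    "supernaturality": 0.42,
    "knowability": 0.49,
}
SCALE = 2.23  # divisor quoted in the text; equals the sum of the coefficients


def god_index(ratings):
    """Map five 0-10 attribute ratings to a 0-10 God Index."""
    pc1 = sum(COEFFICIENTS[name] * ratings[name] for name in COEFFICIENTS)
    return pc1 / SCALE


# Hypothetical ratings (not taken from the post's data).
example = {
    "specificity": 5,
    "impact": 5,
    "ability": 5,
    "supernaturality": 5,
    "knowability": 5,
}
print(round(god_index(example), 2))
```

Because the divisor equals the sum of the coefficients, the God Index is a weighted average of the five ratings, so it always stays within the same 0-10 range as the inputs.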

To demonstrate the meaning and usefulness of the God Index, I defined a new variable, also on a 0-10 scale:
  • Prior Believability: how easy it is to believe in God without a lot of evidence (analogous to the Bayesian prior probability). 0 = It's impossible that God exists. 1 = God seems very implausible. Belief in him is possible but would require extraordinary evidence. 9 = It is obvious that God exists and difficult to believe otherwise. 10 = It's a known fact that God exists.

Here is a plot of Prior Believability as a function of God Index:

The New Age God I defined is an outlier, but keep in mind that a God Index of 2.7 does not in any way imply New Age. It only means a God with somewhat low values of the attributes I defined, of which my New Age God is merely one example. With the outlier removed, I fit a logistic curve (the dotted line) to the data. The formula is (where G = God Index):

The formula can be used to derive a probability density function that allows us to apply Bayes' theorem with a continuous prior probability that covers all possible definitions of God.
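To make the idea of a continuous prior concrete, here is a minimal sketch that applies Bayes' theorem on a grid of God Index values. The fitted formula is not reproduced here, so placeholder_prior uses a decreasing logistic shape with made-up parameters, and example_likelihood is an equally hypothetical piece of evidence; only the overall procedure (prior times likelihood, then normalize) is the point.

```python
import math


def placeholder_prior(g):
    """Hypothetical decreasing logistic curve; NOT the fitted formula from the post."""
    midpoint, steepness = 5.0, 1.0  # made-up parameters, for illustration only
    return 1.0 / (1.0 + math.exp(steepness * (g - midpoint)))


def example_likelihood(g):
    """Hypothetical evidence that is twice as likely if the God Index is 5 or more."""
    return 2.0 if g >= 5.0 else 1.0


def posterior_on_grid(prior, likelihood, steps=1000):
    """Discretize G on [0, 10] and apply Bayes' theorem cell by cell."""
    width = 10.0 / steps
    grid = [(i + 0.5) * width for i in range(steps)]  # midpoint of each cell
    unnormalized = [prior(g) * likelihood(g) for g in grid]
    total = sum(unnormalized) * width  # approximates the evidence integral
    return grid, [u / total for u in unnormalized]


grid, density = posterior_on_grid(placeholder_prior, example_likelihood)
mass_above_5 = sum(d for g, d in zip(grid, density) if g >= 5.0) * (10.0 / len(grid))
print(round(mass_above_5, 3))
```

The prior does not need to be normalized beforehand, since dividing by the total takes care of that. The result is a posterior density over every value of G at once rather than a single probability for a single definition of God.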

Because Prior Believability is analogous to Bayesian prior probability, the above equation (and graph) helps explain the futility of most debates between atheists and theists. Atheists usually define God with a very high God Index (in the 9-10 range), while most theist arguments apply to a God in the 1-3 range (even though they may be arguing for a specific Christian God with a value around 8). It follows mathematically that their thresholds for what would constitute sufficient evidence for a God are vastly different!

Personally, I'm not very interested in a God in the 0-3 God Index range, because such a God would be practically irrelevant. What interests me most is the God of First Century Judaism. The reason is simple: all three major monotheistic religions (Judaism, Christianity, and Islam) are rooted in First Century Judaism, and all three affirm that the true God lies within it.

There was a wide range of beliefs among First Century Jews, but all affirmed the Torah (first 5 books of the Tanach/"Old Testament") as foundational, so that's where I'll begin. Before considering specific evidence, I'd estimate that the God of the Torah is probably in the 5-9 God Index range, perhaps with a peak (of the probability distribution) around 7. Applying my 5% rule (i.e., don't assign probabilities outside the 0.05-0.95 [5%-95%] range without bulletproof reasons), I get the following probability distribution (which should change as new evidence is considered):

Of course, there are other dimensions of God definitions that are not captured by the God Index, and I'll eventually get to those; but I think the God Index adequately represents the most essential differences between most concepts of God. It's imperfect but useful. It's a good alternative to single definitions that almost nobody can agree on, and it works very well with Bayes' theorem. Instead of investigating single concepts of God, now we can investigate all of them at the same time.

