Saturday, February 8, 2014

Overfitting the Bible (the Creation/Evolution Debate)

A few years ago I built a machine learning model to predict changes (up, down, or neutral) in Apple's stock price up to 3 hours in advance. When testing it against a past dataset, the model was correct 80% of the time. Cautiously optimistic that I would soon be a billionaire, I tested it with real-time data. The result: it was correct 50% of the time, no better than flipping a coin. So what went wrong?

My model was overfitting the data. Overfitting is when a model fits noise (i.e., random variation or irrelevant data) rather than the signal (i.e., the true relationship between the input data and the output). Overfit models can superficially match the data they were built on very closely but usually don't make accurate predictions on new data.
The above example of overfitting relates the maximum wind gust in Norman, Oklahoma to the number of points given up by the Oklahoma Sooners football team during 2013. The red curve fits the data points (red dots) very well but is not an accurate interpretation of them (keep in mind that some of these games were played hundreds of miles away from Norman). Applying it to the 2014 Sugar Bowl gives a 20-point error. [Wind data courtesy of the Oklahoma Mesonet]
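For readers who want to see the effect in numbers, here is a minimal sketch in Python. It does not use the wind or football data; the six data points are invented for illustration (a straight line, y = 2x + 1, plus a small alternating error), and the function names are my own. It compares a simple straight-line fit with a curve that is forced through every point, then asks both to predict one new point.

```python
# Made-up data (an assumption for illustration): true signal y = 2x + 1
# plus a small alternating "noise" term.
xs = [0, 1, 2, 3, 4, 5]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
ys = [2 * x + 1 + e for x, e in zip(xs, noise)]


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a prediction function."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial passing through every data point (the overfit model)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and observed values."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


line = fit_line(xs, ys)
wiggly = fit_interpolant(xs, ys)

# Error on the data the models were built from.
line_train_error = mean_abs_error(line, xs, ys)
wiggly_train_error = mean_abs_error(wiggly, xs, ys)

# Error on one new point that neither model has seen.
new_x = 6
true_y = 2 * new_x + 1
line_new_error = abs(line(new_x) - true_y)
wiggly_new_error = abs(wiggly(new_x) - true_y)

print("error on known data: line", line_train_error, "curve", wiggly_train_error)
print("error on new point:  line", line_new_error, "curve", wiggly_new_error)
```

The curve matches the known points perfectly while the line does not, yet on the new point the line stays close to the truth and the curve misses badly. That is the same pattern as the red curve and the Sugar Bowl: a perfect fit to past data, and a poor prediction.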
Overfitting is common in Bible-based theology. The Bible contains a lot of data, but the issues people debated 3000 years ago were not necessarily the same as those debated today. For example, Tuesday night I watched a debate between Bill Nye and Ken Ham. Ham used the Bible to argue against the theory of evolution and for a 6,000-year age of the Earth. But was the book of Genesis written to teach biology and geology to scientists 3000 years later? Is the data relevant or was Mr. Ham overfitting it?

Ken Ham never presented an alternative to evolution because the Bible doesn't contain one. Genesis says God created animals and humans but doesn't explain how. It says the first human ultimately came from the soil, which sounds vaguely like evolution to me, but how soil became man is not explained. The entire process is covered in one sentence with no details. Even less is written about how animals were created. The author clearly wasn't trying to answer the same questions that evolution answers. Thus, using Genesis to deny evolution is overfitting the data. Likewise, using the vast evidence for evolution to deny the truthfulness of the Bible also is overfitting it.

The age of the Earth isn't mentioned in Genesis either. Ham infers it from genealogies and an assumption that the first human was created 5 days after the Earth was created. But does the Bible actually teach that?

Much of the argument is about the word "day". The Hebrew word "yom" usually is translated into English as "day", but it also is translated as age, period, time, lifetime, years, always, forever, eternity, and several other words. Hebrew has far fewer words than English, so they tend to have a broader meaning, very dependent on the context. So what does "yom" mean in the context of Genesis 1?

I think it probably means "day", but there's a catch. According to most sources, a "day" was understood by ancient near-eastern people as a cycle of lightness and darkness that the sun happened to follow. It was not, as we think of it now, an abstract measure of time equivalent to 24 hours or one rotation of Earth. Genesis 1 mentions three "days" that have "evening and morning", but before the sun was created on day 4. With the sun not yet existing, there's no reason to assume that a "day" had anything to do with the sun or that it was 24 hours long. There's also nothing to suggest that the meaning of "day" changes halfway through the chapter. I'm assuming here that Genesis is a literal description of creation. If it's anything else, there's even less reason to believe it's 24-hour days.

The Young Earth model closely fits a superficial, anachronistic, English reading of the text. In other words, it fits noise. The true test of a model is how well it makes predictions. Like my Apple stock price model and wind gust football model, the Young Earth model doesn't make accurate predictions. As Bill Nye pointed out, what we observe in nature is very different from what the Young Earth model would predict. But as with evolution, using evidence for an old earth as evidence against the Bible is to make the same mistake as Ken Ham.

Many Bible believers use the Bible to answer questions that the Bible doesn't actually address. Many non-believers find in the Bible absurd and immoral teachings that it doesn't actually teach. Both feel confident because their models fit the data, but both are overfitting it. Always remember that fitting the data is far different from accurately interpreting it.


  1. Hey Matt,

    I appreciate the article! I especially like that you mentioned we don't know how old the earth was before God began filling it. Genesis 1:2 The earth was without form, and void; and darkness was on the face of the deep. And the Spirit of God was hovering over the face of the waters.

    It's not often anyone acknowledges that.

    But one thing is that starting with the first day of creation, Genesis mentions that there was an evening then a morning, which would sound like it was the same span of time as we would relate a day to be. I obviously can't prove it 100% as I wasn't there to observe, but it seems implied. Your thoughts?

    1. Thanks for the comment! I don't think "evening and morning" implies 24 hours in the context of Genesis 1. The only reason it implies that to us is that we know evening and morning are related to sunset, sunrise, and the earth's rotation, which occurs on a 24 hour cycle. Without the sun, that wouldn't happen on a 24 hour cycle. It's possible the author of Genesis didn't make that connection (many anthropologists believe ancient people didn't know that daylight was caused by the sun), but in that case the only way 24 hours would be implied is if the author used evening, morning, day as abstract units of time measurement. I think that would be reading into the text a lot that isn't there. We need to be very careful not to force our concept of time measurement onto ancient writers who probably didn't think of it the same way we do.

  2. The fact that the ancient writers might have had some other intent with the terms "morning" and "evening" leaves completely open a vast range of possibilities. All well and good, but we'll never be able to ask them to explain what they meant. The implication of a solar cycle prior to the 'creation' of the sun is simply an apparent mistake/contradiction. In the absence of a definitive answer regarding the intent of the writers, it will always remain an apparent error.

    Can you provide some examples where "Many non-believers find in the Bible absurd and immoral teachings that it doesn't actually teach."?

    Obviously, overfitting the bible is a possible way to avoid the issues that arise from taking it literally. But in the final analysis, most everyone (including you and I) has their own interpretation of the bible. And among the believers, they insist that their interpretation is the one true interpretation ...

    1. Thanks for the comment, Chuck. It's true that we can't ask them what they meant, but we can study their language and culture to get the closest possible understanding. My statement that a solar cycle was NOT implied was based on the way ancient people understood day/night according to anthropologists who study those things.

      For a good list of absurd things that the Bible doesn't actually teach, I recommend

      Overfitting can be a way to avoid issues, but it's usually done to support beliefs that are not actually taught by the original author. I presented a very literal interpretation of Genesis 1. What many people (e.g., Ken Ham and most atheists I've talked to) call a "literal" interpretation often is not literal at all, but rather an anachronistic interpretation that disregards (or is ignorant of) the original language, context, genre, and/or purpose.

      I agree that everyone has their interpretation, but not all interpretations are equally valid.
      I prefer to pay more attention to interpretations by the world's leading Hebrew scholars, anthropologists, and Bible scholars. The random guy at Walmart might have his own interpretation that he thinks is the one true one (just as he might have an opinion about climate change or tornadogenesis), but it wouldn't influence my interpretation nearly as much.

  3. A lot of those things on my website are taken more or less directly from the bible (as it was taught to me). We already have had some lengthy discussions that have included elements of your scholarly study of the original bible in its original language, during which you dispute that what I described there is absurdities the bible didn't actually teach. The bible I knew, as it was interpreted to me, DID teach those things.

    I find it difficult to address most of your arguments because I haven't the background to carry my side of the argument. However, you still seem to be starting from a theist position, which is as biased as is my atheist perspective. All interpretations may not be equally valid, but being a biblical scholar is no guarantee that one's interpretation is correct. After all, scholars disagree, and their consensus isn't inevitably correct. Moreover, I'm sure there are scholars of other religions that have viewpoints that would differ substantially with those of the Abrahamic tradition.

    For instance, you say "According to most sources, a "day" was understood by ancient near-eastern people as a cycle of lightness and darkness that the sun happened to follow." Just how might one measure the passage of a "day" without the sun being present? What sort of strange viewpoint did these borderline barbaric peoples have that they would define a cycle of light and darkness that was somehow independent of the sun? Do you really claim to know what they were thinking? I'm pretty sure that your sources even mention the cycle of day and night as this weird thing independent of the sun precisely because the bible speaks of it in this apparently errant way - days passing before the creation of the sun. I doubt they would even mention what the ancients thought about the diurnal cycle were it not for this problem.

    1. I understand that it's what was taught to you, as it is for a lot of people, especially in America and in the South. There is a very vocal minority (mostly made up of people who haven't studied it in depth) that does believe a lot of those things, and that's a big part of why I started this blog.

      I agree that my perspective is no less biased than yours. But I don't think it's useful or scientific to look at any belief in such a black & white way. There's a lot that we don't know, which I think we should look at as probability distributions rather than binary yes or no. For example, I agree that the consensus of scholars isn't necessarily correct. It often isn't. I see it as something like a normal (Gaussian) distribution of opinions. I don't know for sure which is the correct one, but the middle of the curve is more probable than the tail ends.

      A lot of ancient writings talk about day/night and the sun, not just the Bible. The "strange viewpoint" that the "borderline barbaric" people of that time/place had was that the sun was a god who travelled across the sky during the day. The beginning of the Bible essentially says "No, the sun is not a god, it's a creation by the one true God." It would've been blasphemy. It was a huge distinctive that really set it apart from the other religions at that time and region. To read it as describing scientific measurements of abstract units of time is to completely miss the point. That's what I mean when I say a "literal" interpretation often means ignoring the original context and language.

  4. Correction: change "...during which you dispute that what I described there is absurdities the bible..." to "...during which you assert that what I described there are absurdities the bible..."

  5. So the sun is denied deity status in the bible - that's as it is, but it doesn't answer what "day" means without being tied to a sun that didn't exist at the start of creation. Surely a reasonably smart deity would have changed the order. And light was created before the sun. It strikes me as a human author following some script that had been provided without regard to its making any sense.

    1. "Surely a reasonably smart deity would have changed the order" -- Why? Only if the writer's purpose was the same as yours, which it clearly was not. It seems like you're trying to force the authors of the Bible to answer your questions, even though they were addressing much different issues. If it was important to them how long, chronologically, a "day" was in that context, they probably would've said so. But from what I can tell, that had nothing to do with the author's point, so there was no reason to explain it.

      Scientists agree that light existed before the sun, so I don't see why that is a problem. I could say that sounds like a reference to the big bang, but I'd probably be overfitting the text at that point. If I was an ancient person who worshipped the sun as my God, I'd probably see it a lot differently.

    2. 1. "Thus, using Genesis to deny evolution is overfitting the data..."
      Genesis can't deny evolution because the primitive culture didn't have the science to understand evolution. Therefore it is not surprising that the text of Genesis cannot give relevant information about Evolution.

      2. "Likewise, using the vast evidence for evolution to deny the truthfulness of the Bible also is overfitting it.."
      Evolution may be a foundational science...but it can not and does not make statements about the Bible.

      3. When you talk about truthfulness...are you talking about intent of the authors or are you talking about inspiration from God? Genesis was apparently a series of ancient "covenants" strung together with historical references to form a coherent whole. I see a diversity of intent in the authorship...but I see absolutely nothing in the diverse scholarship to confirm any sense of supernatural intervention. This must be provided by the indoctrination of faith.

      4. I see no need to dispute your reference to language or context in the interpretation of scripture...that is immaterial to establishing whether the Bible is a group of stories by a diverse set of authors...or the divine words of a supernatural creator. What I can see is a very divided scholarship of events and translations of imperfect texts.

      5. Regardless of your scholarship with respect to language and context... this is immaterial to the vast majority of believers...who are not scholars and not interested in being scholars.

      6. Evolution (as an example) started as a theory and evidence/experimentation built the theory into a factual whole. Evolution didn't originate with questionable ancient wasn't built on covenants (contracts) between tribes ...and it doesn't attempt to use supernatural events to "fill in the gaps". Evolution is falsifiable and open to new evidence....and there are many offshoots from Evolution that have great predictive and practical value.

      I can appreciate the scholarship involved in Biblical history...but an attempt to use mathematics, statistics, and modern science to establish the Bible as "supernatural truth" will not be credible to a rigorous peer review. Opening this blog to comments is interesting...I will grant you that...but it is not scholarly. For that you will need to submit your work for serious peer review from relevant scholars.

    3. Josh, thanks for the comment. Here is a response to each of your points:

      1. Yes, I agree.

      2. Yes, I agree.

      3. I don't think it's an either/or. Whether the author was divinely inspired or not, his intent is key for interpreting it. By "truthfulness" I just meant true as opposed to false. Yes, there's a great diversity in the book of Genesis, but my post was about the first chapter.

      4. I wasn't trying to establish either one. My point was about how it's misinterpreted, which is the case whether it's human stories or divine words or both.

      5. I agree that a lot of believers (and non-believers) ignore language and context. That's why I'm blogging about it. They don't have to be scholars to not read into the text things that aren't there.

      6. Yes, I agree.

      7. There was no attempt here to establish the Bible as "supernatural truth". Whether it is or isn't, everything in this post still applies. This is a blog post, not a scholarly work or a paper for a scientific journal.