The Cardinal Sin of Data Mining and Data Science: Overfitting

See on Scoop.it: What is Data Science

Carla Gentry CSPO’s insight:

Overfitting is not the same as another major data science mistake, "confusing correlation and causation". The difference is that overfitting finds something where there is nothing. In the case of "correlation and causation", researchers can find a genuine, novel correlation and only discover its cause much later (see a great example from astronomy in the Kirk D. Borne interview on Big Data in Astrophysics and Correlation vs. Causality).
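To make "finding something where there is nothing" concrete, here is a minimal Python sketch. It is not from the original post. The function names (`pearson`, `best_noise_feature`), the sample sizes, and the seed are all illustrative assumptions. It generates a target and many candidate features from pure random noise, picks the feature that correlates best with the target on a small training sample, and then checks that same feature on held-out rows.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_noise_feature(n_train=20, n_test=200, n_features=2000, seed=0):
    """Pick the noise feature that best 'explains' a noise target.

    Hypothetical helper for illustration: every value is independent
    Gaussian noise, so no feature has any real relationship to the target.
    """
    rng = random.Random(seed)
    n = n_train + n_test
    target = [rng.gauss(0, 1) for _ in range(n)]
    features = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n_features)]

    # "Mine" the training rows: keep whichever feature correlates best.
    train_r = [pearson(f[:n_train], target[:n_train]) for f in features]
    best = max(range(n_features), key=lambda j: abs(train_r[j]))

    # Re-check the winning feature on rows it was not selected on.
    test_r = pearson(features[best][n_train:], target[n_train:])
    return train_r[best], test_r


train_r, test_r = best_noise_feature()
print(f"training r = {train_r:+.2f}, held-out r = {test_r:+.2f}")
```

With these settings the selected feature should show a sizeable correlation on the training rows and a correlation near zero on the held-out rows, because the "pattern" was only ever an artifact of searching through many candidates with few samples. A genuine correlation, by contrast, would be expected to persist on new data even if its cause were still unknown.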
