How many apples turn you into a biking superstar?
On October 27, 2017 by neurotravels
If you imagine a scientist as someone standing in the lab, doing an experiment and getting all excited when the blue solution in her glass cylinder magically turns green – that is roughly how it works, but it is not the whole picture. 🙂
Data, Data, Data
In this world of new techniques, growing knowledge about tiny genetic details, and interest in relating animal behavior to human behavior and vice versa, what a scientist does, more than ever, is produce an immense amount of data. Whether it is numbers, a pattern or an image – a scientist will always have to somehow process and analyse the experiment, present it visually in some type of diagram and, very importantly, perform statistics on it.
Statistics – boring? Bear with me! 😉 Let’s make it fun and easy:
What you do with statistics is basically take a mathematical model, which comes with its own assumptions and rules – and check how well your own data fit into this model! Let’s say, for example, you want to compare two groups: (1) your data for female participants and (2) your data for male participants. To do that, you will need to take a model that is appropriate for comparing two groups that meet the model’s criteria and throw your data at it. Then, the model will do its funny calculations and in the end will tell you either:
“Oh well yeah wow – your male vs. female group data differ sooo much and sooo consistently, I am confident to say that this is very, very, (very, very) unlikely to be due to chance – there must be something cool going on, there must be a real difference there!”
“Nope, not confident that there is a difference there! From these data, I cannot tell males and females apart!”
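For the curious, here is a minimal sketch of what such a "model" could look like in Python. It uses a permutation test, which is just one of many possible choices and not necessarily the one you would pick for real data. The scores and the names `permutation_test`, `female` and `male` are made up for illustration. The idea: shuffle the group labels many times and count how often chance alone produces a difference at least as big as the one you observed.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Share of random relabellings whose mean difference is at least
    as large as the observed one (a rough two-sided p-value)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels were random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return hits / n_shuffles

# Made-up scores, purely for illustration
female = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
male = [13.0, 13.4, 12.9, 13.2, 13.5, 13.1]

p_value = permutation_test(female, male)
print(p_value)  # a small value means "unlikely to be due to chance"
```

A small `p_value` corresponds to the first answer above, a large one to the second. Note that even a tiny value is still a likelihood, not a proof.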
What else is important to know about statistics?
1) it is always a likelihood (never a certainty!), and a scientist will very, very likely never say that something about the data is a set-in-stone, proven certainty
2) with the right software nowadays, the mathematical part of statistics is relatively easy to perform (even without knowing what you are doing), but there is still a choice to be made about which model is best to use for YOUR data! It isn’t always easy to pick the model that suits your data and then to interpret the results in an optimal way.
There would be a lot more to say about this – but what I want to focus on today is a saying that you may have heard already, and that involves statistics:
correlation is NOT causation.
It is, in my eyes, one of the most important things to know, and to be critical about, when you hear big science news. What does that saying mean?
Let’s take an example that I proudly made up myself 😀
Imagine that, one happy Sunday, you decide to start biking to work from tomorrow on (you should – it has amazing benefits! 😉). The next day you get up, eat an apple for breakfast and ride to work – it takes you 25 minutes to get there. Woohoo – you like this, and you want to continue doing it. Every day an apple and a bike ride.
Next Monday, you get up and are super hungry, you eat two apples, ride to work and you realize that it only took you 20 minutes this time! How cool. The week continues like that – 2 apples and a 20-minute bike ride. The Monday after that, you cannot help yourself! These apples are so delicious that you start to eat 3 of them in a row – and as you arrive at work you cannot believe it: it only took you 15 minutes to get here! O M G! You realize that your apple intake and your bike performance correlate perfectly – the more you eat, the less time the ride takes (= a negative, linear correlation between apples and ride time). But does that necessarily mean that A (apples) is also causative of B (bike faster), that A is the reason for B?
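If you want to check the "perfect correlation" yourself, here is a small sketch that computes the Pearson correlation coefficient for the three weeks of the story. The function name `pearson_r` and the variable names are my own choice for illustration, and I use one value per week instead of one per day to keep it short.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

apples_per_day = [1, 2, 3]   # week 1, week 2, week 3
ride_minutes = [25, 20, 15]  # ride time in the same weeks

r = pearson_r(apples_per_day, ride_minutes)
print(r)  # -1.0, a perfect negative linear correlation
```

The number only says that the two things move together. It says nothing about why.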
Big, fat NO! You cannot make this statement from a correlation.
To investigate the causation further, you could, for example, eat the apples after the bike ride instead of before it and see how long the ride takes, or test whether the same happens with bananas for breakfast. You might just have gotten physically stronger over the last 3 weeks of riding your bike, which then made you faster (and also hungrier 😉 )!
Thus, always pay attention when you hear that X is because of Z, or that X is making Z do something. It is never easy to pin down a single, definite cause, especially when we are talking about complicated biological bodies and systems.
But our brain, again, is a bit foolish and plays tricks on us, like we saw in Your foolish brain: understanding prejudgement. In this case, we see a form of confirmation bias: we are evolutionarily predisposed to see patterns and to confirm the patterns that we have in mind, before we even see the outcome. We want to see, and we sometimes only do see, what we expected to see. But that topic is worth another post! 😉
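To make the "you just got stronger" explanation concrete, here is a toy simulation. It is a sketch under invented assumptions, not a model of real cyclists: a hidden `fitness` value makes you both hungrier and faster, while the apples themselves have no effect on the ride time at all. All numbers and names (`simulate`, `fitness`, `randomize_apples`) are made up for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate(n_days=2000, randomize_apples=False, seed=1):
    """Correlation between apples eaten and ride time in a toy world
    where only fitness (never apples) changes the ride time."""
    rng = random.Random(seed)
    apples, minutes = [], []
    for _ in range(n_days):
        fitness = rng.uniform(0, 1)  # hidden third factor
        if randomize_apples:
            # Experiment: a coin-like draw decides breakfast, not hunger
            eaten = rng.choice([1, 2, 3])
        else:
            # Observation: fitter riders are hungrier and eat more
            eaten = 1 + round(2 * fitness)
        # Ride time depends on fitness plus noise; apples do not appear
        ride = 25 - 10 * fitness + rng.gauss(0, 1)
        apples.append(eaten)
        minutes.append(ride)
    return pearson_r(apples, minutes)

observed = simulate(randomize_apples=False)
experiment = simulate(randomize_apples=True)
print(observed)    # strongly negative, although apples do nothing here
print(experiment)  # close to zero once breakfast is assigned at random
```

In this toy world the correlation is strong when hunger decides how many apples you eat, and it vanishes when the number of apples is assigned at random. That is the logic behind changing one thing on purpose (apples after the ride, bananas instead of apples) rather than only watching what happens.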