Tag Archives: probability density function

Uncertainty about Bayesian methods

I have written before about why people find thermodynamics so hard [see my post entitled ‘Why is thermodynamics so hard?’ on February 11th, 2015] so I think it is time to mention another subject that causes difficulty: statistics.  I am worried that just mentioning the word ‘statistics’ will cause people to stop reading, such is its reputation.  Statistics is used to describe phenomena that do not have single values, like the height or weight of my readers.  I would expect the weights of my readers to follow a normal distribution, that is, they form a bell-shaped graph when the number of readers at each value of weight is plotted as a vertical bar against a horizontal axis representing weight.  In other words, plotting weight along the x-axis and frequency on the y-axis, as in the diagram.

The normal distribution has dominated statistical practice and theory since its equation was first published by De Moivre in 1733.  The mean or average value corresponds to the peak in the bell-shaped curve and the standard deviation describes the shape of the bell, basically how fat the bell is.  That’s why we learn to calculate the mean and standard deviation in elementary statistics classes, although often no one tells us this or we quickly forget it.
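For the curious, here is a minimal Python sketch of the two quantities mentioned above – the mean locates the peak of the bell and the standard deviation sets how fat it is.  The sample weights are invented purely for illustration:

```python
import math

def normal_pdf(x, mean, sd):
    """Height of the bell-shaped curve at x (the normal probability density)."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

weights = [62.0, 68.5, 70.2, 71.0, 74.8, 69.3, 66.1]  # made-up sample weights in kg

mean = sum(weights) / len(weights)
sd = math.sqrt(sum((w - mean) ** 2 for w in weights) / len(weights))

print(f"mean = {mean:.1f} kg, standard deviation = {sd:.1f} kg")
print(f"height of the curve at its peak = {normal_pdf(mean, mean, sd):.3f}")
```

Note that the curve is always highest at the mean – move one standard deviation away and the height drops, which is the ‘bell’ shape in numbers.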

If all of you told me your weight then I could plot the frequency distribution described above.  And, if I divided the y-axis values, the frequencies, by the total number of readers who sent me weight information, then the graph would become a probability density distribution [see my post entitled ‘Wind power‘ on August 7th, 2013].  It would tell me how probable it is that the reader I met last week had a weight close to 70.2kg – the higher the bell-shaped curve at 70.2kg, the more likely a weight near that value.  The most likely weight would correspond to the peak value in the curve.
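The step from frequencies to probabilities is just one division, which a few lines of Python can demonstrate.  The bin labels and counts below are invented for illustration:

```python
# made-up frequency counts of readers in 5 kg weight bins
counts = {"60-65": 3, "65-70": 9, "70-75": 12, "75-80": 5, "80-85": 1}
total = sum(counts.values())

# dividing each frequency by the total turns the bar chart into probabilities
probabilities = {label: n / total for label, n in counts.items()}
most_likely = max(probabilities, key=probabilities.get)

print(probabilities)
print("most likely bin:", most_likely)
```

The probabilities necessarily add up to one, and the tallest bar before the division is still the tallest after it – which is why the peak of the curve marks the most likely weight.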

However, I don’t need any of you to send me your weights to be reasonably confident that the weight of the reader I talked to last week was 70.2kg!  I cannot be certain about it but the probability is high.  The reader was female and lived in the UK, and according to the Office for National Statistics (ONS) the average weight of women in the UK is 70.2kg – so it is reasonable to assume that the peak in the bell-shaped curve for my female UK readers will coincide with that of the national distribution, which makes 70.2kg the most probable weight of the reader I met last week.

However, guessing the weight of a reader becomes more difficult if I don’t know where they live or I can’t access national statistics.  The Reverend Thomas Bayes (1701-1761) comes to the rescue with the rule named after him.  In Bayesian statistics, I can assume that the probability density distribution of readers’ weights is the same as for the UK population and, when I receive some information about your weights, I can update this probability distribution to better describe the true distribution.  I can update as often as I like and use information about the quality of the new data to control its influence on the updated distribution.  If you have got this far then we have both done well; and I am not going to lose you now by expressing Bayes’ law in terms of probability, or by talking about prior (that’s my initial guess using national statistics) or posterior (that’s the updated one) distributions, because I think the opaque language is one of the reasons that the use of Bayesian statistics has not become widespread.
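For readers who would like to see the updating in action, here is a small Python sketch of one standard textbook case: a bell-shaped initial guess combined with a few bell-shaped measurements (the three reader weights and the spreads are invented for illustration):

```python
def update_normal(prior_mean, prior_sd, data, data_sd):
    """Combine a normal initial guess with normally distributed measurements
    (of known spread) to give the updated mean and spread."""
    prior_prec = 1 / prior_sd ** 2          # precision = 1 / variance
    data_prec = len(data) / data_sd ** 2
    post_prec = prior_prec + data_prec      # precisions simply add
    data_mean = sum(data) / len(data)
    post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
    return post_mean, (1 / post_prec) ** 0.5

# start from the national figure of 70.2 kg, then fold in three made-up reader weights
mean, sd = update_normal(70.2, 10.0, [68.0, 72.5, 71.0], 5.0)
print(f"updated guess: {mean:.1f} kg, spread {sd:.1f} kg")
```

Notice two things: the updated mean sits between my initial guess and the average of the new data, and the spread shrinks every time data arrives – each update makes me a little more confident.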

By the way, I can never be certain about your weight; even if you tell me directly, because I don’t know whether your scales are accurate and whether you are telling the truth!  But that’s a whole different issue!

Benford’s law

We need to learn to think big.  Humans have had a tendency to underestimate the scale of everything that exists.  We have progressed at an increasing rate from believing the earth was the centre of existence, to understanding that our planet orbits the sun together with a group of other planets, to appreciating that our sun is a tiny speck in a galaxy that we call the Milky Way, which is part of a universe and possibly a multiverse.  We have been able to spot mathematical patterns in nature and to describe them using the equations of physics, which in turn allow us to predict the existence of phenomena before we have observed them, such as the Higgs boson, and also allow us to harness nature to provide goods and services to society.  The former is the role of physicists and the latter of engineers.  So there is a close link between physicists and engineers and it is not unusual to find engineers working in physics labs and physicists working in engineering organisations.  Frank Benford was a physicist working at General Electric in 1938 when he proposed the law that bears his name, though it has also been credited to Simon Newcomb, an astronomer working 50 years earlier.

Benford’s law predicts the frequency with which the numbers from 1 to 9 will appear as the first digit in a collection of numbers from a real-life source.  The frequency declines logarithmically from 30.1% for 1, through 17.6% for 2 and 12.5% for 3, down to 4.6% for 9.  It is a probability distribution, so you should not expect to see it in every collection of numbers; but when it does not appear you should be suspicious about the provenance of the data, particularly when it fails to appear repeatedly.  It is used routinely by accountants and is being used increasingly to identify potential scientific fraud.  Of course some people think big and know about Benford’s law, for instance the fraudster Bernard Madoff filed Benford-compatible monthly returns, which perhaps is one reason why it took so long to catch him.

BTW – Benford’s law does not work for reciprocals or square roots, but it does for powers of 2, factorials and the Fibonacci sequence.
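The law itself is a one-line formula – the predicted frequency of leading digit d is log₁₀(1 + 1/d) – and you can check the claim about powers of 2 in a few lines of Python:

```python
import math
from collections import Counter

def benford_expected(d):
    """Benford's predicted frequency for leading digit d (1 to 9)."""
    return math.log10(1 + 1 / d)

def leading_digit(n):
    return int(str(n)[0])

# count the leading digits of the first 1000 powers of 2
digits = Counter(leading_digit(2 ** k) for k in range(1, 1001))
for d in range(1, 10):
    print(f"{d}: observed {digits[d] / 1000:.3f}, Benford predicts {benford_expected(d):.3f}")
```

The observed proportions track the predicted ones closely – about 30% of the powers of 2 really do start with a 1 – whereas running the same count on, say, uniformly random numbers would not.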



Big Bang to Little Swoosh by Max Tegmark, New York Times, April 11th, 2014.

Look out for No.1. by Tim Harford in the Financial Times, September 9th, 2011.




Wind power

Winds are generated by uneven heating of the earth’s atmosphere by the sun, which causes hotter, less dense air to rise and colder, more dense air to be pulled in to replace it.  Of course, land masses, water evaporation over oceans, and the rotation of the earth, amongst other things, add to the complexity of weather systems.  However, essentially weather systems are driven by natural convection, a form of heat or energy transfer, as I hinted in my recent post entitled ‘On the beach’ [24th July, 2013].

If you are thinking of building a wind turbine to extract some of the energy present in the wind, then you would be well-advised to conduct some surveys of the site to assess the potential power output.  The power output of a wind turbine [P] can be defined as half the product of the air density [d], the area swept by the blades [A] and the cube of the wind velocity [v].  So the wind velocity dominates this relationship [P = ½dAv³] and it is important that a site survey assesses the wind velocity.  But the wind velocity is constantly changing, so how can this be done meaningfully?
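The cube makes an enormous difference, which a quick Python calculation brings home.  The numbers below – sea-level air density and a 40 m blade radius – are my own illustrative values, not from any particular turbine:

```python
import math

def wind_power(air_density, swept_area, wind_speed):
    """P = ½ d A v³ — power in watts for density in kg/m³, area in m², speed in m/s."""
    return 0.5 * air_density * swept_area * wind_speed ** 3

# illustrative values: sea-level air (1.225 kg/m³) and 40 m long blades
area = math.pi * 40 ** 2
print(f"{wind_power(1.225, area, 8.0) / 1e6:.2f} MW at 8 m/s")
print(f"{wind_power(1.225, area, 16.0) / 1e6:.2f} MW at 16 m/s")  # doubling v gives 2³ = 8 times the power
```

Doubling the wind speed multiplies the power by eight, which is why the survey must capture the wind speed and not just, say, the average temperature difference driving it.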

Engineers might tackle this problem by measuring the wind speed for ten-minute intervals, or some other relatively short time period, and calculating the average speed for the period.  This process would be repeated over a long period of time, perhaps weeks or months, and the results plotted as a frequency distribution, i.e. the results would be assigned to ‘bins’ labelled, for instance, 0.0 to 1.9 m/s, 2.0 to 3.9 m/s, 4.0 to 5.9 m/s etc. and then the number of results in each bin plotted to create a bar chart.  The number of results in a bin divided by the total number of results provides the probability that a measurement taken at any random moment would yield a wind speed that would be assigned to that bin.  Consequently, the mathematical function used to describe such a bar chart is called a probability density function.  Now, returning to the original relationship, P = ½dAv³, and using the probability density function instead of the wind velocity yields a power density function that can be used to predict the annual output of the turbine, taking account of the constantly changing wind velocity.
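The binning and weighting process described above can be sketched in a few lines of Python.  The ten-minute averages are invented for illustration, and I have used the bin midpoints when cubing the speeds:

```python
from collections import Counter

# made-up ten-minute average wind speeds in m/s from a site survey
speeds = [3.1, 5.2, 7.8, 4.4, 6.0, 9.1, 5.5, 2.9, 6.7, 8.3, 4.8, 7.2]

bin_width = 2.0
counts = Counter(int(v // bin_width) for v in speeds)
total = len(speeds)

# probability of landing in each bin, and the average of v³ over the bins;
# multiplying that average by ½dA estimates the mean power output
expected_v3 = 0.0
for b in sorted(counts):
    low = b * bin_width
    p = counts[b] / total
    print(f"{low:.1f}-{low + bin_width:.1f} m/s: probability {p:.2f}")
    expected_v3 += p * (low + bin_width / 2) ** 3

print(f"average of v³ ≈ {expected_v3:.0f} m³/s³ (multiply by ½dA for mean power)")
```

Note that cubing the average speed would underestimate the output – because of the cube, the windy intervals contribute far more power than the calm ones take away, which is exactly why the full distribution is needed rather than a single average.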

If you struggled with my very short explanation of probability density functions, then you might try the Khan Academy video on the topic, found on YouTube at http://www.youtube.com/watch?v=Fvi9A_tEmXQ

Engineers use probability density functions to process information about lots of random or stochastic events, such as the forces of ocean waves interacting with ships and oil-rigs, flutter in aircraft wings, the forces experienced by a car as its wheels bounce along a road, or the motion of an artificial heart valve.  These are all activities for which the underlying mechanics are understood but there is an element of randomness in their behaviour with respect to time, which means we cannot predict precisely what will be happening at any instant; and yet engineers are expected to achieve reliable performance in designs that will encounter stochastic events.  Frequency distributions and probability density functions are one popular approach used by engineers.  Traditionally, engineers have studied applied mathematics, which was equated to mechanics in high school, but increasingly they need to understand statistics.