Tag Archives: validation

Spatial-temporal models of protein structures

For a number of years I have been working on methods for validating computational models of structures [see ‘Model validation‘ on September 18th 2012] using the full potential of measurements made with modern techniques such as digital image correlation [see ‘256 shades of grey‘ on January 22nd 2014] and thermoelastic stress analysis [see ‘Counting photons to measure stress‘ on November 18th 2015].  Usually the focus of our interest is at the macroscale, for example the research on aircraft structures in the MOTIVATE project; however, in a new PhD project with colleagues at the National Tsing Hua University in Taiwan, we are planning to explore using our validation procedures and metrics [1] in structural biology.

The size and timescale of protein-structure thermal fluctuations are essential to the regulation of cellular functions. Measurement techniques such as x-ray crystallography and transmission electron cryomicroscopy (Cryo-EM) provide data on electron density distribution from which protein structures can be deduced using molecular dynamics models. Our aim is to develop our validation metrics to help identify, with a defined level of confidence, the most appropriate structural ensemble for a given set of electron densities. To make the problem more interesting and challenging, the structure observed by x-ray crystallography is an average or equilibrium state because a folded protein is constantly in motion, undergoing harmonic oscillations, each with a different frequency and amplitude [2].

The PhD project is part of the dual PhD programme of the University of Liverpool and National Tsing Hua University.  Funding is available in the form of a fee waiver and a contribution to living expenses for four years of study involving significant periods (preferably two years) at each university.  For more information follow this link.

References:

[1] Dvurecenska, K., Graham, S., Patelli, E. & Patterson, E.A., A probabilistic metric for the validation of computational models, Royal Society Open Science, 5:180687, 2018.

[2] Chan, J., Lin, H-R., Takemura, K., Chang, K-C., Chang, Y-Y., Joti, Y., Kitao, A. & Yang, L-W., An efficient timer and sizer of protein motions reveals the time-scales of functional dynamics in the ribosome, bioRxiv, 2018, https://www.biorxiv.org/content/early/2018/08/03/384511.

Image: A diffraction pattern and protein structure from http://xray.bmc.uu.se/xtal/

Establishing fidelity and credibility in tests & simulations (FACTS)

A month or so ago I gave a lecture entitled ‘Establishing FACTS (Fidelity And Credibility in Tests & Simulations)’ to the local branch of the Institution of Engineering and Technology (IET). Of course my title was a play on words because the Oxford English Dictionary defines a ‘fact’ as ‘a thing that is known or proved to be true’ or ‘information used as evidence or as part of a report’.  One of my current research interests is how we establish predictions from simulations as evidence that can be used reliably in decision-making.  This is important because simulations based on computational models have become ubiquitous in engineering for, amongst other things, design optimisation and evaluation of structural integrity.  These models need to possess the appropriate level of fidelity and to be credible in the eyes of decision-makers, not just their creators.  Model credibility is usually provided through validation processes using a small number of physical tests that must yield a large quantity of reliable and relevant data [see ‘Getting smarter‘ on June 21st, 2017].  Reliable and relevant data means making measurements with low levels of uncertainty under real-world conditions, which is usually challenging.

These topics recur through much of my research and have found applications in aerospace engineering, nuclear engineering and biology. My lecture to the IET gave an overview of these ideas using applications from each of these fields, some of which I have described in past posts.  So, I have now created a new page on this blog with a catalogue of these past posts on the theme of ‘FACTS‘.  Feel free to have a browse!

Spontaneously MOTIVATEd

Some posts arise spontaneously, stimulated by something that I have read or done, while others are part of a commitment to communicate on a topic related to my research or teaching, such as the CALE series.  The motivation for a post seems unrelated to its popularity.  This post is part of that commitment to communicate.

After 12 months, our EU-supported research project, MOTIVATE [see ‘Getting Smarter‘ on June 21st, 2017], is one-third complete in terms of time; and, as in all research, it appears to have made a slow start with much effort expended on conceptualizing, planning, reviewing prior research and discussions.  However, we are on schedule and have delivered on one of our four research tasks, with the result that we have a new validation metric and a new flowchart for the validation process.  The validation metric was revealed at the Photomechanics 2018 conference in Toulouse earlier this year [see ‘Massive Engineering‘ on April 4th, 2018].  The new flowchart [see the graphic] is the result of a brainstorming session [see ‘Brave New World‘ on January 10th, 2018] and much subsequent discussion; it will be presented at a conference in Brussels next month [ICEM 2018], at which we will invite feedback [proceedings paper].  The big change from the classical flowchart [see for example the ASME V&V guide] is the inclusion of historical data, with the possibility of not requiring experiments to provide data for validation purposes.  This is probably a paradigm shift for the engineering community, or at least the V&V [Verification & Validation] community.  So, we are expecting some robust feedback – feel free to comment on this blog!

References:

Hack E, Burguete RL, Dvurecenska K, Lampeas G, Patterson EA, Siebert T & Szigeti E, Steps toward industrial validation experiments, In Proceedings Int. Conf. Experimental Mechanics, Brussels, July 2018 [pdf here].

Dvurecenska K, Patelli E & Patterson EA, What’s the probability that a simulation agrees with your experiment? In Proceedings Photomechanics 2018, Toulouse, March 2018.

How many repeats do we need?

This is a question that both my undergraduate students and a group of taught post-graduates have struggled with this month.  In thermodynamics, my undergraduate students were estimating absolute zero in degrees Celsius using a simple manometer and a digital thermometer (this is an experiment from my MOOC: Energy – Thermodynamics in Everyday Life).  They needed to know how many times to repeat the experiment in order to determine whether their result was significantly different to the theoretical value: -273 degrees Celsius [see my post entitled ‘Arbitrary zero‘ on February 13th, 2013 and ‘Beyond zero‘ the following week]. Meanwhile, the post-graduate students were measuring the strain distribution in a metal plate with a central hole that was loaded in tension. They needed to know how many times to repeat the experiment to obtain meaningful results that would allow a decision to be made about the validity of their computer simulation of the experiment [see my post entitled ‘Getting smarter‘ on June 21st, 2017].

The simple answer is that six repeats are needed if you want 98% confidence in the conclusion and you are happy to accept that the margin of error and the standard deviation of your sample are equal.  The latter implies that error bars of the mean plus and minus one standard deviation are also 98% confidence limits, which is often convenient.  Not surprisingly, only a few undergraduate students figured that out and repeated their experiment six times; and the post-graduates pooled their data to give them a large enough sample size.

The justification for this answer lies in an equation that relates the number in a sample, n, to the margin of error, MOE, the standard deviation of the sample, σ, and the shape of the Normal distribution described by the z-score or z-statistic, z*: n ≥ (z*σ/MOE)².  The margin of error, MOE, is the maximum expected difference between the true value of a parameter and the sample estimate of the parameter, which is usually the mean of the sample, while the standard deviation, σ, describes the spread of the data values in the sample about the mean value of the sample, μ.  If we don’t know one of these quantities then we can simplify the equation by assuming that they are equal; and then n ≥ (z*)².
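As a quick check of the arithmetic, rearranging the margin-of-error relation gives the minimum sample size directly; taking z* ≈ 2.326 for 98% confidence (the standard two-sided value), the steps are sketched below.

```latex
MOE = \frac{z^{*}\sigma}{\sqrt{n}}
\quad\Rightarrow\quad
n \ge \left(\frac{z^{*}\sigma}{MOE}\right)^{2}
\quad\text{and, with } \sigma = MOE:\quad
n \ge (z^{*})^{2} = (2.326)^{2} \approx 5.4
\quad\Rightarrow\quad
n = 6
```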

The z-statistic is the number of standard deviations that a data value lies from the mean, i.e., its distance from the mean of a Normal distribution, as shown in the graphic [for more on the Normal distribution, see my post entitled ‘Uncertainty about Bayesian methods‘ on June 7th, 2017].  We can specify its value so that the interval defined by its positive and negative values contains 98% of the distribution.  The values of z* for 90%, 95%, 98% and 99% are shown in the table in the graphic with corresponding values of (z*)², which are equivalent to minimum values of the sample size, n (the number of repeats).
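The table in the graphic can be reproduced in a few lines of code; here is a minimal sketch in Python, assuming scipy is available for the inverse of the Normal distribution.

```python
# Minimal sketch: z* and the implied minimum number of repeats, n >= (z*)^2,
# for several two-sided confidence levels (assumes scipy is installed).
from math import ceil
from scipy.stats import norm

for confidence in (0.90, 0.95, 0.98, 0.99):
    # two-sided interval: z* leaves (1 - confidence)/2 in each tail
    z_star = norm.ppf(1 - (1 - confidence) / 2)
    n_min = ceil(z_star ** 2)  # minimum sample size when MOE equals sigma
    print(f"{confidence:.0%}: z* = {z_star:.3f}, (z*)^2 = {z_star**2:.2f}, n >= {n_min}")
```

For 98% confidence this gives z* ≈ 2.326 and (z*)² ≈ 5.4, hence six repeats.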

Confidence limits are defined as μ ± z*σ/√n; but when n = (z*)², this simplifies to μ ± σ.  So, with a sample size of six (n = 6 for 98% confidence) we can state with 98% confidence that there is no significant difference between our mean estimate and the theoretical value of absolute zero when that difference is less than the standard deviation of our six estimates.
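To illustrate how that check might look for the thermodynamics experiment, here is a short sketch with made-up numbers; the six estimates are hypothetical and purely for illustration.

```python
# Hypothetical illustration: six made-up student estimates of absolute zero (deg C).
# With n = 6 repeats, the 98% confidence limits reduce to mean +/- one standard
# deviation, so the difference from the theoretical value is compared to sigma.
import statistics

estimates = [-268.0, -275.5, -270.2, -277.1, -272.4, -269.8]  # hypothetical data
mean = statistics.mean(estimates)
sigma = statistics.stdev(estimates)  # sample standard deviation
theoretical = -273.0

difference = abs(mean - theoretical)
print(f"mean = {mean:.1f} C, sigma = {sigma:.1f} C, difference = {difference:.1f} C")
if difference < sigma:
    print("No significant difference from the theoretical value at 98% confidence.")
else:
    print("Significant difference from the theoretical value at 98% confidence.")
```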

BTW –  the apparatus for the thermodynamics experiments costs less than £10.  The instruction sheet is available here – it is not quite an Everyday Engineering Example but the experiment is designed to be performed in your kitchen rather than a laboratory.