Publication Bias: Why P values are hopeless (again)

Kühberger A, Fritz A, Scherndl T. Publication Bias in Psychology: A Diagnosis Based on the Correlation between Effect Size and Sample Size. PLoS One. 2014 Sep 5;9(9):e105825. doi: 10.1371/journal.pone.0105825. eCollection 2014.

The negative correlation between effect size and sample size (left) and the skewed distribution of p values (right) indicate pervasive publication bias across the entire field of psychology. With small samples, only large effects get published, because small effects are generally non-significant even though they could be substantial in magnitude. Likewise, people publish mostly significant findings and ignore the studies showing a lack of effect, or an unclear effect because of insufficient power.
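The mechanism described above can be reproduced with a small simulation. This is only a sketch under stated assumptions, not the analysis used by Kühberger et al.: the true effect (d = 0.2), the list of sample sizes, the approximate standard error sqrt(2/n) and the function names are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_published(n_studies=2000, true_d=0.2, seed=1):
    """Simulate two-group studies and keep only the 'significant' ones.

    Assumptions (illustrative only): one small true effect, equal group
    sizes, and an approximate standard error of d equal to sqrt(2 / n).
    """
    rng = random.Random(seed)
    published = []
    for _ in range(n_studies):
        n = rng.choice([10, 20, 40, 80, 160, 320])  # subjects per group
        se = math.sqrt(2 / n)
        d = rng.gauss(true_d, se)  # observed standardized effect
        if abs(d) / se > 1.96:  # roughly p < 0.05, two-sided
            published.append((n, abs(d)))
    return published


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


published = simulate_published()
sizes = [n for n, _ in published]
effects = [d for _, d in published]
r = pearson(sizes, effects)
print(f"{len(published)} of 2000 simulated studies 'published', r = {r:.2f}")
```

Although every simulated study shares the same true effect, the significance filter alone makes the published effect sizes shrink as sample size grows, which is the pattern the paper uses as a diagnostic.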

On the importance of the smallest worthwhile change determination

This paper shows how the choice of the smallest worthwhile change (SWC) affects the interpretation of the magnitude of changes in monitoring variables.


Buchheit M, Rabbani A and Taghi Beigi H. Predicting changes in high-intensity intermittent running performance with acute responses to short jump rope workouts in children. J Sport Sci & Med, 2014, In Press.

The aims of the present study were to 1) examine whether individual HR and RPE responses to a jump rope workout could be used to predict changes in high-intensity intermittent running performance in young athletes, and 2) examine the effect of using different methods to determine a smallest worthwhile change (SWC) on the interpretation of group-average and individual changes in the variables. Before and after an 8-week high-intensity training program, 13 child athletes (10.6±0.9 yr) performed a high-intensity running test (30-15 Intermittent Fitness Test, VIFT) and three jump rope workouts, where HR and RPE were collected. The SWC was defined as either 1/5th of the between-subjects standard deviation or the variable typical error (CV). After training, the large ≈9% improvement in VIFT was very likely, irrespective of the SWC. Standardized changes were greater for RPE (very likely-to-almost certain, ~30-60% changes, ~4-16 times > SWC) than for HR (likely-to-very likely, ~2-6% changes, ~1-6 times > SWC) responses. Using the CV as the SWC led to the smallest and greatest changes for HR and RPE, respectively. The predictive value for individual performance changes tended to be better for HR (74-92%) than RPE (69%), and greater when using the CV as the SWC. The predictive value for no-performance change was low for both measures (<26%). Substantial decreases in HR and RPE responses to short jump rope workouts can predict substantial improvements in high-intensity running performance at the individual level. Using the CV of test measures as the SWC might be the better option.

Key words: submaximal heart rate; rate of perceived exertion; OMNI scale; 30-15 Intermittent Fitness Test; progressive statistics.
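To make the two SWC definitions in the abstract concrete, here is a minimal sketch. The heart-rate values, the 3% CV and the function names are hypothetical and are not taken from the paper; the only thing borrowed from it is the idea of comparing one and the same change against an SWC set at 1/5th of the between-subjects SD or at the typical error (CV).

```python
import statistics


def swc_from_between_subject_sd(baseline, fraction=0.2):
    """SWC as a fraction (default 1/5th) of the between-subjects SD."""
    return fraction * statistics.stdev(baseline)


def swc_from_cv(baseline, cv_percent):
    """SWC set to the typical error, given as a CV in % of the group mean."""
    return cv_percent / 100 * statistics.mean(baseline)


def rate_change(change, swc):
    """Label a change as substantial only if it exceeds the SWC."""
    if change > swc:
        return "substantial increase"
    if change < -swc:
        return "substantial decrease"
    return "trivial"


# Hypothetical exercise heart rates (bpm) before training
baseline_hr = [170.0, 175.0, 180.0, 185.0, 190.0]
swc_sd = swc_from_between_subject_sd(baseline_hr)
swc_cv = swc_from_cv(baseline_hr, cv_percent=3.0)  # 3% CV is an assumption

change = -4.0  # hypothetical individual change after training (bpm)
for label, swc in (("0.2 x SD", swc_sd), ("CV", swc_cv)):
    print(f"SWC ({label}) = {swc:.1f} bpm: {rate_change(change, swc)}, "
          f"{abs(change) / swc:.1f} x SWC")
```

With these made-up numbers, the same 4 bpm drop is substantial against the SD-based SWC but trivial against the CV-based one, which is why the choice of SWC matters for individual interpretation.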

“Statistics are our weapons” Nick Broad, 1974–2013.

“Nick Broad was an English football nutritionist and sport scientist who worked for some of the biggest football clubs including Blackburn Rovers, Birmingham City, Chelsea Football Club and Paris St-Germain. Broad was a close friend of former Chelsea manager, Carlo Ancelotti. He graduated from Aberdeen University. Aged 38, he died on 19 January, 2013 of an accidental traffic collision” (Wikipedia). Nick was an example for a lot of sport scientists working in high performance, including myself. Among many other things, I will miss our discussions on the monitoring process. Section 3 of this paper is dedicated to him.


Percentages are not enough to assess the magnitude of an effect

Following one of our publications last year, the data from this recent study are perfect for showing how important standardization is. In the example below, the symbols represent the % differences in several variables between 2 groups of soccer players (more vs. less mature). The numbers above each symbol refer to the standardized differences (effect size), and the stars to the chances for the differences to be substantial (i.e., clearly greater than the practically important effect, the smallest worthwhile change). Read more here about these key concepts.

What stands out is the lack of association between the % differences and their actual magnitude (i.e., standardized difference). This is particularly evident when comparing the differences in high-speed activities (HSA, >45%, stdz diff +0.7) and height (5%, but stdz diff +1.6). In accordance with the standardized differences (and not the %), the chances for the difference to be substantial are greater for height than for HSA! Conclusion: don’t trust percentages. Report magnitudes and the chances for the differences to be real.

More vs less mature
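The gap between a % difference and a standardized difference is easy to reproduce. The sketch below uses made-up numbers, not the data from the study, and standardizes with the pooled between-subject SD, which is an assumption about the method rather than something stated above.

```python
import math
import statistics


def percent_difference(group_a, group_b):
    """Difference of group_b relative to group_a, in %."""
    mean_a = statistics.mean(group_a)
    return 100 * (statistics.mean(group_b) - mean_a) / mean_a


def standardized_difference(group_a, group_b):
    """Mean difference divided by the pooled SD (Cohen's d)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    diff = statistics.mean(group_b) - statistics.mean(group_a)
    return diff / math.sqrt(pooled_var)


# Hypothetical data: less mature vs. more mature players
height_less = [160, 162, 164, 166, 168]   # cm, small spread between players
height_more = [168, 170, 172, 174, 176]
hsa_less = [200, 400, 600, 800, 1000]     # m at high speed, large spread
hsa_more = [470, 670, 870, 1070, 1270]

for name, less, more in (("Height", height_less, height_more),
                         ("HSA", hsa_less, hsa_more)):
    print(f"{name}: {percent_difference(less, more):+.1f}%, "
          f"stdz diff {standardized_difference(less, more):+.2f}")
```

In this toy data set the variable with the small % difference (height) has by far the larger standardized difference, simply because players differ much less from each other in height than in high-speed running.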