TY - JOUR
T1 - Unfolding the phenomenon of inter-rater agreement
T2 - a multicomponent approach for in-depth examination was proposed
AU - Slaug, Bjørn
AU - Schilling, Oliver
AU - Helle, Tina
AU - Iwarsson, Susanne
AU - Carlsson, Gunilla
AU - Brandt, Åse
PY - 2012/6/27
Y1 - 2012/6/27
AB - Objective: The overall objective was to unfold the phenomenon of inter-rater agreement: to identify potential sources of variation in agreement data and to explore how they can be statistically accounted for. The ultimate aim was to propose recommendations for in-depth examination of agreement, in order to improve the reliability of assessment instruments. Study Design and Setting: Utilizing a sample in which 10 rater pairs had assessed the presence/absence of 188 environmental barriers using a systematic rating form, a raters × items dataset was generated (N = 1,880). In addition to common agreement indices, relative shares of agreement variation were calculated. Multilevel regression analysis was carried out, using rater and item characteristics as predictors of agreement variation. Results: The raters accounted for 6-11% of the agreement variation, the items for 33-39%, and the contexts for 53-60%. Multilevel regression analysis showed that barrier prevalence and raters’ familiarity with standardized instruments had the strongest impact on agreement, although contextual characteristics were not included for study design reasons. Conclusion: Supported by a conceptual analysis, we propose an approach for in-depth examination of agreement variation as a strategy for increasing the level of inter-rater agreement. By identifying and limiting the most important sources of disagreement, instrument reliability can ultimately be improved.
KW - health, nutrition and quality of life
KW - agreement
KW - inter-rater
KW - kappa
KW - methodology
KW - recommendations
KW - reliability
DO - 10.1016/j.jclinepi.2012.02.016
M3 - Journal article
SN - 0895-4356
VL - 65
SP - 1016
EP - 1025
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
IS - 9
ER -