Many of us who have written a scientific paper consider referencing to be a dull and laborious task. However, it is a vital part of the scientific process. Incorrect referencing could facilitate the spread of misleading and inaccurate information. This is particularly important when looking at mental health policy documents, as they aim to be high impact and to be widely implemented.
The Evidence Transparency Framework, published by Sense about Science in 2016, addresses the importance of referencing quality in policy but doesn’t offer an in-depth assessment of the accessibility and accuracy of cited evidence. Research in medical science, however, has examined this in more detail.
Within these studies of medical research articles, accuracy refers to whether a statement truly reflects the content of its referenced source. Accuracy errors are classified as either minor or major. A minor error is when the author has overgeneralised or slightly misrepresented the source material but, ultimately, the statement remains reasonably congruent with the content of the source. A major error is when the statement has no relation to, or contradicts, the source material. Clearly, accuracy errors are at odds with the aims of the scientific process.
Accessibility errors concern how easily the evidence behind a statement can be reached from its reference. For example, indirect referencing occurs when a statement discusses the results of a paper and references a secondary review instead of the primary data. This can lead the reader down a referencing rabbit hole. Repeated indirect referencing of a source could dilute or distort the original message.
The current study by Hui et al. (2019) attempts to develop a framework by which the prevalence of these errors can be assessed in mental health policy documents – an area which had previously been neglected when assessing referencing quality.
The study begins with a pilot stage where the authors used a framework based on Mogull (2017), in which referencing errors in medical research articles were grouped into categories to describe their overall prevalence. Two mental health policy documents were acquired through a web search of the UK government website and only documents with 10 or more references to scientific articles were included.
Two independent reviewers then selected statements in each document that relied on empirical evidence from the source they cited. These were termed ‘factual’ statements. The Mogull (2017) framework was then used to classify these statements as containing minor or major accuracy errors, and their references as direct, indirect or inaccessible.
After the pilot phase, the authors found the Mogull (2017) framework to be a feasible tool to use on mental health policy documents. However, they made two amendments for the purposes of their study. These were:
- To include the extra classification of ‘dead-end’ referencing whereby the reference directs the reader to a source that lacks evidence to support the original statement.
- To broaden the definition of ‘indirect’ referencing to encompass evidence sources such as governmental surveys, statistical reports and independent research reports – not just scientific articles as specified in the original framework.
The adapted framework was applied to a total of ten policy documents, including the two from the pilot.
Within each of the ten documents, 22 to 25 statements were selected (n=236 in total).
Of the total 236 referenced ‘factual’ statements:
- 141 (59.7%) contained no accuracy errors
- 45 (19.1%) contained major errors
- 50 (21.2%) contained minor errors
Minor accuracy errors
- 21 statements overgeneralised from the source
- 17 contained reporting errors of quantitative results of the source
Major accuracy errors
- 35 statements were unsubstantiated by the source
Accessibility of references
- 126 (53.4%) contained direct references that successfully supported the statement
- 36 (15.3%) contained indirect references
- 18 (7.6%) led to ‘dead-end’ references
- 11 (4.7%) were completely inaccessible
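The percentages reported above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the counts are taken from this summary, while the dictionary and function names are made up for the example and do not come from the study.

```python
# Counts as reported in the summary above (out of 236 'factual' statements).
# The variable and function names are illustrative, not taken from the study.
TOTAL_STATEMENTS = 236

accuracy_counts = {
    "no accuracy error": 141,
    "major error": 45,
    "minor error": 50,
}

accessibility_counts = {
    "direct reference": 126,
    "indirect reference": 36,
    "dead-end reference": 18,
    "inaccessible": 11,
}


def as_percentages(counts, total):
    """Express each count as a percentage of total, rounded to one decimal place."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


accuracy_pct = as_percentages(accuracy_counts, TOTAL_STATEMENTS)
accessibility_pct = as_percentages(accessibility_counts, TOTAL_STATEMENTS)

for label, pct in {**accuracy_pct, **accessibility_pct}.items():
    print(f"{label}: {pct}%")
```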
The results from this study suggest that a sizeable number of statements in the assessed documents were erroneous or uncited. Compared to Mogull’s (2017) original assessment of medical research documents, the standard for mental health policy documents appears to be lower. However, this should be interpreted with caution due to the differences in content and methodologies between the two studies.
The authors ultimately concluded that the accuracy and accessibility of references in mental health policy documents require more attention. If policy documents are to inform services at a national and local level, then they must prioritise being accurate and informed.
Strengths and limitations
The study presents an innovative and useful framework, being the first of its kind to evaluate the accuracy and accessibility of cited evidence in mental health policy documents. By conducting the pilot phase, the authors were able to tailor the framework to suit the format of policy and to assess the referencing errors in a more meaningful way. Its utility was strengthened by the inclusion of the ‘dead-end’ referencing classification.
However, by the authors’ own admission, the framework has limitations. The inclusion criteria specified that the documents must include at least 10 references to scientific articles. This resulted in many documents being excluded because they either solely cited other policy documents or included too few references. This could obscure the true wider prevalence of errors, since some policy documents, including shorter ones that would naturally have fewer references, couldn’t be represented in the study. However, this study was only an initial assessment of the framework, and the authors specified this inclusion criterion because they wanted to capture a broader spread of sources. If the framework were to be adopted for general use, it would ideally still have utility for shorter documents.
Additionally, since no more than 25 statements were chosen in each document, subjective bias from the two reviewers in the statement sampling process could have reduced the variation of factual statements assessed. The authors suggest future studies may use a purposive sampling method to achieve wider variation, assess every factual statement in the document, or apply computer randomisation. The latter would be an effective way of scaling up and maximising time efficiency in future studies.
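As a rough idea of what computer randomisation might look like in practice, here is a minimal sketch using Python's standard `random` module. The function name, the limit of 25 statements and the placeholder statements are assumptions for illustration only; the study does not describe a specific procedure.

```python
import random


def sample_statements(statements, k=25, seed=None):
    """Randomly pick up to k statements without replacement.

    Hypothetical helper: 'statements' is a list of candidate factual
    statements already identified in one policy document. Passing a seed
    makes the selection reproducible, so a second reviewer can audit it.
    """
    pool = list(statements)
    if len(pool) <= k:
        # Fewer candidates than the limit: assess all of them.
        return pool
    rng = random.Random(seed)
    return rng.sample(pool, k)


# Placeholder statements, not taken from any real policy document.
candidates = [f"statement {i}" for i in range(1, 61)]
selected = sample_statements(candidates, k=25, seed=2019)
print(len(selected), "statements selected for assessment")
```

Recording the seed alongside the results would let other researchers reproduce exactly which statements were assessed.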
The issue of subjectivity further extends to the classification of statements, which was ultimately based on the reviewers’ interpretations. Inter-rater reliability is a well-established point of contention in research. The authors reported accounting for this issue by resolving any conflicts in decisions before proceeding, and did well to clearly define the error classifications. However, they did not report an inter-rater reliability statistic, and future studies may consider monitoring inter-rater reliability more closely.
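One widely used inter-rater reliability statistic for two reviewers is Cohen's kappa, which compares the agreement actually observed with the agreement expected by chance. The sketch below is an assumption about what future studies could report, not something the authors calculated, and the example ratings are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must classify the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from how often each rater used each category.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented ratings for four statements; not data from the study.
reviewer_1 = ["none", "none", "minor", "major"]
reviewer_2 = ["none", "minor", "minor", "major"]
print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so reporting it alongside the results would show how much the classifications depended on who was doing the rating.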
Finally, as noted in the article, the framework can only be applied by people who have access to scientific journals and experience in reading them. The authors suggest this may limit its practical utility, potentially excluding people who work in government and policy themselves. However, it is arguably preferable that these sorts of assessments are conducted by individuals who are familiar with scientific research and related constructs.
This study highlights the need to hold UK mental health policy documents to a higher standard. Next steps could include implementing robust regulations on correctly citing evidence and mandating independent referencing error checks before mental health policy documents are published. If future studies could identify methods to classify errors more quickly, this could be feasible.
Future studies could also alter the framework for global use. The current framework was developed with UK policy documents in mind, and mental health policies in other countries vary substantially depending on cultural and political differences. The ever-changing environmental and technological landscape is constantly informing how we think about improving mental health. Therefore, a global and systematic method must be able to keep up with this increasing amount of research and evidence.
Additionally, this study only looked at policy documents produced by government organisations. Those disseminated by independent organisations were not assessed, leaving scope for them to be investigated in the future.
Hui A, Rains L, Todd A, Boaz A, Johnson S (2019) The accuracy and accessibility of cited evidence: a study examining mental health policy documents. Social Psychiatry and Psychiatric Epidemiology 1-11.
Sense about Science (2016) Transparency of Evidence: an assessment of government policy proposals May 2015 to May 2016 http://senseaboutscience.org/wp-content/uploads/2016/11/SaS-Transparency-of-Evidence-2016-Nov.pdf. Last accessed 06 Dec 2019.
Mogull SA (2017) Accuracy of cited “facts” in medical research articles: a review of study methodology and recalculation of quotation error rate. PLoS One 12(9):e0184727