Does open access publishing increase citation or download rates?
The effect of "Open Access" (OA) on the visibility or impact of scientific publications is one of the most important issues in bibliometrics and information science. During the past 10 years, numerous empirical studies have examined this issue using various methodologies and viewpoints. Comprehensive reviews and bibliographies are given by, among others, OPCIT (1), Davis and Walters (2) and Craig et al. (3). The aim of this article is neither to replicate nor to update these thorough reviews. Rather, it presents the two main methodologies applied in these OA-related studies and discusses their potential and limitations. The first method is based on citation analyses; the second on usage analyses.
The debate surrounding the effect of OA started with a publication by Steve Lawrence (4) in Nature, entitled "Free online availability substantially increases a paper's impact", which analyzed conference proceedings in the field of computer science. In this debate, "open access" does not denote the publisher business model based on the "authors pay" principle but, more generally, articles being freely available online. From a methodological point of view, the debate focuses on biases, control groups, sampling, and the degree to which conclusions from case studies can be generalized. This article does not aim to give a complete overview of the studies published during the past decade but instead highlights key events.
In 2004 Stevan Harnad and Tim Brody (5) claimed that physics articles submitted as preprints to ArXiv (a preprint server covering mainly physics, hosted by Cornell University) and later published in peer-reviewed journals generated a citation impact up to 400 per cent higher than papers in the same journals that had not been posted in ArXiv. In a study of astronomy, Michael Kurtz and his colleagues (6) found evidence of two effects: a selection bias, in which authors post their best articles freely on the web, and an early view effect, in which articles deposited as preprints become available earlier and are therefore cited more often. Henk Moed (7) found that, for articles in solid state physics, these two effects may explain a large part, if not all, of the difference in citation impact between journal articles posted as preprints in ArXiv and papers that were not.
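The selection bias described above can be made concrete with a small simulation. The sketch below is illustrative only: the latent "quality" variable, its lognormal distribution, the posting probability and the citation model are all assumptions chosen for this example, not values taken from any of the cited studies. It shows how freely posted papers can appear to be cited more often even when free availability has no causal effect at all.

```python
import random
from statistics import mean


def apparent_oa_advantage(n_papers=20000, causal_effect=0.0, seed=0):
    """Return mean citations of OA papers divided by that of non-OA papers.

    All distributions are illustrative assumptions, not estimates taken
    from any of the studies cited in the text.
    """
    rng = random.Random(seed)
    oa, non_oa = [], []
    for _ in range(n_papers):
        # Hypothetical latent "quality" of a paper (skewed, always positive).
        quality = rng.lognormvariate(0.0, 1.0)
        # Selection bias: better papers are more likely to be posted freely.
        is_oa = rng.random() < quality / (1.0 + quality)
        # OA adds a causal boost only when causal_effect > 0.
        boost = (1.0 + causal_effect) if is_oa else 1.0
        expected = 5.0 * quality * boost
        # Noisy citation rate whose mean equals `expected`.
        citations = rng.expovariate(1.0 / expected)
        (oa if is_oa else non_oa).append(citations)
    return mean(oa) / mean(non_oa)


ratio_without_causal_effect = apparent_oa_advantage(causal_effect=0.0)
ratio_with_causal_effect = apparent_oa_advantage(causal_effect=0.5)
print(f"apparent advantage, no causal effect:  {ratio_without_causal_effect:.2f}")
print(f"apparent advantage, 50% causal effect: {ratio_with_causal_effect:.2f}")
```

Under these assumptions the ratio exceeds 1 even with `causal_effect=0.0`, which is why a raw comparison of OA and non-OA citation counts cannot by itself separate a causal OA effect from self-selection.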
In a randomized controlled trial comparing open with subscription-based access to articles in physiology journals published by one publisher, Phil Davis et al. (8) did not find a significant effect of open access on citations. In order to correct for selection bias, a new study by Harnad and his team (9) compared self-selected self-archiving with mandatory self-archiving at four research institutions. They argued that, although the first type may be subject to a quality bias, the second can be assumed to occur regardless of the quality of the papers. They found that the OA advantage was just as high for both, and concluded that it is real, independent and causal. According to the authors, it is greater for more citable articles than for less significant ones, a result of users self-selecting what to use and cite. [Note 1]
Two general limitations of the various approaches described above must be underlined.
Firstly, all citation-based studies mentioned above appear to share the following bias: they were based on citation analyses carried out in a citation index with selective coverage of the good, international journals in their fields. Analyzing citation impact in such a database is somewhat like measuring the extent to which people are willing to leave their cars unused at the weekend by interviewing mainly people found on a Saturday in the car park of a large store outside town. These people have quite obviously decided to use their cars; if they had not, they would not be there. Similarly, authors who publish in the selected set of good, international journals (a necessary condition for their citations to be recorded in the OA advantage studies mentioned above) will tend to have access to these journals anyway. In other words: there may be a positive effect of OA upon citation impact, but it is not visible in the database used. A citation index with more comprehensive coverage would make it possible to examine how the citation impact of the covered journals affects the OA citation advantage. For instance, is such an advantage more visible in lower-impact or more nationally oriented journals than in international top journals?
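The coverage argument can be illustrated with a few lines of arithmetic. The sketch below rests on a deliberately crude assumption: a non-OA article is cited only by authors with subscription access, whereas an OA article is cited by everyone. The author populations are made-up numbers chosen only to show the direction of the effect.

```python
def oa_citation_ratio(citers_with_access, citers_without_access):
    """Citations to an OA article divided by citations to a comparable non-OA one.

    Simplifying assumption: a non-OA article is cited only by authors with
    subscription access, while an OA article is cited by all authors.
    """
    return (citers_with_access + citers_without_access) / citers_with_access


# Hypothetical populations of potential citing authors (made-up numbers).
indexed = {"with_access": 900, "without_access": 100}    # publish in indexed journals
unindexed = {"with_access": 200, "without_access": 800}  # publish in other journals

# Selective index: only citations from indexed journals are recorded.
selective = oa_citation_ratio(indexed["with_access"], indexed["without_access"])

# Comprehensive index: citations from all journals are recorded.
comprehensive = oa_citation_ratio(
    indexed["with_access"] + unindexed["with_access"],
    indexed["without_access"] + unindexed["without_access"],
)

print(f"selective index:     {selective:.2f}")
print(f"comprehensive index: {comprehensive:.2f}")
```

With these numbers the measured OA advantage is about 1.11 in the selective index and about 1.82 in the comprehensive one: the same underlying effect looks much smaller when most recorded citers already have access.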
Secondly, analyzing article downloads ("usage") is a complementary and in principle valuable method for studying the effects of OA. In fact, the study by Phil Davis and colleagues mentioned above applied this method and reported that OA articles were downloaded more often than papers with subscription-based access. However, this method has significant limitations: not all publication archives provide reliable download statistics, and those that do may record and count downloads in different ways, so that results are not always directly comparable across archives. The implication seems to be that usage studies comparing OA with non-OA articles can be applied only in "hybrid" environments in which publishers offer authors both an "authors pay" and a "readers pay" option upon submitting a manuscript. This type of OA may, however, not be representative of OA in general, as it disregards self-archiving in the OA repositories that are being created in research institutions all over the world.
Future research should take these two general limitations into account, as they restrict the degree to which outcomes from case studies can be generalized and can provide a simple, unambiguous answer to the question of whether Open Access does or does not lead to higher citation or download rates.
References
1. OPCIT (2012) The Open Citation Project. The effect of open access and downloads ('hits') on citation impact: a bibliography of studies. http://opcit.eprints.org/oacitation-biblio.html.
2. Davis, P.M., Walters, W.H. (2011) "The impact of free access to the scientific literature: A review of recent research", Journal of the Medical Library Association, 99, 208-217.
3. Craig, I.D., Plume, A.M., McVeigh, M.E., Pringle, J., Amin, M. (2007) "Do open access articles have greater citation impact? A critical review of the literature", Journal of Informetrics, 1, 239-248.
4. Lawrence, S. (2001) "Free online availability substantially increases a paper's impact", Nature, 411 (6837), p. 521.
5. Harnad, S., Brody, T. (2004) "Comparing the impact of open access (OA) vs. non-OA articles in the same journals", D-Lib Magazine, 10 (6).
6. Kurtz, M.J., Eichhorn, G., Accomazzi, A., Grant, C., Demleitner, M., Henneken, E., Murray, S.S. (2005) "The effect of use and access on citations", Information Processing & Management, 41, 1395-1402.
7. Moed, H.F. (2007) "The effect of 'Open Access' upon citation impact: An analysis of ArXiv's Condensed Matter Section", Journal of the American Society for Information Science and Technology, 58, 2047-2054.
8. Davis, P.M., Lewenstein, B.V., Simon, D.H., Booth, J.G., Connolly, M.J.L. (2008) "Open access publishing, article downloads, and citations: Randomised controlled trial", BMJ, 337 (7665), 343-345.
9. Gargouri, Y., Hajjem, C., Larivière, V., Gingras, Y., Carr, L., Brody, T., Harnad, S. (2010) "Self-selected or mandated, open access increases citation impact for higher quality research", PLoS ONE, 5 (10), art. no. e13636.
Note 1. In an earlier version of this piece, published on the Bulletin Board of Elsevier's Editors Update, I included a paragraph about the Gargouri et al. study that appears to be based on a misinterpretation of Table 4 in their paper. I wrote: "But they also found for the four institutions that the percentage of their publication output actually self-archived was at most 60 per cent, and that for some it did not increase when their OA regime was transformed from non-mandatory into mandatory. Therefore, what the authors labeled as 'mandated OA' is in reality to a large extent subject to the same type of self selection bias as non-mandated OA." As Stevan Harnad has pointed out in a reply, Table 4 relates to the date on which articles were published, not the date on which they were archived. Self-archiving rates are flat over time because they include retrospective self-archiving.