Usage: an alternative way to evaluate research
From the darkness to the light
Librarians have long struggled to measure how library resources were being used: for decades, reshelving and circulation lists were the main methods available to them. Publishers had no idea how much their journals were used; all they had was subscription data (e.g. the number and location of subscribers, contact details, etc.). With the advent of electronic content in the late 1990s this changed: publishers could see how often articles from a certain journal were downloaded, and by which customers. Librarians could now see whether and how the resources they purchased were being used. Both groups gained a wealth of information that could help them manage their publications and collections.
Joining efforts towards common standards
It wasn’t long before the need for standardization emerged. Every publisher had its own reporting format, so librarians had to spend a lot of time and effort combining data and comparing definitions from various publishers. In March 2002, Project COUNTER (Counting Online Usage of Networked Electronic Resources) (1) was launched. In this international initiative, librarians, publishers and intermediaries collaborated to set standards for recording and reporting usage statistics in a consistent, credible and compatible way. The first Code of Practice was published in 2003. This year, COUNTER celebrates its 10th anniversary and has published the fourth release of its integrated Code of Practice, which covers journals, databases, books and multimedia content. This release contains a number of new features, including a requirement to report the usage of gold open access articles separately, as well as new reports about usage on mobile devices. The COUNTER Code of Practice specifies what can be measured as a full text request, when a request needs to be ignored in the reports, and the layout and delivery method of the reports. It also requires an annual audit of the reports, with an independent party confirming that the requirements are met.
What usage can tell us
What, in fact, is a full text article request? A full text article is defined as the complete text of an article, including tables, figures and references. When a user requests the same article in the same format within a certain time limit, only one of the requests can be counted. There is a lot of value in usage information: a librarian can see which titles are used most. Cost per article use (the subscription cost divided by the number of counted requests) can also be calculated, which can give an indication of the relative value of a journal. In times of tight budgets, it might be considered the most important measure in deciding on cancelations.
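As a rough illustration of the two ideas above, the sketch below collapses repeated requests for the same article in the same format by the same user, then derives a cost per use. It is not the COUNTER specification: the `Request` record, the function names, the choice to measure the window from the previous request, and the time windows in `WINDOW_SECONDS` are all assumptions made for this example, since the text above only speaks of "a certain time limit".

```python
from collections import namedtuple

# Hypothetical record of one full text request; timestamp is in seconds.
Request = namedtuple("Request", "user article fmt timestamp")

# Assumed time limits per format (seconds); illustrative values only.
WINDOW_SECONDS = {"html": 10, "pdf": 30}


def count_requests(requests):
    """Count full text requests per article, collapsing rapid repeats.

    A request is ignored when the same user asked for the same article
    in the same format no more than WINDOW_SECONDS earlier.
    """
    last_seen = {}  # (user, article, fmt) -> timestamp of previous request
    counts = {}     # article -> number of counted requests
    for r in sorted(requests, key=lambda r: r.timestamp):
        key = (r.user, r.article, r.fmt)
        previous = last_seen.get(key)
        if previous is None or r.timestamp - previous > WINDOW_SECONDS[r.fmt]:
            counts[r.article] = counts.get(r.article, 0) + 1
        last_seen[key] = r.timestamp
    return counts


def cost_per_use(subscription_cost, uses):
    """Return cost per counted request, or None when there was no usage."""
    if uses == 0:
        return None
    return subscription_cost / uses


# Made-up sample log for a single article "a1".
sample = [
    Request("u1", "a1", "pdf", 0),
    Request("u1", "a1", "pdf", 20),   # repeat within 30 s: not counted
    Request("u1", "a1", "pdf", 100),  # outside the window: counted
    Request("u2", "a1", "html", 5),   # different user: counted
    Request("u1", "a1", "html", 8),   # different format: counted
]
uses = count_requests(sample)["a1"]
print(uses, cost_per_use(1200.0, uses))
```

Dividing by counted rather than raw requests matters here: an uncollapsed log would inflate usage and make the journal look cheaper per use than it is.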
What usage does not tell us
While requests for full text give an indication of user interest, they don’t tell you how the article is being used. In a way, the requests are like the orders in a webshop: an order tells you an item has been requested, but not whether the user receives it or whether it is lost during shipping. It doesn’t tell you what the user does with the item once it is received: do they give it away, put it on their shelves or actually use it, and if so, how? The usage data certainly doesn’t tell you why the article was requested: did the professor tell the students to download it, is it vital for research, does the user want it “just in case”, or is the title so funny that someone wants to hang it near the coffee machine?
Using usage data
Information on the actual articles being used can give an indication of the direction in which a field is growing. Usage data can reveal an interest in a particular subject if relevant articles are used more than those on other subjects. It can also provide geographical information on the regional spread of that interest. Usage data is by no means the only indicator, but it can provide insight into trends sooner after article publication than citations do. Two initiatives are at the forefront of usage data implementation: the MESUR project in the USA, and the Journal Usage Factor in the UK.
The Journal Usage Factor
The Journal Usage Factor (UFJ) project, a joint initiative between UKSG and COUNTER, has recently released “The COUNTER Code of Practice for Usage Factors: Draft Release 1”. In this document, the publication period and the usage period used for the calculation are defined as the same two concurrent years: this means that the 2009-2010 UFJ will focus on 2009-2010 usage of articles published in 2009-2010. The UFJ will be the “median value of a set of ordered full-text article usage data” (1). It will be reported annually as an integer, will integrate articles-in-press from the accepted manuscript stage, and will incorporate usage from multiple platforms. At this stage it is proposed that there will be two versions of the UFJ:
- One based on usage of all paper types except editorial board lists, subscription information, and permission details.
- One based on scholarly content only (short communications, full research articles, review articles).
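The calculation described above can be sketched in a few lines. This is only an illustration of "the median of per-article usage, reported as an integer" for the two proposed versions, not the procedure in the draft Code of Practice: the document-type labels, the `usage_factor` function, the half-up rounding rule and the sample figures are all assumptions made for the example.

```python
import math
from statistics import median

# Assumed labels for the item types named in the two proposed versions.
EXCLUDED = {"editorial board list", "subscription information", "permission details"}
SCHOLARLY = {"short communication", "full research article", "review article"}


def usage_factor(items, scholarly_only=False):
    """Median usage per item, rounded to an integer.

    items: iterable of (doc_type, usage_count) pairs for items published
    and used in the same two-year window.
    Rounding half up is an assumption; the text only says "an integer".
    """
    if scholarly_only:
        counts = [n for doc_type, n in items if doc_type in SCHOLARLY]
    else:
        counts = [n for doc_type, n in items if doc_type not in EXCLUDED]
    if not counts:
        return None
    return math.floor(median(counts) + 0.5)


# Made-up usage counts for one hypothetical journal.
journal = [
    ("full research article", 120),
    ("review article", 340),
    ("short communication", 45),
    ("editorial", 10),
    ("editorial board list", 3),
    ("full research article", 80),
]
print(usage_factor(journal))                       # all types except excluded ones
print(usage_factor(journal, scholarly_only=True))  # scholarly content only
```

Because it is a median rather than a mean, a single heavily downloaded review article (340 in the sample) does not pull the figure upward.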
The draft of the project document is available for public consultation until 30 September 2012; comments can be sent to the COUNTER Project Director, Peter Shepherd. Based on the feedback received, the Code of Practice will be refined prior to implementation in 2013. Research Trends will keep an eye on the project and report any further developments online at www.researchtrends.com. Peter Shepherd commented that “one of the main benefits of a statistically robust Usage Factor will be to offer alternative insights into the status and impact of journals, which should complement those provided by Impact Factors and give researchers, their institutes and their funding agencies a more complete, balanced picture”.
How does usage compare to citations?
COUNTER and UKSG (UK Serials Group) commissioned extensive analyses from the CIBER research group into the proposed JUF. In 2011 they published their findings in a report that included correlation analyses between the UFJ and two bibliometric indicators (SNIP and Impact Factor). For both analyses, they found low correlations, results which they did not find surprising as they “did not expect to see a clear correlation between them. They are measuring different things (‘votes’ by authors and readers) and the two populations may or may not be co-extensive” (2). Although highly cited papers tend to be highly downloaded, the relationship is not necessarily reciprocal (particularly in the practitioner-led fields). Indeed, while users encompass citers, they are a much wider and more diverse population (academics but also students, practitioners, non-publishing scientists, laypeople with an interest, science journalists, etc.). There have been several bibliometric studies comparing usage to citations, and findings vary in degree of correlation depending on the scope and subject areas of the studies (3).
A 2005 study by our Editor-in-Chief Dr. Henk Moed (4) found that downloads and citations have a different age distribution (see Figure 1), with downloads peaking then tailing off promptly after publication, but citations showing a more even (though still irregular) distribution for a much longer time after publication. The research also found that citations seemed to lead to downloads: when an article is published citing a previous article, a spike is observed in the usage of the earlier article. These interesting results may not be surprising; as Dr. Henk Moed comments, “Downloads and citations relate to distinct phases in scientific information processing.” He has since performed more analyses correlating early usage with later citations, and found that in certain fields usage could help predict citations (e.g. Materials Chemistry), but in others the correlation was too weak to allow this (e.g. Management).
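To make the idea of correlating usage with citations concrete, the sketch below computes a rank correlation between per-article download and citation counts. Spearman's rank correlation is chosen here only because it suits skewed count data; the text does not say which coefficient the studies cited above used. The function names and the sample figures are invented for the example.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up counts for five hypothetical articles.
downloads = [500, 120, 300, 80, 950]
citations = [12, 3, 4, 9, 30]
print(round(spearman(downloads, citations), 2))
```

A value near 1 would mean the most downloaded articles are also the most cited; a value near 0 would match the low correlations reported above.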
Where will usage go?
The increasing availability of usage data has been matched by seemingly rising interest, both in the field of bibliometrics and in the wider academic community. Although there is still a strong focus on citation metrics, the advent of COUNTER and other projects such as MESUR demonstrates the growing attention given to usage data. Yet it is still early days for usage: although a lot is happening in this relatively new field, it will take time to reach the levels of expertise and familiarity attained with the more traditional citation data. The Usage Factor is one of the first and most visible initiatives: it will be fascinating to monitor its deployment in the coming years, and see what other exciting and perhaps unexpected indicators will emerge from usage data in the future.
Figure 1 - Age distribution of downloads versus citations. Source: Moed, H.F. (4)