The Challenges of Measuring Social Impact Using Altmetrics
Altmetrics gives us novel ways of detecting the use and consumption of scholarly publishing beyond formal citation, and it is tempting to treat these measurements as proxies for social impact. However, altmetrics is still too shallow and too narrow, and needs to increase its scope and reach before it can make a significant contribution to computing relative values for social impact. Furthermore, in order to go beyond limited comparisons of like-for-like and to become generally useful, computation models must take into account different socio-economic characteristics and legal frameworks. However, much of the necessary work can be borrowed from other fields, and the author concludes that – with certain extensions and added sophistication – altmetrics will be a valuable element in calculating social reach and impact.
Altmetrics is the collective term for scholarly usage data that goes beyond formal citation counts. Typically, altmetric data comes from specialist platforms and research tools but can also include data from general applications and technical platforms. Sometimes the term also encompasses mass-media references, and data from publishers, such as web page views and PDF downloads (see Table 1).
|Class of platform or tool||Types of data||Examples|
|General social networking applications||Mentions, links, ‘likes’, bookmarks to articles||Twitter, Facebook, Del.icio.us|
|Specialized research tools||Links, bookmarks, recommendations, additions to reading groups||Zotero.org, Mendeley.com, Citeulike.org|
|Publisher platforms||Web page views, PDF downloads, Abstract views||PLoS, Scopus, Pubmed|
|Research output, publishing components||Views, recommendations, shares||Github.com, Datadryad.org, Slideshare.net, Figshare.com|
Table 1 - Classes of platform and tool that provide data for altmetrics applications. (Source: ImpactStory)
The principal use of altmetrics has been to study and describe the wider scholarly impact of research articles (1). Some researchers have concluded that altmetric activity might act as an indicator for eventual citation count (2) and that it might reveal academic engagement not recorded in citation count (3). As scholarly material becomes more widely available with increasing open access publishing, and as people increasingly use social networks, altmetrics could become a valuable part of understanding and measuring social impact.
The interest in quantifying social impact is not restricted to research: it is a field of increasing importance in the not-for-profit sector – both philanthropic and institutional (4) – and there have been attempts to measure the impact of investments in the arts (5). Within the philanthropic field, there is an emerging paradigm that borrows from business, with financial investment reaping social return. Not surprisingly, there are agencies that endeavor to assess and compare social impact, and businesses that attempt to do likewise for pure profit investment.
The movement towards Gold open access publishing as promoted by the UK’s Finch Report and the EU’s Horizon 2020 project - where funding agencies become responsible for paying the cost of dissemination via research grants to scholars - enables a parallel with not-for-profit investment. In common with charitable funding bodies, it may be predicted that research investment agencies will increase their efforts to monitor the social impact of research outcomes in published articles. Thus, we can expect to see an increase in the amount of attention paid to assessing the social impact and social reach of research outcomes.
Social impact is often quantified in economic terms, using approaches that attempt to put a value on the benefits to the economy. However, while the social impact of a vaccine might be measured by computing the days lost to the economy, the loss of tax revenue and the cost of healthcare, applying the same approach in other fields – for example, studying the roots of cultural resistance to vaccination (6) - is considerably harder.
In this article, I outline a methodological approach for computing relative social reach (in other words, how research findings propagate from the published article into the public domain) while accounting for differences in social capacity: the means by which research can influence society, whether through socio-economic structure, legislation or influential discourse. I also touch on the idea of social accessibility, or how research findings vary in their ability to be communicated to and understood by a lay population.
As altmetric data can detect non-scholarly, non-traditional modes of research consumption, it seems likely that parties interested in social impact assessment via social reach may well start to develop altmetric-based analyses, to complement the existing approaches of case histories, and bibliometric analysis of citations within patent claims and published guidelines.
Understanding the social space
In order to begin the task of computing social impact using altmetric data, it is important to understand the varying socio-economic and legislative spaces in which disciplines exist, and to understand the limitations of what activity can be measured. The social space that scholarly endeavor occupies is not common for all disciplines, and it is not necessarily common across national boundaries. The social impact of Medicine is likely to be greater than that of Limnology or pure Mathematics; the study of Literature is politicized in some countries, but not in others (see Table 2).
Furthermore, research that delivers knowledge to practitioners and offers practical help to the lay community is likely to have more potential for a higher social impact and to affect more people if the authors are careful to increase their articles’ social accessibility by the inclusion of keywords, links to glossaries and a lay abstract. Here, publishers have a degree of responsibility, to support researchers in framing descriptions of their work and in developing platforms that are responsive to changing vocabularies. In the case history below, I describe how Nature went to some lengths to provide a social context to a complex story about genetic markers and tests.
Although this effort is commendable when publishing articles that have a high capacity for social influence, in an environment where research is becoming more accessible and where competition for funds is increasing, it behooves both researcher and publisher – both of whom are competing for funds – to increase social accessibility.
Obviously, the bulk of most research articles is necessarily written in specialized language, but the addition of keywords, links and a sentence explaining the context of the work would do much to improve the semantic infrastructure and social accessibility through which research finds its social impact. An interesting essay on the importance of communicating research to the wider public, and the skills necessary to do so, may be read in Nature (7).
As the potential for social impact varies, so do the social and government structures that offer a legal and quasi-legal framework in which the research may be expressed: these, in turn, alter a discipline’s capacity for achieving social impact.
|Discipline||Medicine||Nursing||Economics||Mathematics|
|Number of papers published in 2011||123,771||5759||23,727||14,379|
|Number of practitioners in the UK||c250,000 (8)||c700,000 (9)||Thousands, 1000 in government||3000 (globally)|
|Professional governance||Medical Research Council, General Medical Council, NICE||Nursing and Midwifery Council, Royal College of Nursing, NICE||None||None|
|Scholarly impact (5FWRI 2011)||0.91||0.73||0.74||0.81|
|Number of UK Acts of Legislation relating to the practice of this profession (10).||78 UK Acts of Legislation relating to “General Medical Council” with more than 200 of wider relevance.||152 UK Acts specifically relate to Nursing, with more than 200 of wider relevance.||3 Acts for “economists”||30 UK Acts for “mathematics” (all education) and 3 Acts for “mathematician”|
Table 2 - the socio-legal structure and potential for social impact of four research disciplines in the UK
Clearly, different disciplines and discoveries will reach their maximum impact over highly varying timescales. For example, one of the greatest discoveries was probably the development of the concept and number zero, which took place in several cultures and over many centuries (11), whereas the hypothetical discovery of a large meteorite heading for Earth would have a larger impact in a considerably shorter period.
The differences between disciplines’ structures and their relationship with the tools that affect social change imply that – at best – a multifactorial approach that can be tuned to focus on different disciplines would be needed to quantify the social impact of scholarly research. In the light of the lack of agreement on what social impact means, and the manifestly complicated background, it is hardly surprising that Bornmann concluded in 2012 that in the absence of any robust evaluations, the best way ahead is by peer review.
One profound difficulty in measuring social impact is the complexity of the ways in which research can effect change. For example, there are relatively few economists, and while primary economic research rarely makes headline news, its impact through politics, finance and international agencies is dramatic and far-reaching (see Figure 1).
An interesting example of primary economic research coming to public attention, and an illustration of the disproportionate relationship between social mentions and impact, can be seen in the 2013 criticism of Reinhart and Rogoff’s 2010 paper “Growth in a Time of Debt” (12). The paper has been described as a ‘foundational text’ (13) of austerity programs, yet according to ImpactStory it received fewer than 100 social mentions. The methodological critique that discovered Excel errors and other problems received 250 social citations.
Figure 1 - Google search trends for “Reinhart Rogoff”
In the UK, there is no professional governance for economists. By contrast, the various healthcare professions have many complex layers of professional and governing bodies, all of which work to effect social impact, as delivered by professionals. Within these formal channels, it is possible to apply bibliometrics through a citation analysis of the documents produced by governing bodies. However, as the distance from primary research to lay population increases, so does the lack of formal citation or linking.
Although it is tempting to equate social reach (i.e., getting research into the hands of the public) with social impact, measuring the former is not the same as measuring the latter. At the moment, altmetrics provides us with a way of detecting when research is being passed on down the information chains: to be specific, altmetrics detects sharing, or propagation events. However, even though altmetrics offers us a much wider view than bibliometrics of how scholarly research is being accessed and discussed, the discipline currently lacks an approach to the wider context that is necessary to understand both the social reach and the social impact of scholarly work.
There have been attempts to create a statistical methodology that defines different types of consumption. Priem et al (14) reported finding five patterns of usage:
- Highly rated by experts and highly cited
- Highly cited
- Highly shared
- Highly bookmarked, but rarely cited
Although these patterns of behavior are of potential interest, the authors do not attempt to correlate the clusters with scholarly and non-scholarly use. In fact, a literature search found no research currently available that compared disciplines or readership background using altmetric data. It is not surprising, therefore, to find that there is no research that focuses on the relationship between scholarly research and social consumption using altmetric data.
The challenge of measuring social impact and social reach with altmetrics
In order to provide some insight into how altmetrics might be used to measure social reach, and potentially enable the measurement of social impact, I investigated a high profile story that originated in primary research.
On March 27/28, 2013, all the major UK news outlets carried stories based on research that found genetic markers for breast, prostate and bowel cancer. The research reported significantly better accuracy for these markers than previous research. Mass media reports of the research suggested the possibility that within eighteen months (15) or five years (16), a saliva-based screening test for the genetic markers might become available via the UK’s National Health Service, at a cost to the NHS of between £5 and £30.
Some of the commentary included in the reporting came from the principal authors of the research, although there was no obvious linguistic cue or statement of interest (see, for example, The Guardian (19)), thus making the assignment of provenance a separate research project in itself.
This research is likely to have a strong social impact, as the tests are expected to be more accurate than those available at present, can be undertaken at any stage of life and can be coupled with higher detection rates at earlier stages of cancer, with corresponding improvements in lifespan and quality of life. This impact is likely to be expressed through practitioners and their governing bodies, Government agencies, etc.
Despite the high potential for social impact, and links in the most-read online news stories to a dedicated home page set up by Nature to enable lay consumption of the primary research, there was very little social activity relating to either the original research or the essays that Nature had commissioned. Of all the papers linked from this dedicated page, only one was behind a paywall (see Figure 2). A live altmetric report of this story may be viewed at ImpactStory (17).
Only two of the mass media articles (the BBC (18) and The Guardian (19)) provided links to the original research. Not surprisingly, the stories resulted in a great deal of engagement in social media. However, a review of tweets, comments (323 on The Guardian’s article) and links to the mass media reports found that none linked to the research or used a hashtag that would have helped disambiguate tweets about the test from other news relating to these forms of cancer.
As the collection of altmetrics is based on following links, a proportion of stories originating from the primary research is immeasurable, and research that restricts itself purely to an altmetric analysis is unlikely to add any helpful indication of social impact at present.
As the findings of the research flow out from the research papers, they undergo a series of transformations: they lose their technical language in favor of a lay presentation, the precise findings are replaced with interpretation, and information is added that attempts to predict social impact. In the case of the “£5 Spit Test for Cancer”, some of this interpretative layer is added by the primary researchers and some by other agents. In the course of this evolution, some terms emerge that fit the story, and it is typically these terms that are used by the lay community to discuss the research, along with links back to the mass media articles.
The failure of social and mass media reports to formally cite or link the journalism and commentary to the original research – despite Nature’s best efforts to make the research accessible to the general public – indicates that any effort to use existing altmetrics to gauge the social reach of primary research is likely to be a worthless endeavor, or at best one that requires considerably more research. Unfortunately, the altmetric figures for the primary research and the number of visits to the Nature story page are both too low to be used for statistically extrapolating social reach from direct social mentions.
Clearly, this research was subject to discussion and sharing amongst the population, but equally clearly, the bulk of this interest is at present as invisible to altmetrics as it is to bibliometrics. In part, this problem may be conceptual, perhaps derived from a desire to maintain a comparison between bibliometrics and altmetrics by restraining the latter’s reach to citation counts; perhaps it is purely a technological problem. Whatever the cause, the result is the same: altmetrics provides a very weak picture of social reach and social impact.
To some extent, it is possible to address the technological issues by extending existing altmetric tools to capture a richer set of data, for example, by counting the comments that have been made on correctly linked articles. Unfortunately, these comments are three steps away from the original research, as the Guardian links not to the papers, but to the dedicated page published by Nature (see Table 3).
|Distance between social reference and original research|
|0||Original research paper linked from:|
|1||Nature’s dedicated page linked from:|
|2||Article in The Guardian linked from:|
|3||Comments on The Guardian, tweets about the newspaper article|
Table 3 - As currently formulated, altmetrics only counts direct links to research material and therefore excludes many mass media and social media mentions. In the example in this table, only the page on nature.com links to the original research.
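The notion of distance in Table 3 can be illustrated with a small sketch. The link graph and page names below are hypothetical, chosen only to mirror the rows of the table; this is not how any existing altmetrics tool works. Under those assumptions, a breadth-first search over outgoing links gives the number of hops between a social reference and the original paper.

```python
from collections import deque

# Hypothetical link graph mirroring Table 3: each page maps to the pages it links to.
LINKS = {
    "guardian_comment": ["guardian_article"],
    "tweet_about_guardian": ["guardian_article"],
    "guardian_article": ["nature_page"],
    "nature_page": ["research_paper"],
    "research_paper": [],
}


def link_distance(links, start, target):
    """Return the number of link hops from start to target, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        page, hops = queue.popleft()
        if page == target:
            return hops
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


def mentions_within(links, target, max_hops):
    """Map each page to its distance from target, keeping only those within max_hops."""
    found = {}
    for page in links:
        distance = link_distance(links, page, target)
        if distance is not None and distance <= max_hops:
            found[page] = distance
    return found


if __name__ == "__main__":
    print(link_distance(LINKS, "guardian_comment", "research_paper"))
    print(mentions_within(LINKS, "research_paper", 1))
```

With `max_hops` set to 1, only the paper itself and the Nature page are counted, which corresponds to the restriction described in the caption of Table 3; raising the limit to 3 would bring the newspaper article and the comments and tweets about it into view.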
We cannot expect or mandate people to cite original research in their social dialogue, but it is possible to consider an approach that might allow us to study trends in related terms, and to incorporate these data points in our analyses. Within the field of natural language parsing, it is common to look at the co-occurrence of terms in formally linked articles, and to use this data to infer meanings and relationships, which could in turn be used to classify articles that lack a formal link or citation.
For example, in the mass media articles relating to the “£5 cancer test”, there are a number of entities – researchers, commentators, funding agencies, specific references to particular formal terms – that are common to many stories and blog posts that cover and interpret this research. That these are published within a similar time frame, and have a commonality of semantics, should allow researchers to compute an analysis of similarity, and by mapping these articles and mining the internet, it should be possible to achieve a wider understanding of the social reach of research. Such a study - the quantification of semantics, which might be known as semantometrics - would form ad hoc networks of related stories, commentary, and other social media, from which altmetric data could be harvested for an analysis of social reach (see Table 4).
|Primary research||Practitioner research||Governance and Government||Mass Media||Social Media|
|Usage statistics||Usage statistics||Usage statistics||Usage statistics|
Table 4 - the development of analytics to compute social reach requires a variety of linking approaches, including extending altmetrics beyond direct linking and the application of semantic technology to discover non-linked influence
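As a rough illustration of the similarity analysis suggested above, the following sketch pairs up articles that share entities and were published close together in time. Everything in it is an assumption made for illustration: the article records, entity sets and dates are invented, Jaccard similarity is just one simple choice of measure, and the thresholds are arbitrary. A real study would first need to extract entities from the text.

```python
from datetime import date
from itertools import combinations

# Hypothetical records: entity sets and dates are invented for illustration only.
ARTICLES = {
    "news_a": {"date": date(2013, 3, 27),
               "entities": {"saliva test", "NHS", "genetic markers", "breast cancer"}},
    "blog_b": {"date": date(2013, 3, 28),
               "entities": {"saliva test", "NHS", "genetic markers", "prostate cancer"}},
    "news_c": {"date": date(2013, 3, 28),
               "entities": {"budget", "NHS"}},
    "blog_d": {"date": date(2012, 11, 2),
               "entities": {"saliva test", "NHS", "genetic markers"}},
}


def jaccard(a, b):
    """Jaccard similarity: number of shared items divided by number of distinct items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_pairs(articles, min_similarity=0.5, max_days_apart=7):
    """Return (left, right, score) for pairs that are close in time and share entities."""
    pairs = []
    for left, right in combinations(sorted(articles), 2):
        a, b = articles[left], articles[right]
        days_apart = abs((a["date"] - b["date"]).days)
        score = jaccard(a["entities"], b["entities"])
        if days_apart <= max_days_apart and score >= min_similarity:
            pairs.append((left, right, round(score, 2)))
    return pairs


if __name__ == "__main__":
    for pair in related_pairs(ARTICLES):
        print(pair)
```

The resulting pairs can be read as the edges of an ad hoc network of related stories. In this toy data, `blog_d` shares most of its entities with the March 2013 items but falls outside the time window, which shows why both conditions are applied together.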
Although altmetrics has the potential to be a valuable element in calculating social reach – with the hope this would provide insights into understanding social impact – there are a number of essential steps that are necessary to place this work on the same standing as bibliometrics and other forms of assessment.
Altmetricians need to make more effort to extend their platforms to harvest data using direct relationships (e.g., comments on stories that contain formal links, retweets, social shares) to give a wider picture of social reach, both in terms of the depth (or complexity) of the communication and the breadth of relatively simple messages.
As highly influential stories have – at best – idiosyncratic links to the primary research, there should be investigations into using semantics and natural language parsing to trace the spread of scientific ideas through society, and in particular into the application of semantic technologies to extend the scope of altmetrics.
The difference between the ways in which different disciplines discuss, interpret and share research findings needs to be understood. This step should enable publishers and researchers to improve the accessibility of research to practitioners and academics in response to experimental data.
For different disciplines, usage patterns will vary according to differences in their social, legislative, economic and national characteristics and infrastructure. Research has a complex and dynamic context, and attempts to make comparisons must acknowledge these variations.
Figure 2 - Counts of tweets linking to primary research and a selection of online reports, and to the Nature dedicated page. The sum of all tweets linking to the primary research was 133 in March 2013.
References
(1) A good introduction to the ambitions of altmetrics may be found at altmetrics.org/manifesto
(2) Thelwall, M., Haustein, S., Larivière, V., Sugimoto, C.R. (2013) “Do altmetrics work? Twitter and ten other social web services”. Available at: http://www.scit.wlv.ac.uk/~cm1993/papers/Altmetrics_%20preprintx.pdf
(3) Priem, J., Piwowar, H.A., Hemminger, B.H. (2011) “Altmetrics in the wild: An exploratory study of impact metrics based on social media (presentation)”. Available at: http://jasonpriem.org/self-archived/PLoS-altmetrics-sigmetrics11-abstract.pdf
(4) Ebrahim, A. (2013) “Let’s be realistic about measuring impact”, http://blogs.hbr.org/hbsfaculty/2013/03/lets-be-realistic-about-measur.html
(5) Reeves, M. (2002) “Measuring the economic and social impact of the arts: a review”, http://www.artscouncil.org.uk/media/uploads/documents/publications/340.pdf
(6) Davis, V. (2012) “Humanities: the unexpected success story of the 21st century”, http://www.ioe.ac.uk/Virginia_Davis_2012.pdf
(7) Radford, T. (2011) “Of course scientists can communicate”, http://www.nature.com/news/2011/110126/full/469445a.html
(8) General Medical Council, “The state of medical education and practice in the UK: 2012”, http://data.gmc-uk.org.
(9) According to the Nursing and Midwifery Council, http://www.nmc-uk.org/About-us/Annual-reports-and-statutory-accounts, there are 671,668 nurses and midwives who are legally allowed to practice in the UK. Approximately 350,000 are employed by the NHS. http://www.nhsconfed.org/priorities/political-engagement/Pages/NHS-statistics.aspx
(10) UK Legislation, Full text searches on April 24, 2013 on http://www.legislation.gov.uk
(11) Wikipedia, “0 (number)”, http://en.wikipedia.org/wiki/0_%28number%29#History
(12) Reinhart, C.M., Rogoff, K.S. (2010) “Growth in a Time of Debt”, American Economic Review, American Economic Association, Vol. 100, No. 2, pp. 573-578, http://www.nber.org/papers/w15639
(13) Linkins, J. (2013) “Reinhart Rogoff austerity research errors may give unemployed someone to blame”, Huffington Post, http://www.huffingtonpost.com/2013/04/16/reinhart-rogoff-austerity_n_3095200.html
(14) Priem, J., Piwowar, H.A., Hemminger, B.H. (2012) “Altmetrics in the wild: Using social media to explore scholarly impact”, http://arxiv.org/html/1203.4745v1
(15) Mail Online, http://www.dailymail.co.uk/sciencetech/article-2299971/Simple-saliva-test-breast-prostate-cancer-soon-available-GP-just-5.html
(16) The Times, http://www.thetimes.co.uk/tto/health/news/article3724498.ece
(17) ImpactStory, http://www.impactstory.org/collection/dnwpb3
(18) BBC, http://www.bbc.co.uk/news/health-21945812
(19) The Guardian, http://www.guardian.co.uk/science/2013/mar/27/scientists-prostate-breast-ovarian-cancer