
Predicting citation counts

Ron Daniel describes what earlier research on citation prediction can tell us about potentially valuable altmetrics, and asks whether there are areas in which new altmetrics might be discovered.



Abstract

Many articles have been written about efforts to predict how many citations a research article will receive, based on indicators available before or shortly after publication. These efforts have had widely varying results, with one effort predicting 14% of the variance in citation counts while another study ten years later explained over 92%. What was learned in that decade? What can this tell us about potentially valuable altmetrics, and are there areas in which new altmetrics might be discovered?

 

1 Introduction

This special issue of Research Trends is about altmetrics – alternatives to the use of citation counts as the metric for assessing the impact of an article, a researcher or a journal. Citation counts do not tell the whole story (e.g. they assign no value to useful research software tools, advisory papers for young researchers, or research that can't be published for commercial or government security reasons). Having additional metrics to provide a more complete picture is a very welcome development. However, even in a future in which additional metrics are available to assess impact, citation counts will remain first among equals because of their intimate connection with the text of the article and its grounding in prior work.

The current and continued importance of citation counts has led to the desire to predict how often an article will be cited, and hence its future importance. Such predictions could be used to decide if an article should be published in one journal vs. another, to flag new research for scrutiny before citation counts have had time to accrue, to assess the development of a young researcher before many counts could have accrued, etc. Many articles have been written on this topic, but there has been very little consistency in their results. Four studies between 2002 and 2012 found that they could predict 14% (1), 20% (2), 60% (3) and 90% (4) of the variance in citation counts a few years after publication based on features available before or shortly after publication. A discrepancy this great requires some explanation!
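
To make "variance explained" concrete: these studies typically fit a regression model on the available features and report R², the share of variance in (often log-transformed) citation counts that the model accounts for. The following is a minimal sketch of that calculation; the features and data are synthetic stand-ins, not the variables or datasets used in studies (1)-(4).

    # Minimal sketch of "percent of variance explained" (R^2) for citation
    # prediction. Features and data are synthetic stand-ins, not the
    # variables used in studies (1)-(4).
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    n = 500

    # Hypothetical pre-publication features: journal impact factor,
    # author count, and cited-reference count.
    X = np.column_stack([
        rng.lognormal(1.0, 0.5, n),  # impact factor
        rng.poisson(4, n) + 1,       # number of authors
        rng.poisson(30, n),          # number of cited references
    ])

    # Synthetic log-scale citation counts loosely driven by the features.
    y = np.log1p(2.0 * X[:, 0] + 0.3 * X[:, 1] + 0.05 * X[:, 2]) \
        + rng.normal(0.0, 0.5, n)

    model = LinearRegression().fit(X, y)
    print(f"R^2 (share of variance explained): {model.score(X, y):.2f}")

In practice the four studies used different outcome windows, statistical methods and validation designs, so their headline percentages are not directly comparable.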

This article has three goals. The first is to explain the discrepancy in the previous research results. The second is to evaluate the various indicators, a.k.a. features, which were used in the four articles. Features that are predictive of eventual citation counts might be particularly valuable altmetrics that serve as leading indicators of an article's merit. We need to be cautious when comparing results across the studies as they use different scientific domains, make predictions over different time periods, use different statistical methods, obtain results through different procedures, etc. For example, one study measured "newsworthiness" by having readers estimate it; another did so by searching news archives. Both found it to be a notable factor but not necessarily statistically significant. All of this means the results are only loosely comparable. However, we can look within each study to see which features did have significant effects and the relative magnitude of those effects. If the same feature is found to be significant, or not, across all studies then we are fairly safe in drawing conclusions about its utility. The third goal is to see if we can draw conclusions about potentially valuable altmetrics and areas where new altmetrics might be discovered.

 

2 Prediction Features for Pre-Publication Articles

From the time an article is first conceived, features begin to accrue that we can use to predict its future citation counts. This section looks at features available from the inception of the article up to the time when it has been accepted for publication in a journal. These features can be subdivided into those that apply to the article itself, to the authors of the article, and to the journal which has accepted the article for publication; following (4), we call these three categories Content, Author, and Venue. What we will see is that even before an article is published, we have enough information to make fairly good predictions about its future rate of citations.

2.1 Content

Study Design Factors:

The earliest article we review (1), published in 2002, assumed that high-quality research would be more heavily cited. Its authors considered what makes for high-quality research and looked for corresponding features such as sample size, controls, and blinding. Sample size and the presence of a control group were found to have some effect, but not at the level of statistical significance. The other factors (blinded, randomized, prospective v. retrospective) were even weaker. The second article (2), published in 2007, also looked at study design factors and found them to have little effect. What it did find was that large studies funded by industry, and with industry-favoring results, were statistically significant predictors of higher future citation rates. These features are understandably important in the medical therapeutic space, since such studies are likely to describe drugs and other therapies that will soon be available, but they do not seem likely to generalize to other domains.

Topic:

Unlike the first study, which was confined to emergency medicine, the second study (2) considered the effect of the topic of the article. It found that cardiovascular and oncology articles were more likely to be cited than those on other topics such as anesthesiology, dermatology, endocrinology, and gastroenterology. Given the death rates from heart disease and cancer relative to the conditions treated by the other specialties, this seems reasonable. Similarly, the third article (3), published in 2008, found that articles which provided therapeutic information were cited more, as were those which presented original research as opposed to reviews. That study also found that longer articles were cited fewer times, a weak but statistically significant effect, and that articles containing more references were slightly more likely to be cited. The fourth article (4), published in 2012, found a weak effect whereby the more topics an article covered, the more citations it received.

Table 1 lists the content-based features available before publication which were used in the four studies. Statistically significant values (p < .05) are marked with an asterisk. The key things to notice in this table are how few content-based features are significant, and how few of the features are used in more than one study.

 

Feature                            | Callaham 2002 (1) | Kulkarni 2007 (2) | Lokker 2008 (3) | Yan 2012 (4)
-----------------------------------|-------------------|-------------------|-----------------|-------------
# Study Participants               | 26.5%             | 3.1, p=.04*       | <.001, p=.295   |
Newsworthiness Score               | 26%               | 13.5, p<.001*     | .133, p=.161    |
Control Group                      | 24.3%             |                   |                 |
Quality score                      | 15.8%             |                   |                 |
Explicit Hypothesis                | 4.7%              |                   |                 |
Prospective v. Retrospective Study | 2.7%              | 3.6, p=.01*       | .477, p=.009*   |
Type of Study Participants         | 2.1%              |                   |                 |
Blinded                            | .07%              |                   |                 |
Randomized                         | 0                 | 13.4, p=.01*      |                 |
Positive results                   | 0                 |                   |                 |
Industry Funding                   |                   | 19.9, p<.001*     |                 |
Industry Favoring Result           |                   | 19.4, p<.001*     |                 |
Location of Study                  |                   | 11.9, p=.001*     |                 |
Topic                              |                   | 17.8, p=.001*     |                 |
Original v. Review article         |                   |                   | .477, p=.009*   |
# pages                            |                   |                   | -.011, p<.001*  |
Structured abstract                |                   |                   | -.8, p=.002*    |
# cited references                 |                   |                   | .004, p=.008*   |
Multicenter study                  |                   |                   | .367, p=.014*   |
Therapy v. other article           |                   |                   | .339, p=.023*   |
Word count of abstract             |                   |                   | -.0003, p=.658  |
Semi-structured abstract           |                   |                   | .071, p=.746    |
Nation of first author             |                   |                   | -.037, p=.762   |
Novelty                            |                   |                   |                 | .059
Topic rank                         |                   |                   |                 | .079
Diversity of topics in article     |                   |                   |                 | .157

Table 1 - Content-based Features Available Pre-Publication

 

2.2 Author

The effects of the author were not considered in the first study (1). The second study (2) looked only at whether the author byline indicated group authorship. This turned out to be the most significant prediction feature in their study! It was a very important result: it indicated that article importance or quality is not easily measured by the presence or absence of features we might call "good research practice". That realization led to significantly improved prediction accuracy in later work.

The last two papers (3, 4) looked at author-related features in more detail.

Both Lokker (3) and Yan (4) looked at the number of co-authors. Lokker (3) found that count to be a significant factor, but Yan (4) did not. Yan looked at several other author-related features. The Maximum Past Influence of the Author (MPIA) is the citation count of the author's most-cited paper. The Total Past Influence of the Author (TPIA) is the sum of citation counts across the author's body of work. The MPIA was found to be predictive, but the TPIA was essentially useless.

A strong result in (4) came from the author's rank in citation counts. The citation counts of each author's works were averaged, and authors were ranked by these averages. Figure 1, reproduced from (4), shows that being a very highly cited author is predictive of future citation counts; in other words, the rich get richer. As can be seen, however, this effect is only strong for authors in the top ranks of citation frequency.

Figure 1 - Citation Counts vs. Rank of Author's Average Citation Count

Figure reproduced from Yan et al (2012) (4). We have sought permission for re-use of this figure.
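
These author features are easy to compute given per-paper citation counts. Below is a minimal sketch of MPIA, TPIA, and the average-based rank described above, using a made-up toy dataset in place of a real citation index such as Scopus.

    # Sketch of the author features used in (4): MPIA, TPIA, and rank by
    # average citations. `papers` is a hypothetical toy dataset; a real
    # analysis would pull these counts from a citation index.
    from collections import defaultdict

    papers = [  # (author, citations of one of their papers)
        ("alice", 120), ("alice", 30), ("alice", 5),
        ("bob", 40), ("bob", 38),
        ("carol", 2), ("carol", 1), ("carol", 0),
    ]

    counts = defaultdict(list)
    for author, cites in papers:
        counts[author].append(cites)

    mpia = {a: max(c) for a, c in counts.items()}          # most-cited paper
    tpia = {a: sum(c) for a, c in counts.items()}          # whole body of work
    avg = {a: sum(c) / len(c) for a, c in counts.items()}  # mean citations

    # Rank authors by average citations (rank 1 = most cited on average).
    rank = {a: i + 1 for i, (a, _) in
            enumerate(sorted(avg.items(), key=lambda kv: -kv[1]))}

    print(mpia, tpia, rank)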

Considerable attention has been paid to author-related factors in articles beyond the four we review here; reference (3) cites studies that examine other effects such as nationality, gender, and alphabetical ordering of author names.

Table 2 summarizes the effect of the author-based features available before publication. The key thing to notice is that the earliest study made no use of author information, while the latest and most accurate article tried many author-based features.

 

Feature                     | Callaham 2002 (1) | Kulkarni 2007 (2) | Lokker 2008 (3) | Yan 2012 (4)
----------------------------|-------------------|-------------------|-----------------|-------------
# authors                   |                   | 20.3, p<.001*     | .087, p<.001*   | .056
Nation of first author      |                   |                   | .037, p=.762    |
Author rank (by citations)  |                   |                   |                 | .593
h-index                     |                   |                   |                 | .244
MPIA (Max Past Influence)   |                   |                   |                 | .585
TPIA (Total Past Influence) |                   |                   |                 | .048
Productivity                |                   |                   |                 | .198
Sociality                   |                   |                   |                 | .249
Authority                   |                   |                   |                 | .155
Versatility                 |                   |                   |                 | .160
Recency                     |                   |                   |                 | .101

Table 2 - Author-Based Features Available Pre-Publication

 

2.3 Venue

The only statistically significant variable found in the first study (1) was the impact factor of the journal in which the article was published. This was an early indication of the power of the venue in determining future citation counts. If we know the journal the article will be published in, we can make more confident predictions about its eventual citation count.

The second study (2) considered only three of the top medical journals – JAMA, NEJM, and The Lancet. Nevertheless, it found a significant difference in citation rates between articles in those publications.

The third study did not use the impact factor, because it was not available for all of their content sources. They found other measures that also reflect the article's venue. The strongest were the number of databases that index the journal, and the proportion of the journal's articles which are abstracted within two months by Abstracting and Indexing (A&I) services and synoptic journals.
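
Both venue measures are straightforward to derive from indexing records. A minimal sketch, with hypothetical coverage data standing in for real A&I records:

    # Sketch of the two venue measures reported in (3): the number of
    # databases indexing a journal, and the proportion of its articles
    # abstracted within two months (~60 days). All records here are
    # hypothetical stand-ins.
    journal_databases = {"J Clin Foo": {"MEDLINE", "EMBASE", "Scopus"}}

    # (journal, article_id, days until first abstracted, or None if never)
    abstracting = [
        ("J Clin Foo", "a1", 35),
        ("J Clin Foo", "a2", 90),    # abstracted, but after two months
        ("J Clin Foo", "a3", None),  # never abstracted
        ("J Clin Foo", "a4", 10),
    ]

    def venue_features(journal):
        n_dbs = len(journal_databases.get(journal, set()))
        days = [d for (j, _, d) in abstracting if j == journal]
        within = sum(1 for d in days if d is not None and d <= 60)
        proportion = within / len(days) if days else 0.0
        return n_dbs, proportion

    print(venue_features("J Clin Foo"))  # (3, 0.5)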

Table 3 summarizes the effect of the venue-based features available before publication. Note that no feature is used in more than one study. Curiously, impact factor was the only significant feature found in (1), yet it was not used in the later studies. Perhaps the most surprising outcome summarized in this table is the strong effect of the venues chosen by secondary publication sources like databases, A&I services, and synoptic journals. Given the concerns we all have about infoglut, it is both interesting to see the strength of this effect and concerning that it does not seem to have been featured in any previous altmetric studies. More research in this direction seems justified.

 

Feature                              | Callaham 2002 (1)                                | Kulkarni 2007 (2) | Lokker 2008 (3) | Yan 2012 (4)
-------------------------------------|--------------------------------------------------|-------------------|-----------------|-------------
Impact factor of publishing journal  | Strongest factor, relative contribution = 100%*  |                   |                 |
Accepted for presentation at meeting | 5.5%                                             |                   |                 |
Journal                              |                                                  | 16.3, p<.001*     |                 |
Month of Publication                 |                                                  | 0.7, p=.5         |                 |
Proportion of articles abstracted    |                                                  |                   | 8.18, p<.001*   |
# databases indexing                 |                                                  |                   | .039, p<.001*   |
Venue Rank                           |                                                  |                   |                 | .337
Venue Centrality                     |                                                  |                   |                 | .049
Max Past Influence of Venue          |                                                  |                   |                 | .329
Total Past Influence of Venue        |                                                  |                   |                 | .023

Table 3 - Venue-Based Features Available Pre-Publication

 

3 Prediction Features for Newly Published Articles

By publication time, we know many facts about the Content, Author, and Venue. In the newly published phase of the article's lifecycle we shift our attention to early perceptions of the quality of the article, and to early indications of the use of the article.

The previous section showed that venues whose articles were frequently selected for abstraction tended to have more highly cited articles. For a single article, the number of times it is abstracted is also a statistically significant predictor (3) which is not available until shortly after publication. That study also showed that articles which were judged "clinically relevant" by the staff of a recommendation service were significantly more likely to have more citations in the future. These results are notable for the same reason as the venue results in the previous section – secondary publication sources have a predictive effect which is not being captured in current altmetrics.

There are many features that could give us early indications of how often articles are being used, or of the perceptions that early users have of them (a sketch of assembling these signals follows the list). They include:

  • Preprint access counts from arXiv, etc.
  • General Social Media mentions (Twitter, Facebook, …)
  • Scientific Social Media mentions (Mendeley, del.icio.us, CiteULike, …)
  • Sentiments expressed in early mentions
  • Early download counts from services like ScienceDirect
  • Early citations of the article shown in services like Scopus
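
As a concrete illustration, here is a minimal sketch of an early-signal feature vector for one article. Every field and count is a hypothetical stand-in; a real pipeline would query each service's API for events within a fixed window after publication.

    # Sketch of an early post-publication feature vector for one article.
    # All counts below are invented; a real pipeline would query each
    # service (arXiv, Twitter, Mendeley, ScienceDirect, Scopus, ...) for
    # events in the first weeks after publication.
    from dataclasses import dataclass

    @dataclass
    class EarlySignals:
        preprint_downloads: int    # e.g. arXiv access counts
        social_mentions: int       # Twitter, Facebook, ...
        reference_saves: int       # Mendeley, CiteULike, ...
        positive_sentiment: float  # share of early mentions judged positive
        full_text_downloads: int   # e.g. ScienceDirect
        early_citations: int       # e.g. Scopus, within the window

        def as_vector(self):
            return [self.preprint_downloads, self.social_mentions,
                    self.reference_saves, self.positive_sentiment,
                    self.full_text_downloads, self.early_citations]

    x = EarlySignals(210, 17, 42, 0.8, 390, 2)
    print(x.as_vector())  # one input row for a citation-prediction model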

These features were not used in the four studies, but there is good reason to believe that they will be useful in predicting future citation counts. As mentioned in (3):

 “Thirty three per cent of the variance in citation counts of BMJ articles were found to be based on counts of online hits and number of pages (5).”

Table 4 shows the effect of features available shortly after the article is published. The most noticeable aspect of this table is that very few post-publication features were used in the studies other than (3).

 

Feature                                        | Callaham 2002 (1) | Kulkarni 2007 (2) | Lokker 2008 (3) | Yan 2012 (4)
-----------------------------------------------|-------------------|-------------------|-----------------|-------------
Newsworthiness Score                           | 26%               | 13.5, p<.001*     | .133, p=.161    |
Abstracted in evidence-based medicine journals |                   |                   | .839, p<.001*   |
Clinical Relevance score                       |                   |                   | .418, p<.001*   |
# disciplines rating the article               |                   |                   | .038, p=.371    |
Time to article being rated                    |                   |                   | -.009, p=.513   |
# views or alerts sent                         |                   |                   | -.069, p=.938   |

Table 4 - Features Available in First Months of Publication

 

4 Prediction Features for Mature Articles

The fourth article (4) looked at temporal factors such as the age of the article, as well as regression constants to model the growth and decay of citation rates over time. These results were not strong, and the other studies did not look at features for mature articles, so a summary table is not provided. While none of the studies made significant use of features that become available later in the publication lifecycle, there is no shortage of possibilities. For example, we might look at a PageRank-like scoring of the influence of the papers citing the particular paper of interest.
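
To illustrate that idea, here is a minimal power-iteration PageRank over a toy citation graph. The graph, damping factor, and iteration count are arbitrary illustrative choices, not taken from any of the four studies.

    # Toy PageRank over a citation graph: papers "cast votes" for the
    # papers they cite, and influential citers pass on more influence.
    # The graph here is made up for illustration.
    import numpy as np

    papers = ["A", "B", "C", "D"]
    cites = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["C"]}

    n = len(papers)
    idx = {p: i for i, p in enumerate(papers)}

    # Column-stochastic link matrix: M[j, i] is the probability of moving
    # from paper i to a paper j it cites (dangling nodes jump uniformly).
    M = np.zeros((n, n))
    for p, outs in cites.items():
        if outs:
            for q in outs:
                M[idx[q], idx[p]] = 1.0 / len(outs)
        else:
            M[:, idx[p]] = 1.0 / n

    d = 0.85                 # damping factor
    r = np.full(n, 1.0 / n)  # initial uniform scores
    for _ in range(100):     # power iteration
        r = (1 - d) / n + d * M @ r

    print(dict(zip(papers, r.round(3))))  # "C", the most cited, scores highest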

Nevertheless, the short story is clear. By the time an article is a few months old, we can make good predictions of its likelihood of future citations – especially for those articles which end up being highly cited. Lokker noted that for the papers with the highest citation counts at two years after publication, "Cited articles in the top half and top third were predicted with 83% and 61% sensitivity and 72% and 82% specificity" (3). In other words, only about 17% of the papers which ended up in the most-cited half were not predicted to be there.
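
To unpack those figures: sensitivity is the share of truly highly cited articles that the model flags, and specificity is the share of the rest that it correctly leaves unflagged. A small worked example, with confusion-matrix counts invented to match the reported top-half performance:

    # Sensitivity and specificity for a "will be in the top half by
    # citations" prediction. The counts are invented to match the ~83%
    # sensitivity / ~72% specificity reported in (3).
    tp, fn = 83, 17  # highly cited: correctly flagged vs missed
    tn, fp = 72, 28  # not highly cited: correctly skipped vs false alarms

    sensitivity = tp / (tp + fn)  # 0.83: share of highly cited caught
    specificity = tn / (tn + fp)  # 0.72: share of others correctly excluded
    missed = fn / (tp + fn)       # 0.17: highly cited papers not flagged

    print(sensitivity, specificity, missed)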

 

5 Conclusion

Despite low performance in early studies (14% in 2002), it has become clear over time that it is possible to make good predictions (92% in 2012) of the frequency of future citations. How was this advance achieved? Quite simply, the features being used in the later studies are very different from those used in the earliest ones. The early studies tried to use features around the content, but later work found those to be the weakest while features around the Author and Venue were the most predictive. If we set the power of the Author features to 1.0, the relative power of the Venue and Content features would be about .63 and .25, respectively. We cannot directly compare results across columns, and it is not safe to predict the accuracy any new study might achieve. All of the studies used different domains of literature, predictions over different time periods, different statistical measures, etc. Nevertheless, the pattern seems clear.

It is also interesting, and mildly reassuring, to see that the strongest of these measures operate, to some degree, in a manner independent of each other. Author and Venue are the two most predictive features. However, selecting an article for a journal is usually done in a peer review process that is blind to the identity of the author. Note that this also means these measures are not well-suited for an editorial board to choose articles, since the Venue would be constant and they could not look at the author's publication rank.

In a perfect world, the content of an article would determine its future citation count. We do not, however, have any easily-computed metric for the intrinsic quality and merit of an article. This is where Lokker's results about the importance of secondary sources such as the databases and synoptic journals are most interesting. We see that in the absence of reliable, easily-computed metrics, the subjective human-in-the-loop procedures of peer review, editorial boards, selection for secondary publications, and scientific reputation provide existing mechanisms which fill that void. This provides a potential area of altmetric research to obtain such measures in various fields and compare them with current altmetrics for a variety of purposes.

 

References

(1) Callaham, M., Wears, R.L. & Weber, E. (2002) "Journal prestige, publication bias, and other characteristics associated with citation of published studies in peer-reviewed journals", JAMA, Vol. 287, No. 21, pp. 2847-2850. Available at: http://www.ncbi.nlm.nih.gov/pubmed/12038930/

(2) Kulkarni, A.V., Busse, J.W. & Shams, I. (2007) "Characteristics associated with citation rate of the medical literature", PLOS ONE, Vol. 2, e403. Available at: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000403

(3) Lokker, C., McKibbon, K.A., McKinlay, R.J., Wilczynski, N.L. & Haynes, R.B. (2008) "Prediction of citation counts for clinical articles at two years using data available within three weeks of publication: retrospective cohort study", BMJ, Vol. 336, No. 7645, pp. 655-657. Available at: http://dx.doi.org/10.1136/bmj.39482.526713.BE

(4) Yan, R., Huang, C., Tang, J., Zhang, Y. & Li, X. (2012) "To Better Stand on the Shoulder of Giants", Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL '12). Available at: http://keg.cs.tsinghua.edu.cn/jietang/publications/JCDL12-Yan-et-al-To-Better-Stand-on-the-Shoulder-of-Giants.pdf

(5) Perneger, T.V. (2004) "Relation between online 'hit counts' and subsequent citations: prospective study of research papers in the BMJ", BMJ, Vol. 329, No. 7465, pp. 546-547. Available at: http://www.bmj.com/content/329/7465/546

A brief history of altmetrics

In this opening piece Mike Thelwall discusses the history of altmetrics and their value and potential uses.



“No one can read everything. We rely on filters to make sense of the scholarly literature, but the narrow, traditional filters are being swamped. However, the growth of new, online scholarly tools allows us to make new filters; these altmetrics reflect the broad, rapid impact of scholarship in this burgeoning ecosystem. We call for more tools and research based on altmetrics.” (1)

The above manifesto signaled the birth of altmetrics. It grew from the recognition that the social web provided opportunities to create new metrics for the impact or use of scholarly publications. These metrics could help scholars find important articles and perhaps also evaluate the impact of their articles. At the time there was already a field with similar goals, webometrics, which had created a number of indicators from the web for scholars (e.g., 2) and scholarly publications (e.g., 3), including genre-specific indicators, such as syllabus mentions (4). Moreover, article download indicators (e.g., 5) had also been previously investigated. Nevertheless, altmetrics have been radically more successful because of the wide range of social web services that could be harnessed, from Twitter to Mendeley, and because of the ease with which large-scale data could be automatically harvested from the social web through Application Programming Interfaces (APIs). Academic research with multiple different approaches is needed to evaluate their value, however (6).

 

1 Scholarly use of the social web

Some research has investigated how scholars use social web services, giving insights into the kinds of activities that altmetrics might reflect. In some cases the answers seem straightforward; for example Mendeley is presumably used to store the academic references that users are interested in – perhaps articles that they have previously read or articles that they plan to read. Counts of article “Readers” in Mendeley might therefore be similar to citation counts in the sense that they could reflect the impact of an article. Mendeley has the advantage that its metrics could be available sooner than traditional citations, since there is no publication delay, and its user base is presumably wider than just publishing scientists. Nevertheless, there are biases, such as towards more junior researchers (7).

In comparison to Mendeley, Twitter has a wider user base and a wider range of potential uses. Nevertheless, it seems that only a minority of articles get tweeted – for example, perhaps as few as 10% of PubMed articles in the Web of Science 2010-2012 have been tweeted (8). Scholars seem to use Twitter to cite articles, but sometimes indirectly (9), which may cause problems for automatically harvesting these citations. Moreover, most tweet (link) citations seem to be relatively trivial in the sense of echoing an article title or a brief summary rather than critically engaging with it (10). There are also disciplinary differences in the extent to which Twitter is used and what it is used for (11) and so, as with citations, Twitter altmetrics should not be used to compare between fields. Another problem is that users may also indicate awareness of others’ work by tweeting to them or tweeting about their ideas without citing specific publications (12).

 

2 Evidence for the value of altmetrics

If article-level altmetrics are to be useful in directing potential readers to the more important articles in their field, then evidence would be needed to show that articles with higher altmetric scores tended to be, in general, more useful to read. It would be difficult to get direct empirical verification, however, since data from readers about many articles would be needed to cross-reference with altmetric scores. Perhaps the most practical way to demonstrate the value of an altmetric, then, is to show that it can be used to predict the number of future citations to articles, since citations are an established indicator of article impact, at least at the statistical level (more cited articles within a field tend to be more highly regarded by scholars, e.g., 13), even though there are many individual examples of articles for which citations are not a good guide to their value. This has been done for tweets to one online medical journal (14) and for citations in research blogs (15). This approach has double value because it shows that altmetric scores are not random but associate with an established (albeit controversial) impact measure, and also shows that altmetrics can give earlier evidence of impact than can citation counts.

A second way of getting evidence of the value of altmetrics is to show that their values correlate with citation counts, without demonstrating that the former preceded the latter (of course, correlation does not imply causation and a lack of correlation does not imply worthlessness, but a correlation does imply a relationship with citation impact or at least some of the factors that cause citation impact). This gives some evidence of the validity of altmetrics as an impact indicator but not of their value as an early impact indicator. For example, a study showed that the number of Mendeley readers of articles in the Science and Nature magazines correlated with their citations, but did not prove that Mendeley reader data was available before citation counts (16).

Although the above studies provide good evidence that some altmetrics could have value as impact indicators for a small number of journals, larger scale studies are needed to check additional indicators and a wider range of journals in order to get more general evidence. In response, a large-scale study investigated 11 different altmetrics and up to 208,739 PubMed articles for evidence of a relationship between citations and altmetric scores gathered for 18 months from July 2011. The study found most altmetrics to have a statistically significant positive (Spearman) correlation with citations, but one that was too small to be of practical significance (below 0.1). The exceptions were blogs (0.201), research highlights (0.373) and Twitter (-0.190). The reason for the negative correlation for Twitter, and perhaps also for the low correlations in many other cases, could be the rapid increase in citing academic articles in social media, leading to more recent articles being more mentioned even though they were less cited. This suggests that, in most cases, altmetrics have little value for comparing articles published at different points in time, even within the same year.

To assess the ability of altmetrics to differentiate between articles published at the same time and in the same journal, the study ran a probabilistic test for up to 1,891 journals per metric to see whether more cited articles tended to have higher altmetric scores, benchmarking against approximately contemporary articles from the same journal. The results gave statistical evidence of an association between higher altmetric scores and citations for most of the metrics for which sufficient data was available (Twitter, Facebook, research highlights, blogs, mainstream media, forums) (17). In summary, it seems that although many altmetrics may have value as indicators of impact, differences over time are critical, and so altmetrics need to be normalized in some way to allow valid comparisons over time, or they should only be used to compare articles published at the same time (exception: blogs and research highlights).
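
Rank-correlation tests of this kind are straightforward to reproduce. A minimal sketch using a Spearman correlation on invented data (the actual study used up to 208,739 PubMed articles per metric):

    # Sketch of the kind of rank-correlation test used in (17): do
    # articles with higher altmetric scores tend to have more citations?
    # The data below are invented stand-ins for real article samples.
    import numpy as np
    from scipy.stats import spearmanr

    rng = np.random.default_rng(1)
    citations = rng.poisson(10, 200)

    # Hypothetical altmetric weakly tied to citations plus heavy noise.
    tweets = np.maximum(0, citations // 5 + rng.poisson(3, 200) - 3)

    rho, p = spearmanr(citations, tweets)
    print(f"Spearman rho = {rho:.3f}, p = {p:.3g}")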

 

3 Other uses for altmetrics

Altmetrics also have the potential to be used as impact indicators for individual researchers based upon their web presences, although this information should not be used as a primary source of impact information since the extent to which academics possess or exploit social web profiles is variable (e.g., 18; 19; 20). More widely, however, altmetrics should not be used to help evaluate academics for anything important, unless perhaps as complementary measures, because of the ease with which they can be manipulated. In particular, since social websites tend to have no quality control and no formal process to link users to offline identities, it would be easy to systematically generate high altmetric scores for any given researcher or set of articles.

A promising future direction for research is to harness altmetrics in new ways in order to gain insights into aspects of research that were previously difficult to get data about, such as the extent to which articles from a field attract readerships from other fields (21) or the value of social media publicity for articles (22). Future research also needs to investigate disciplinary differences in the validity and value of different types of altmetrics. Currently it seems that most articles don’t get mentioned in the social web in a way that can be easily identified for use in altmetrics (e.g., 23), but this may change in the future.

4 References

(1) Priem, J., Taraborelli, D., Groth, P. & Neylon, C. (2010) “Altmetrics: A manifesto”, http://altmetrics.org/manifesto/
(2) Cronin, B., Snyder, H.W., Rosenbaum, H., Martinson, A. & Callahan, E. (1998) “Invoked on the Web”, Journal of the American Society for Information Science, Vol. 49, No. 14, pp. 1319-1328.
(3) Vaughan, L. & Shaw, D. (2003) “Bibliographic and web citations: what is the difference?”, Journal of the American Society for Information Science and Technology, Vol.54, No. 14, pp. 1313-1322.
(4) Kousha, K. & Thelwall, M. (2008) “Assessing the impact of disciplinary research on teaching: An automatic analysis of online syllabuses”, Journal of the American Society for Information Science and Technology, Vol. 59, No. 13, pp. 2060-2069.
(5) Shuai, X., Pepe, A., & Bollen, J. (2012) “How the scientific community reacts to newly submitted preprints: Article downloads, Twitter mentions, and citations”, PLOS ONE, Vol. 7 No. 11, e47523.
(6) Sud, P. & Thelwall, M. (2014) “Evaluating altmetrics”, Scientometrics, Vol. 98, No. 2, pp. 1131-1143.
(7) Mohammadi, E., Thelwall, M., Haustein, S. & Larivière, V. (in press) “Who reads research articles? An altmetrics analysis of Mendeley user categories”, Journal of the Association for Information Science and Technology.
(8) Haustein, S., Peters, I., Sugimoto, C.R., Thelwall, M. & Larivière, V. (in press) “Tweeting biomedicine: An analysis of tweets and citations in the biomedical literature”, Journal of the Association for Information Science and Technology.
(9) Priem, J., & Costello, K.L. (2010) “How and why scholars cite on Twitter”, Proceedings of the American Society for Information Science and Technology, Vol. 47, pp. 1-4.
(10) Thelwall, M., Tsou, A., Weingart, S., Holmberg, K. & Haustein, S. (2013) “Tweeting links to academic articles”, Cybermetrics: International Journal of Scientometrics, Informetrics and Bibliometrics, Vol. 17, No. 1, paper 1.
(11) Holmberg, K. & Thelwall, M. (in press) “Disciplinary differences in Twitter scholarly communication”, Scientometrics.
(12) Weller, K., Dröge, E. & Puschmann, C. (2011) “Citation analysis in Twitter: Approaches for defining and measuring information flows within tweets during scientific conferences”, In Proceedings of Making Sense of Microposts Workshop (# MSM2011).
(13) Franceschet, M. & Costantini, A. (2011) “The first Italian research assessment exercise: A bibliometric perspective”, Journal of Informetrics, Vol. 5, No. 2, pp. 275-291.
(14) Eysenbach, G. (2011) “Can tweets predict citations? Metrics of social impact based on Twitter and correlation with traditional metrics of scientific impact”, Journal of Medical Internet Research, Vol.13, No. 4, e123.
(15) Shema, H., Bar-Ilan, J. & Thelwall, M. (2014) “Do blog citations correlate with a higher number of future citations? Research blogs as a potential source for alternative metrics”, Journal of the Association for Information Science and Technology, Vol. 65, No. 5, pp. 1018–1027.
(16) Li, X., Thelwall, M. & Giustini, D. (2012) “Validating online reference managers for scholarly impact measurement”, Scientometrics, Vol. 91, No. 2, pp. 461-471.
(17) Thelwall, M., Haustein, S., Larivière, V. & Sugimoto, C. (2013) “Do altmetrics work? Twitter and ten other candidates”, PLOS ONE, Vol. 8, No. 5, e64841. doi:10.1371/journal.pone.0064841
(18) Bar-Ilan, J., Haustein, S., Peters, I., Priem, J., Shema, H. & Terliesner, J. (2012) “Beyond citations: Scholars' visibility on the social Web”, Proceedings of 17th International Conference on Science and Technology Indicators (pp. 98-109), Montréal: Science-Metrix and OST.
(19) Haustein, S., Peters, I., Bar-Ilan, J., Priem, J., Shema, H. & Terliesner, J. (in press) “Coverage and adoption of altmetrics sources in the bibliometric community”, Scientometrics.
(20) Mas Bleda, A., Thelwall, M., Kousha, K. & Aguillo, I. (2013) “European highly cited scientists’ presence in the social web”, In 14th International Society of Scientometrics and Informetrics Conference (ISSI 2013) (pp. 98-109).
(21) Mohammadi, E. & Thelwall, M. (in press) “Mendeley readership altmetrics for the social sciences and humanities: Research evaluation and knowledge flows”, Journal of the Association for Information Science and Technology.
(22) Allen, H.G., Stanton, T.R., Di Pietro, F. & Moseley, G.L. (2013) "Social media release increases dissemination of original articles in the clinical pain sciences", PLOS ONE, Vol. 8, No. 7, e68914.
(23) Zahedi, Z., Costas, R. & Wouters, P. (in press) “How well developed are Altmetrics? Cross-disciplinary analysis of the presence of “alternative metrics” in scientific publications”, Scientometrics.
VN:F [1.9.22_1171]
Rating: 0.0/10 (0 votes cast)

“No one can read everything. We rely on filters to make sense of the scholarly literature, but the narrow, traditional filters are being swamped. However, the growth of new, online scholarly tools allows us to make new filters; these altmetrics reflect the broad, rapid impact of scholarship in this burgeoning ecosystem. We call for more tools and research based on altmetrics. (1)

The above manifesto signaled the birth of altmetrics. It grew from the recognition that the social web provided opportunities to create new metrics for the impact or use of scholarly publications. These metrics could help scholars find important articles and perhaps also evaluate the impact of their articles. At the time there was already a field with similar goals, webometrics, which had created a number of indicators from the web for scholars (e.g., 2) and scholarly publications (e.g., 3), including genre-specific indicators, such as syllabus mentions (4). Moreover, article download indicators (e.g., 5) had also been previously investigated. Nevertheless, altmetrics have been radically more successful because of the wide range of social web services that could be harnessed, from Twitter to Mendeley, and because of the ease with which large scale data could be automatically harnessed from the social web through Applications Programming Interfaces (APIs). Academic research with multiple different approaches is needed to evaluate their value, however (6).

 

1 Scholarly use of the social web

Some research has investigated how scholars use social web services, giving insights into the kinds of activities that altmetrics might reflect. In some cases the answers seem straightforward; for example Mendeley is presumably used to store the academic references that users are interested in – perhaps articles that they have previously read or articles that they plan to read. Counts of article “Readers” in Mendeley might therefore be similar to citation counts in the sense that they could reflect the impact of an article. Mendeley has the advantage that its metrics could be available sooner than traditional citations, since there is no publication delay, and its user base is presumably wider than just publishing scientists. Nevertheless, there are biases, such as towards more junior researchers (7).

In comparison to Mendeley, Twitter has a wider user base and a wider range of potential uses. Nevertheless, it seems that only a minority of articles get tweeted – for example, perhaps as few as 10% of PubMed articles in the Web of Science 2010-2012 have been tweeted (8). Scholars seem to use Twitter to cite articles, but sometimes indirectly (9), which may cause problems for automatically harvesting these citations. Moreover, most tweet (link) citations seem to be relatively trivial in the sense of echoing an article title or a brief summary rather than critically engaging with it (10). There are also disciplinary differences in the extent to which Twitter is used and what it is used for (11) and so, as with citations, Twitter altmetrics should not be used to compare between fields. Another problem is that users may also indicate awareness of others’ work by tweeting to them or tweeting about their ideas without citing specific publications (12).

 

2 Evidence for the value of altmetrics

If article level altmetrics are to be useful to help direct potential readers to the more important articles in their field then evidence would be needed to show that articles with higher altmetric scores tended to be, in general, more useful to read. It would be difficult to get direct empirical verification, however, since data from readers about many articles would be needed to cross-reference with altmetric scores. Perhaps the most practical way to demonstrate the value of an altmetric is to show that it can be used to predict the number of future citations to articles, however, since citations are an established indicator of article impact, at least at the statistical level (more cited articles within a field tend to be more highly regarded by scholars, e.g., 13), even though there are many individual examples of articles for which citations are not a good guide to their value. This has been done for tweets to one online medical journal (14) and for citations in research blogs (15). This approach has double value because it shows that altmetric scores are not random but associate with an established (albeit controversial) impact measure and also shows that altmetrics can give earlier evidence of impact than can citation counts.

A second way of getting evidence of the value of altmetrics is to show that their values correlate with citation counts, without demonstrating that the former preceded the latter (of course, correlation does not imply causation and a lack of correlation does not imply worthlessness, but a correlation does imply a relationship with citation impact, or at least with some of the factors that cause citation impact). This gives some evidence of the validity of altmetrics as an impact indicator, but not of their value as an early impact indicator. For example, one study showed that the number of Mendeley readers of articles in the journals Science and Nature correlated with their citations, but did not demonstrate that the Mendeley reader data were available before the citation counts (16).

Although the above studies provide good evidence that some altmetrics could have value as impact indicators for a small number of journals, larger scale studies are needed to check additional indicators and a wider range of journals in order to get more general evidence. In response, a large-scale study investigated 11 different altmetrics and up to 208,739 PubMed articles for evidence of a relationship between citations and altmetric scores gathered over 18 months from July 2011. The study found most altmetrics to have a statistically significant positive (Spearman) correlation with citations, but one that was too small to be of practical significance (below 0.1). The exceptions were blogs (0.201), research highlights (0.373) and Twitter (-0.190). The reason for the negative correlation for Twitter, and perhaps also for the low correlations in many other cases, could be the rapid increase in citing academic articles in social media, leading to more recent articles being mentioned more even though they were cited less. This suggests that, in most cases, altmetrics have little value for comparing articles published at different points in time, even within the same year. To assess the ability of altmetrics to differentiate between articles published at the same time and in the same journal, the study ran a probabilistic test for up to 1,891 journals per metric to see whether more cited articles tended to have higher altmetric scores, benchmarking against approximately contemporary articles from the same journal. The results gave statistical evidence of an association between higher altmetric scores and citations for most of the metrics for which sufficient data were available (Twitter, Facebook, research highlights, blogs, mainstream media, forums) (17). In summary, it seems that although many altmetrics may have value as indicators of impact, differences over time are critical: altmetrics either need to be normalized in some way to allow valid comparisons over time, or they should only be used to compare articles published at the same time (blogs and research highlights being the apparent exceptions).
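The within-journal benchmarking idea can be sketched as follows. This is loosely modelled on the test described above, not a reproduction of it, and the article pairs are hypothetical:

```python
# Pair each article with a near-contemporary from the same journal and
# ask how often the article with the higher altmetric score also has
# more citations. Under the null of no association, ~50% is expected.
from scipy.stats import binomtest

# Each tuple: (altmetric_a, citations_a, altmetric_b, citations_b)
# for two roughly contemporary articles from the same journal.
pairs = [(5, 9, 1, 2), (0, 3, 4, 8), (7, 1, 2, 6), (3, 10, 0, 4)]  # hypothetical

# Keep only pairs where both measures actually differ.
informative = [(a, ca, b, cb) for a, ca, b, cb in pairs if a != b and ca != cb]
agreements = sum((a > b) == (ca > cb) for a, ca, b, cb in informative)

result = binomtest(agreements, n=len(informative), p=0.5, alternative="greater")
print(f"{agreements}/{len(informative)} pairs agree; p = {result.pvalue:.3f}")
```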

 

3 Other uses for altmetrics

Altmetrics also have the potential to be used as impact indicators for individual researchers based upon their web presences, although this information should not be used as a primary source of impact information, since the extent to which academics possess or exploit social web profiles is variable (e.g., 18; 19; 20). More widely, however, altmetrics should not be used to help evaluate academics for anything important, except perhaps as complementary measures, because of the ease with which they can be manipulated. In particular, since social websites tend to have no quality control and no formal process to link users to offline identities, it would be easy to systematically generate high altmetric scores for any given researcher or set of articles.

A promising future direction for research is to harness altmetrics in new ways in order to gain insights into aspects of research that were previously difficult to get data about, such as the extent to which articles from a field attract readerships from other fields (21) or the value of social media publicity for articles (22). Future research also needs to investigate disciplinary differences in the validity and value of different types of altmetrics. Currently it seems that most articles don’t get mentioned in the social web in a way that can be easily identified for use in altmetrics (e.g., 23), but this may change in the future.

4 References

(1) Priem, J., Taraborelli, D., Groth, P. & Neylon, C. (2010) “Altmetrics: A manifesto”, http://altmetrics.org/manifesto/
(2) Cronin, B., Snyder, H.W., Rosenbaum, H., Martinson, A. & Callahan, E. (1998) “Invoked on the Web”, Journal of the American Society for Information Science, Vol. 49, No. 14, pp. 1319-1328.
(3) Vaughan, L. & Shaw, D. (2003) “Bibliographic and web citations: what is the difference?”, Journal of the American Society for Information Science and Technology, Vol. 54, No. 14, pp. 1313-1322.
(4) Kousha, K. & Thelwall, M. (2008) “Assessing the impact of disciplinary research on teaching: An automatic analysis of online syllabuses”, Journal of the American Society for Information Science and Technology, Vol. 59, No. 13, pp. 2060-2069.
(5) Shuai, X., Pepe, A. & Bollen, J. (2012) “How the scientific community reacts to newly submitted preprints: Article downloads, Twitter mentions, and citations”, PLOS ONE, Vol. 7, No. 11, e47523.
(6) Sud, P. & Thelwall, M. (2014) “Evaluating altmetrics”, Scientometrics, Vol. 98, No. 2, pp. 1131-1143.
(7) Mohammadi, E., Thelwall, M., Haustein, S. & Larivière, V. (in press) “Who reads research articles? An altmetrics analysis of Mendeley user categories”, Journal of the Association for Information Science and Technology.
(8) Haustein, S., Peters, I., Sugimoto, C.R., Thelwall, M. & Larivière, V. (in press) “Tweeting biomedicine: An analysis of tweets and citations in the biomedical literature”, Journal of the Association for Information Science and Technology.
(9) Priem, J., & Costello, K.L. (2010) “How and why scholars cite on Twitter”, Proceedings of the American Society for Information Science and Technology, Vol. 47, pp. 1-4.
(10) Thelwall, M., Tsou, A., Weingart, S., Holmberg, K. & Haustein, S. (2013) “Tweeting links to academic articles”, Cybermetrics: International Journal of Scientometrics, Informetrics and Bibliometrics, Vol. 17, No. 1, paper 1.
(11) Holmberg, K. & Thelwall, M. (in press) “Disciplinary differences in Twitter scholarly communication”, Scientometrics.
(12) Weller, K., Dröge, E. & Puschmann, C. (2011) “Citation analysis in Twitter: Approaches for defining and measuring information flows within tweets during scientific conferences”, In Proceedings of the Making Sense of Microposts Workshop (#MSM2011).
(13) Franceschet, M. & Costantini, A. (2011) “The first Italian research assessment exercise: A bibliometric perspective”, Journal of Informetrics, Vol. 5, No. 2, pp. 275-291.
(14) Eysenbach, G. (2011) “Can tweets predict citations? Metrics of social impact based on Twitter and correlation with traditional metrics of scientific impact”, Journal of Medical Internet Research, Vol. 13, No. 4, e123.
(15) Shema, H., Bar-Ilan, J. & Thelwall, M. (2014) “Do blog citations correlate with a higher number of future citations? Research blogs as a potential source for alternative metrics”, Journal of the Association for Information Science and Technology, Vol. 65, No. 5, pp. 1018–1027.
(16) Li, X., Thelwall, M. & Giustini, D. (2012) “Validating online reference managers for scholarly impact measurement”, Scientometrics, Vol. 91, No. 2, pp. 461-471.
(17) Thelwall, M., Haustein, S., Larivière, V. & Sugimoto, C. (2013) “Do altmetrics work? Twitter and ten other candidates”, PLOS ONE, Vol. 8, No. 5, e64841. doi:10.1371/journal.pone.0064841
(18) Bar-Ilan, J., Haustein, S., Peters, I., Priem, J., Shema, H. & Terliesner, J. (2012) “Beyond citations: Scholars' visibility on the social Web”, Proceedings of 17th International Conference on Science and Technology Indicators (pp. 98-109), Montréal: Science-Metrix and OST.
(19) Haustein, S., Peters, I., Bar-Ilan, J., Priem, J., Shema, H. & Terliesner, J. (in press) “Coverage and adoption of altmetrics sources in the bibliometric community”, Scientometrics.
(20) Mas Bleda, A., Thelwall, M., Kousha, K. & Aguillo, I. (2013) “European highly cited scientists’ presence in the social web”, In 14th International Society of Scientometrics and Informetrics Conference (ISSI 2013) (pp. 98-109).
(21) Mohammadi, E. & Thelwall, M. (in press) “Mendeley readership altmetrics for the social sciences and humanities: Research evaluation and knowledge flows”, Journal of the Association for Information Science and Technology.
(22) Allen, H.G., Stanton, T.R., Di Pietro, F. & Moseley, G.L. (2013) “Social media release increases dissemination of original articles in the clinical pain sciences”, PLOS ONE, Vol. 8, No. 7, e68914.
(23) Zahedi, Z., Costas, R. & Wouters, P. (in press) “How well developed are Altmetrics? Cross-disciplinary analysis of the presence of “alternative metrics” in scientific publications”, Scientometrics.

Celebrating Rare Disease Day – A look into Rare Disease research

In honour of Rare Disease Day, an international advocacy day to help raise awareness for rare diseases, Iris Kisjes investigated publication trends in the field using SciVal. In this piece she highlights some of the key institutions that are active in research on rare diseases.

Read more >


28th February is Rare Disease Day, an international advocacy day to raise public awareness of rare diseases, the challenges encountered by those affected, the importance of research to develop diagnostics and treatments, and the impact of these diseases on patients' lives.

Rare Disease Day was first observed in Europe in 2008. It was established by EURORDIS, the European Rare Disease Organization. In 2009, NORD partnered with EURORDIS in this initiative and sponsored Rare Disease Day in the United States. Since then, the concept has continued to expand beyond the US and Europe. In 2013, more than 70 countries participated.

Rare diseases collectively affect millions of people of all ages globally, including approximately 18-25 million Americans. They are often serious and life-altering; many are life-threatening or fatal. In Europe a disease is considered rare when it affects no more than 5 individuals per 10,000 persons, whereas the US considers a disease to be rare when it affects fewer than 200,000 Americans. Since each of the roughly 7,000 rare diseases affects only a relatively small population, it can be challenging to develop drugs and medical devices to prevent, diagnose, and treat these conditions. In general there is a lack of understanding of the underlying molecular mechanisms, or even the cause, of many rare diseases. Hence, countries across the globe should share experiences and work together to help address these challenges successfully.

Research on these diseases also faces many challenges, as the diseases are often not well defined or characterized. Rarity means that recruitment for trials is usually quite difficult, study populations are widely dispersed, and there are few expert centers for diagnosis, management and research. This is often accompanied by a lack of high-quality evidence to guide treatment.

At Research Trends we were curious to learn more about the subject and to highlight some of the key institutions and authors contributing to rare disease research. To do this we examined publication trends in the field using SciVal. To begin with, we created a research area in SciVal based on the keyword search ‘rare disease’; the keyword search covers the titles, abstracts and keywords of research papers within Scopus.

Setting up a research area in SciVal is fairly simple and can be completed in three easy steps. Figure 1 shows that this search query returned a set of 12,818 publications published from 1996 to the present, with the US, Germany, Japan and France being the most prolific countries.


Figure 1 - Defining a research area in SciVal for 'Rare Disease' research

SciVal is a ready-to-use tool to analyze the world of research. It is based on Scopus data and primarily developed for research organizations to help establish, execute and evaluate their strategies within the context of their peers (through benchmarking) and collaborators (through collaboration networks). The solution also allows users to set up research areas to analyze contributors within the field and their corresponding publication and citation statistics.

The SciVal research area focuses its analysis on the last five years of publications, for which we found the pool of research papers in this area to be fairly small. Over the course of five years (2009-2013), close to 5,900 publications containing ‘rare disease’ were published around the globe, of which less than 20% originated from the US; see Table 1 (data date stamp 21 January 2014). One of the reasons why this set of research papers is small can be attributed to the simplicity of the search terms used. In the approach presented here, relevant articles were selected on the basis of the occurrence of the term ‘rare disease’, without including the names of the roughly 7,000 rare diseases themselves. We can therefore assume that papers related to many rare diseases are not included in this set, as not all papers on specific diseases will include ‘rare disease’ in their title, abstract or keywords. A follow-up study could include search terms for particular rare diseases to provide a more complete picture of this research field.
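As a minimal sketch of how such an expanded query might be built (the field syntax is Scopus' TITLE-ABS-KEY; the disease list is a tiny hypothetical sample of the roughly 7,000 names that would be needed):

```python
# Broaden the simple keyword query by OR-ing in specific disease names.
diseases = ["Pompe disease", "Gaucher disease", "Fabry disease"]  # hypothetical sample

terms = ['"rare disease"'] + [f'"{d}"' for d in diseases]
query = f"TITLE-ABS-KEY({' OR '.join(terms)})"
print(query)
# TITLE-ABS-KEY("rare disease" OR "Pompe disease" OR "Gaucher disease" OR "Fabry disease")
```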

Despite this small sample set, we took a look at the institutions and authors that were most prolific in this area. The most prolific institutions around the globe nearly all originate from either France or the US, with one from Germany. This article therefore focuses on the most prolific institution from each of these countries, namely Université Paris 5, Harvard University and the University of Munich; see Table 1 and Table 2.

Country/Region/Institute   Publications
Worldwide                  5,879
Europe                     2,679
United States              905
Germany                    518
France                     495
Université Paris 5         77
Harvard University         61
University of Munich       45

Table 1 – Scholarly Output for ‘Rare Disease’ research in 2009-2013 at different levels

Institution            Country         Publications   Growth (%)   Citations   Authors   Citations per Publication
Université Paris 5     France          77             23.1         740         11        9.6
INSERM                 France          71             133.3        420         4         5.9
Harvard University     United States   61             0            726         10        11.9
Université Paris 6     France          49             300          312         22        6.4
University of Munich   Germany         45             42.9         206         3         4.6

Table 2 – Most Prolific Institutions for ‘Rare Disease’ research in 2009-2013. Growth (%) refers to growth in publications in this research area over the period.

 

Institutional Collaboration Maps for Harvard, Paris 5 and Munich

Using SciVal we took a closer look at the collaboration patterns in ‘rare disease’ research of three individual institutions: Harvard University, Université Paris 5 and the University of Munich. SciVal allows you to drill down from a worldwide view, to a regional view, and right down to a country-level view of an institution's collaborations.


Figure 2 - Worldwide collaboration by Harvard University in ‘Rare Disease’ research

As can be seen from Figure 2, Harvard University collaborated with 171 institutions worldwide on 39 co-authored publications out of its total of 61, with the majority of the collaborations taking place with authors within the US and Europe. The average number of institutions per co-authored paper is 4.38. In fact, Harvard’s top 20 collaborating institutions in ‘rare disease’ research are mainly from the US; only three involved international cross-border collaboration, namely with institutions in Canada, the UK and Switzerland. Harvard’s general collaboration trends seem in line with those it exhibits within ‘rare disease’ research, where the top collaborators are all national institutions and few institutions from abroad feature in the top 40.


Figure 3 - Worldwide collaboration by Université Paris 5 in ‘Rare Disease’ research

You can see from Figure 3 that Université Paris 5 collaborated with 215 institutions worldwide on 62 co-authored publications out of its 77 total publications, an average of 3.48 institutions per paper. From Figure 4 you can see that the majority of collaborating institutions are from France (72). In fact, there are only three non-French institutions amongst its top 20 ‘rare disease’ collaborators, namely from Canada, the UK and Germany.


Figure 4 - European collaboration by Université Paris 5 in ‘Rare Disease’ research

 


Figure 5 - Worldwide collaboration by University of Munich in 'rare disease' research

Here again, in Figure 5, we find a large number of institutions working on a relatively small number of papers: 73 institutions collaborated on 18 articles, an average of 4 institutions per paper. The University of Munich enters into the most cross-border collaborations, with half of its top 20 collaborating institutions coming from abroad: three from the US, two from the UK, two from the Netherlands, one from Canada and one from the Czech Republic. It seems logical for collaboration to play a central role in ‘rare disease’ research, and cross-border collaboration in particular may be expected to be important in propelling the research forward, mainly because of the small patient numbers in each country.

The overall results show large numbers of institutions collaborating on single papers, and of the three universities investigated, Munich appears in general to be the most internationally focused research organization; see Table 3 for the general collaboration trends for these universities.

University             Single author publications (%)   Institutional collaboration (%)   National collaboration (%)   International collaboration (%)
Harvard University     9.6                               20.8                              31.1                         38.5
Université Paris 5     8.5                               13.3                              40.2                         37.9
University of Munich   8.1                               24.1                              20.7                         47.1

Table 3: General collaboration trends for the three universities investigated
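As a minimal sketch of how the Table 3 categories could be derived from author affiliation data (the classification rule and the affiliation records are our own illustrative assumptions, not SciVal's documented algorithm):

```python
# Classify each paper by the widest level of collaboration it shows,
# then compute the share of papers in each category.
from collections import Counter

def collaboration_type(affiliations):
    """affiliations: one (institution, country) pair per author."""
    institutions = {inst for inst, _ in affiliations}
    countries = {country for _, country in affiliations}
    if len(affiliations) == 1:
        return "single author"
    if len(countries) > 1:
        return "international collaboration"
    if len(institutions) > 1:
        return "national collaboration"
    return "institutional collaboration"  # several co-authors, one institution

papers = [  # hypothetical affiliation records
    [("Harvard", "US")],
    [("Harvard", "US"), ("Harvard", "US")],
    [("Harvard", "US"), ("MIT", "US")],
    [("Harvard", "US"), ("Université Paris 5", "FR")],
]

shares = Counter(collaboration_type(p) for p in papers)
for kind, n in shares.items():
    print(f"{kind}: {100 * n / len(papers):.1f}%")
```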

 

Highlighting the Most Prolific Authors in SciVal

By looking at the most prolific authors in the set, and by clicking through from SciVal to the abstracts in Scopus, we were able to get a better idea of the set of research papers we were looking at.

The most prolific authors in the ‘rare disease’ dataset for 2009-2013 are: Dr. Domenica Taruscio, Director of the National Centre for Rare Diseases, Istituto Superiore di Sanità, Rome, Italy; Dr. Steven Simoens, Department of Pharmaceutical and Pharmacological Sciences, Katholieke Universiteit Leuven, Leuven, Belgium; and Dr. Stephen Groft, Director of the Office of Rare Diseases Research at the National Institutes of Health (see Table 4).

Name           Publications in this Research Area   Citations in this Research Area   Citations per Publication
Taruscio, D.   16                                   37                                2.3
Simoens, S.    14                                   50                                3.6
Groft, S.C.    11                                   53                                4.8

Table 4 – Most Productive Authors in ‘Rare Disease’ research in 2009-2013

Dr. Domenica Taruscio focuses her research on setting up systems that can, firstly, help train and inform clinicians to make the right diagnosis and, secondly, improve the dissemination of information about symptomatic treatments. She has spent the last 30 months on a feasibility study funded by the European Commission (DG Sanco) addressing the regulatory, ethical and technical issues associated with the registration of rare disease patients and with the creation of an EU platform for collecting data on rare disease patients and communicating those data among qualified users.

Dr. Steven Simoens, on the other hand, does not focus his research specifically on rare diseases themselves. He works within pharmaceutical sciences with a keen interest in pharmacoeconomics and ethics, which is where rare diseases come up in his research. His publication rate is very impressive, at about 15 papers per year. ‘Rare disease’ research is a topic of concern for this field because there are debates about the economic rationale for society supporting any part of the rare disease value chain. In the Netherlands, for example, there was a debate in 2012 about the cost of providing medication to patients with Pompe’s disease, which costs between 400,000 and 700,000 Euros per patient per year. The economist Dr. Marc Pomp invoked the concept of the quality-adjusted life year (QALY), in which the costs of medical treatment are weighed against the quality of life they buy for the patient. The level considered acceptable was set at 50,000 Euros per QALY, far below the cost of treating a Pompe patient, even though stopping treatment would most likely cause the patient to die (1).
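As a rough arithmetic illustration of the gap (assuming, for simplicity, that a year of successful treatment preserves one full quality-adjusted life year):

\[
\text{cost per QALY} \;=\; \frac{\text{annual treatment cost}}{\text{QALYs gained per year}}
\;\approx\; \frac{400{,}000 \text{ to } 700{,}000 \text{ EUR}}{1}
\;=\; 8 \text{ to } 14 \text{ times the } 50{,}000 \text{ EUR threshold}
\]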

The third most prolific researcher within our sample, Dr. Stephen Groft, received a lifetime achievement award for his nearly 30 years of service and commitment to advancing research and treatments for the millions of people afflicted by rare and genetic diseases. He is one of the original pioneers in the rare disease arena and is recognized globally as a leader in building collaborative relationships to improve patient treatment and care. Dr. Groft retired on February 8th this year. He was praised for giving thousands of rare disease patients and their families renewed hope and a collective voice. One of the organizations he helped set up is the National Center for Advancing Translational Sciences (NCATS). You can read more about his work at: http://www.ncats.nih.gov/news-and-events/features/groft.html

Each of these three individuals looks at rare diseases in a very different way, though all are in some way interested in the management of the research field, suggesting that our keyword search did indeed omit much of the primary research on specific rare diseases from the sample set. It also shows how much attention needs to be placed on raising the public’s awareness of rare diseases and on building global solidarity to support the patients and families affected by these rare and often severe diseases.

 

References

(1) Van den Brink, R. & Van de Parre, H. (2012) “Advies: Stop met dure medicijnen” [Advice: stop using expensive medicines], NOS, 29 July 2012 (in Dutch), http://nos.nl/artikel/400207-advies-stop-met-dure-medicijnen.html

 

Related links

ORDR - Office of Rare Diseases Research - http://rarediseases.info.nih.gov/about-ordr/pages/30/about-ordr
NORD - The National Organization for Rare Disorders - http://www.rarediseases.org/
EURORDIS - The voice of rare diseases in Europe - http://www.eurordis.org/
JPA - Japan Patients Association - http://www.nanbyo.jp/
European Platform for Rare Diseases Registries - http://www.epirare.eu/
 

The next International Conference for Rare Diseases and Orphan Drugs (ICORD) Annual Meeting will take place on October 8-10, 2014 in the Netherlands. More information will follow at: http://icord.se


Party papers or policy discussions: an examination of highly shared papers using altmetric data

Which scientific stories are most shared on social media networks such as Twitter and Facebook? And do articles attracting social media attention also get the attention of scholars and the mass media? In this article, Dr. Andrew Plume and Mike Taylor provide some answers, but also urge for caution when characterizing altmetric indicators.

Read more >


Which scientific stories are most shared on social media networks such as Twitter and Facebook?

Research on human health and social issues is often perceived as providing the most shared scientific stories on social media networks such as Twitter and Facebook and – given its mainstream appeal – is often suggested to dominate the popular discussion of scholarly research online. Skeptics such as David Colquhoun, however, argue for its irrelevance: “Scientific works get tweeted about mostly because they have titles that contain buzzwords, not because they represent great science” (1).

So which is it to be? And do articles attracting social media attention also get the attention of scholars and the mass media? In this article, we seek to provide an approach to answering these questions.

With the rise of online scholarly publishing, and the concomitant desire to create indicators of online attention to research articles and related outputs, have come a number of providers of article-level data, collectively known as ‘altmetrics’. A leading commercial provider of such data is Altmetric.com, which tracks a variety of different indicators in four broad groups: Social Activity (e.g. tweets and Facebook mentions), Mass Media (e.g. mentions on news sites such as the BBC and CNN), Scholarly Commentary (e.g. mentions in scientific blogs), and Scholarly Activity (e.g. articles saved to reader libraries such as Mendeley).

In terms of the volume of online mentions of scholarly articles, Twitter and other social networks provide by far the largest number of data points. However, given Twitter’s broad user base (the majority being non-academics) and limited information content (being restricted to 140 characters per tweet), other indicators may be more significant in terms of understanding scholarly usage (2). For example, Mendeley and CiteULike are examples of sharing and collaboration platforms used predominantly by researchers, while the mass media and scientific blogs tracked by Altmetric.com are written by professional science journalists or researchers themselves.

 

Methodology

Data were collected from the Altmetric.com API over the four months ending January 17th, 2014. On this date, the latest altmetric indicator data for all papers published in a selection of journals in 2013 with any online mentions captured by Altmetric.com were downloaded for analysis; in total, 13,793 articles with at least one altmetric indicator data point were included in this study. Please note that the journals monitored are detailed in the raw dataset, which is published on figshare.

The Altmetric.com data include counts of online attention at the article level from a variety of different data sources. To simplify the analysis, we aggregated these counts into the four classes defined above: Social Activity, Mass Media, Scholarly Commentary, and Scholarly Activity. For each class, articles were assigned to predefined percentile ranges (cohorts) based on the frequency of online mentions (see Table 1).

Cohort   Number of articles included
0.5%     69
1%       138
2.5%     347
5%       691
7.5%     1,045
10%      1,384
15%      2,095
20%      2,775
25%      3,539
30%      4,332
100%     13,793

Table 1 - Cohorts of articles based on the frequency of online attention within each class.

For example, the 69 papers in the top 0.5% for Social Activity account for 91,470 social actions, 445 mass media mentions, 540 scholarly comments and 1,571 scholarly actions, whereas the 69 papers in the top 0.5% for Mass Media activity account for 2,638 mass media mentions, 16,221 social actions, 779 scholarly comments and 4,856 scholarly actions.
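A minimal sketch of this cohort construction is shown below; the source-to-class mapping and column names are hypothetical, and Altmetric.com's actual field names differ:

```python
# Aggregate per-source counts into the four classes, then rank articles
# within a class by count and select each percentile cohort.
import pandas as pd

CLASS_MAP = {  # hypothetical mapping of sources to classes
    "twitter": "social", "facebook": "social",
    "news": "mass_media",
    "blogs": "scholarly_commentary",
    "mendeley": "scholarly_activity",
}
COHORTS = [0.005, 0.01, 0.025, 0.05, 0.075, 0.10, 0.15, 0.20, 0.25, 0.30, 1.0]

df = pd.DataFrame({  # hypothetical per-article counts
    "twitter": [120, 3, 0, 45], "facebook": [30, 1, 0, 9],
    "news": [4, 0, 0, 2], "blogs": [2, 0, 1, 1], "mendeley": [80, 5, 2, 40],
})
for source, cls in CLASS_MAP.items():
    df[cls] = df.get(cls, 0) + df[source]

# Percentile rank within a class; smaller values = more-mentioned articles.
ranks = df["social"].rank(ascending=False, pct=True)
for cutoff in COHORTS:
    # With only four toy articles the smallest cohorts are empty;
    # the real study has 13,793 articles.
    print(f"top {cutoff:>5.1%}: {len(df[ranks <= cutoff])} articles")
```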

 

Analysis

Headline-grabbers: Which articles got most social media attention in 2013?

Of the 69 articles belonging to the top 0.5% cohort in the Social Activity class (i.e. those articles most frequently mentioned in social media such as Twitter and Facebook), just 8 are full-length articles reporting the results of original research. The remainder are typically editorial features or news items from leading weekly journals such as The Lancet, the BMJ and Nature; see Table 2 for the complete list. The original research articles cover topics in the popular consciousness, including climate change, human health and diet, and online information and privacy: intuitively, the sort of articles one might expect to attract broad popular attention online. One article appears to have a less obvious popular slant (the Nature letter “Attractive photons in a quantum nonlinear medium”), but closer examination shows that it describes a novel technique for forcing photons to interact in a quantum nonlinear medium, which may have applications in quantum processing, where the ability to have photons ‘see’ each other could overcome present technological limitations.

The remaining 61 articles (almost exclusively news and editorial features about original research reported elsewhere) cover a variety of topics, including several close to the heart of the academy: research careers, science funding, the future of higher education and scholarly publishing. The preponderance of items in this group from Nature (primarily the Nature News and Nature News Feature sections) suggests that Social Activity may be more likely to reflect attention to short journalistic accounts of current research than to the original research articles themselves; a worthy follow-up to this study would be to track the variation in performance across altmetric indicator classes of an original research article and the ‘newsworthy’ current-awareness version of the same research.

 

Journal Article title DOI
Nature Cerebral organoids model human brain development and microcephaly 10.1038/nature12517
Nature Comment Climate science: Vast costs of Arctic change 10.1038/499401a
Nature Comment Neuroscience: My life with Parkinson's 10.1038/503029a
Nature Editorial Nuclear error 10.1038/501005b
Nature Editorial Science for all 10.1038/495005a
Nature Letter No increase in global temperature variability despite changing regional patterns 10.1038/nature12310
Nature Letter Attractive photons in a quantum nonlinear medium 10.1038/nature12512
Nature News Brazilian citation scheme outed 10.1038/500510a
Nature News Half of 2011 papers now free to read 10.1038/500386a
Nature News World's slowest-moving drop caught on camera at last 10.1038/nature.2013.13418
Nature News Genetically modified crops pass benefits to weeds 10.1038/nature.2013.13517
Nature News NSF cancels political-science grant cycle 10.1038/nature.2013.13501
Nature News Deal done over HeLa cell line 10.1038/500132a
Nature News Antibiotic resistance: The last resort 10.1038/499394a
Nature News Cosmologist claims Universe may not be expanding 10.1038/nature.2013.13379
Nature News Zapped malaria parasite raises vaccine hopes 10.1038/nature.2013.13536
Nature News See-through brains clarify connections 10.1038/496151a
Nature News Dolphins remember each other for decades 10.1038/nature.2013.13519
Nature News Researchers turn off Down’s syndrome genes 10.1038/nature.2013.13406
Nature News Astrophysics: Fire in the hole! 10.1038/496020a
Nature News Giant viruses open Pandora's box 10.1038/nature.2013.13410
Nature News Quantum gas goes below absolute zero 10.1038/nature.2013.12146
Nature News Stem cells reprogrammed using chemicals alone 10.1038/nature.2013.13416
Nature News Whole human brain mapped in 3D 10.1038/nature.2013.13245
Nature News Father’s genetic quest pays off 10.1038/498418a
Nature News Tracking whole colonies shows ants make career moves 10.1038/nature.2013.12833
Nature News Pesticides spark broad biodiversity loss 10.1038/nature.2013.13214
Nature News Animal-rights activists wreak havoc in Milan laboratory 10.1038/nature.2013.12847
Nature News Silver makes antibiotics thousands of times more effective 10.1038/nature.2013.13232
Nature News Methane leaks erode green credentials of natural gas 10.1038/493012a
Nature News When Google got flu wrong 10.1038/494155a
Nature News First proof that prime numbers pair up into infinity 10.1038/nature.2013.12989
Nature News Global carbon dioxide levels near worrisome milestone 10.1038/497013a
Nature News Underwater volcano is Earth's biggest 10.1038/nature.2013.13680
Nature News Did a hyper-black hole spawn the Universe? 10.1038/nature.2013.13743
PNAS Private traits and attributes are predictable from digital records of human behavior 10.1073/pnas.1218772110
Nature News How to turn living cells into computers 10.1038/nature.2013.12406
Nature News Small-molecule drug drives cancer cells to suicide 10.1038/nature.2013.12385
Nature News Brain-simulation and graphene projects win billion-euro competition 10.1038/nature.2013.12291
Nature News Rewired nerves control robotic leg 10.1038/nature.2013.13818
Nature News US government shuts down 10.1038/502013a
Lancet Letter Open letter: let us treat patients in Syria 10.1016/s0140-6736(13)61938-8
Nature News Blood engorged mosquito is a fossil first 10.1038/nature.2013.13946
BMJ Cancer risk in 680 000 people exposed to computed tomography scans in childhood or adolescence: data linkage study of 11 million Australians 10.1136/bmj.f2360
Nature News NIH mulls rules for validating key results 10.1038/500014a
PNAS Impact of insufficient sleep on total daily energy expenditure, food intake, and weight gain 10.1073/pnas.1216951110
Nature News Red meat + wrong bacteria = bad news for hearts 10.1038/nature.2013.12746
Nature News Who is the best scientist of them all? 10.1038/nature.2013.14108
Nature News Four-strand DNA structure found in cells 10.1038/nature.2013.12253
Nature News Weak statistical standards implicated in scientific irreproducibility 10.1038/nature.2013.14131
Nature News Mathematicians aim to take publishers out of publishing 10.1038/nature.2013.12243
BMJ Bicycle helmets and the law 10.1136/bmj.f3817
Nature News Barbaric Ostrich: 27th June 2013 10.1038/nature.2013.12487
American Journal of Medicine The Autopsy of Chicken Nuggets Reads “Chicken Little” 10.1016/j.amjmed.2013.05.005
Nature News Stem cells mimic human brain 10.1038/nature.2013.13617
Nature News Mystery humans spiced up ancients’ sex lives 10.1038/nature.2013.14196
BMJ The future of the NHS--irreversible privatisation? 10.1136/bmj.f1848
Nature News Feature Archaeology: The milk revolution 10.1038/500020a
Nature News Feature Neuroscience: Solving the brain 10.1038/499272a
Nature News Feature Tissue engineering: How to build a heart 10.1038/499020a
Nature News Feature Theoretical physics: The origins of space and time 10.1038/500516a
Nature News Feature Online learning: Campus 2.0 10.1038/495160a
Nature News Feature Open access: The true cost of science publishing 10.1038/495426a
Nature News Feature Inequality quantified: Mind the gender gap 10.1038/495022a
Nature News Feature Voyager: Outward bound 10.1038/497424a
Nature News Feature Mental health: On the spectrum 10.1038/496416a
Nature News Feature Brain decoding: Reading minds 10.1038/502428a
Nature News Feature Fukushima: Fallout of fear 10.1038/493290a
Nature News Feature The big fat truth 10.1038/497428a

Table 2 - Full list of the 69 articles belonging to the top 0.5% cohort in the Social Activity class, including journal, article title, and DOI. Articles highlighted in orange in the original are the 8 full-length articles reporting the results of original research.

 

Social media attention: An indicator of scholarly impact or simply newsworthiness?

The articles which appear in the top 0.5% cohort in each of the four classes defined in this study are typically not the same ones: just 2 articles appear in all 4 lists. This suggests that the correlation between these 4 classes of altmetric indicators may not be very high. These two articles are both original research articles, one reporting the development of a method for creating human brain-like structures (called “cerebral organoids”) in cell culture and using these to study the basis of brain development and disease (Nature article “Cerebral organoids model human brain development and microcephaly”); the other correlating online behaviour (in this case, Facebook ‘likes’) with personal information such as sexual orientation, ethnicity and political views, to create a model to predict such traits based solely on Facebook activity (PNAS article “Private traits and attributes are predictable from digital records of human behavior”).

Further analysis of the overlap between the top 0.5% cohorts in each altmetric class is shown in Table 3: by far the greatest overlap occurs between the Mass Media and Scholarly Commentary classes, the lowest between Social Activity and Mass Media or Scholarly Activity, with a moderate degree of overlap for the remaining pairwise combinations. This suggests that – at least amongst this handful of articles receiving the most online attention – articles attracting a high degree of Social Activity attract relatively little attention from the Mass Media or from Scholarly Activity, and only a moderate degree of Scholarly Commentary. Conversely, there is a very high co-occurrence of articles receiving Mass Media attention and Scholarly Commentary. Taken together, these observations suggest that Social Activity in particular is an indicator of a very different kind of online attention than the other three classes.

                       Mass media   Scholarly activity   Scholarly commentary   Social activity
Mass media             -            11                   31                     5
Scholarly activity                  -                    14                     2
Scholarly commentary                                     -                      15
Social activity                                                                 -

Table 3 - Co-occurrence counts of articles comprising the top 0.5% of articles in each class (n varies between 69 and 76 per class owing to tied rankings at the 0.5% cutoff).
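The co-occurrence counts in Table 3 can be computed as simple set intersections over article identifiers; a minimal sketch with tiny hypothetical DOI sets:

```python
# Count pairwise intersections between the top-cohort DOI sets per class.
from itertools import combinations

top_cohorts = {  # hypothetical stand-ins for the real top 0.5% cohorts
    "mass_media":           {"10.1/a", "10.1/b", "10.1/c"},
    "scholarly_activity":   {"10.1/b", "10.1/d"},
    "scholarly_commentary": {"10.1/a", "10.1/b", "10.1/e"},
    "social_activity":      {"10.1/e", "10.1/f"},
}

for (name1, dois1), (name2, dois2) in combinations(top_cohorts.items(), 2):
    print(f"{name1} & {name2}: {len(dois1 & dois2)} articles in common")
```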

Figure 1 shows how this correlation varies across all percentile cohorts for articles with Social Activity. Note that approximately 90% of social activity is concentrated in 15% of articles, a markedly more skewed distribution than that of citations across articles within a journal (where some 90% of citations go to 50% of the articles (3)). This implies a scarce attention economy in the Social Activity spectrum, with many articles competing for a rare resource (reader attention). The only altmetric class with a distribution of attention across articles similar to that of citations is Scholarly Activity (which correlates very poorly with Social Activity), where approximately 90% of scholarly activity is represented by some 30-40% of articles (data not shown). The convergence of the curves in Figure 1 around the 15% cohort implies that at this point attention in all four classes is equally scarce, while in the cohorts above this point the only class showing a considerable degree of co-occurrence with Social Activity is Scholarly Commentary (also borne out by Table 3 for the 0.5% cohort).


Figure 1 - Proportion of total activity per article across predefined percentile ranges (cohorts) for social activity.
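A minimal sketch of how the cumulative shares behind curves like Figure 1 can be computed from per-article counts (the counts here are hypothetical):

```python
# Sort articles by social activity and compute the cumulative share of
# all activity captured by each top-percentile cohort.
import numpy as np

social = np.array([500, 200, 90, 40, 20, 10, 5, 3, 1, 0, 0, 0])  # hypothetical
counts = np.sort(social)[::-1]                 # most-mentioned first
cum_share = np.cumsum(counts) / counts.sum()   # cumulative share of activity

for cohort in (0.10, 0.15, 0.30):
    k = max(1, int(round(cohort * len(counts))))
    print(f"top {cohort:.0%} of articles account for {cum_share[k - 1]:.0%} of activity")
```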

 

Conclusions

It is clear from this exploratory work that altmetrics hold great promise as a source of data, indicators and insights about online attention, usage and impact of published research outputs. What is currently less certain is the underlying nature of what is being measured by current indicators represented within the four broad classes analysed here, and what can (and cannot) be read into them for the purposes of assigning credit or assessing research impact at the level of individual researchers, journals, institutions or countries.

What is strikingly clear from the qualitative analysis of the top 0.5% of papers for Social Activity is the absence of titles relying on particularly titillating or eye-catching keywords: although most of the links are to summaries of research rather than to primary research articles themselves, they all contain serious scientific material.

On the basis of this preliminary study, we urge caution in characterizing all altmetric indicators in a similar way, as it is likely that different indicators may measure different types of online attention from different types of readers. This finding is similar to that reported by Priem, Piwowar and Hemminger in 2012 (4). We also suggest that careful delineation of document types (as long used for citation-based indicators) must be applied to correctly evaluate (for example) the relative social activity attracted by a news or editorial item versus an original research article; these values are likely to be the inverse of their usual relationship in citation terms. In short, in the excitement and promise of this burgeoning new field of Informetrics, we must be sure to ask ourselves: what is it that we are measuring, and why?

 

Acknowledgements

This paper would not be possible without the kind support of Euan Adie at Altmetric.com in providing access to these data for research purposes.

 

References

(1) http://www.dcscience.net/?p=6369
(2) http://www.slideshare.net/StefanieHaustein/haustein-ape2014-30482551
(3) Seglen, P.O. (1992) “The skewness of science”, Journal of the American Society for Information Science, Vol. 43, No. 9, pp. 628-638.
(4) Priem, J., Piwowar, H. & Hemminger, B. (2012) “Altmetrics in the wild: Using social media to explore scholarly impact”, arXiv. http://arxiv.org/abs/1203.4745

The dataset this paper is based on is available online:

Taylor, Michael (2014): Data set for “Party papers or policy discussions: an examination of highly shared papers using altmetric data”. figshare. http://dx.doi.org/10.6084/m9.figshare.943471

 


Article downloads: An alternative indicator of national research impact and cross-sector knowledge exchange

Dr. Andrew Plume and Dr. Judith Kamalski demonstrate how download data can be used in research assessment to offer a different perspective on national research impact, and to give a unique view of knowledge exchange between authors and readers in the academic and corporate sectors.

Read more >


To date, the rise of alternative metrics as supplementary indicators for assessing the value and impact of research articles has focussed primarily on the article and/or author level. However, such metrics – which may include social media mentions, coverage in traditional print and online media, and full-text download counts – have seldom been applied to higher levels of aggregation such as research topics, journals, institutions, or countries. In particular, article download counts (also known as article usage statistics) have not been used in this way, owing to the difficulty of aggregating download counts for articles across multiple publisher platforms to derive a holistic view. While the meaning of a download, defined as the event where a user views the full-text HTML of an article or downloads the full-text PDF of an article from a full-text journal article platform, remains a matter of debate, it is generally considered to represent an indication of reader interest and/or research impact (1, 2, 3).

 

As part of the report ‘International Comparative Performance of the UK Research Base: 2013’, commissioned by the UK’s Department for Business, Innovation and Skills (BIS), download data were used in two different ways to unlock insights not otherwise possible from more traditional, citation-based indicators. In the report, published in December 2013, download data were used alongside citation data in international comparisons to offer a different perspective on national research impact, and were also used to give a unique view of knowledge exchange between authors and readers in two distinct but entwined segments of the research-performing and research-consuming landscape: the academic and corporate sectors.

 

Comparing national research impact using a novel indicator derived from article download counts

Citation impact is by definition a lagging indicator: newly-published articles need to be read, after which they might influence studies that will be, are being, or have been carried out, which are then written up in manuscript form, peer-reviewed, published and finally included in a citation index such as Scopus. Only after these steps are completed can citations to earlier articles be systematically counted. Typically, a citation window of three to five years following the year of publication has been shown to provide reliable results (4). For this reason, investigating downloads has become an appealing alternative, since it is possible to start counting downloads of full-text articles immediately upon online publication and to derive robust indicators over windows of months rather than years.

While there is a considerable body of literature on the meaning of citations and indicators derived from them (5, 6), the relatively recent advent of download-derived indicators means that there is no clear consensus on the nature of the phenomenon measured by download counts (7). A small body of research has, however, concluded that download counts may be a weak predictor of subsequent citation counts at the article level (8).

To gain a different perspective on national research impact, a novel indicator called field-weighted download impact (FWDI) has been developed according to the same principles applied to the calculation of field-weighted citation impact (FWCI; a Snowball metric). The impact of a publication, whether measured through citations or downloads, is normalised for discipline-specific behaviours. Since full-text journal articles reside on a variety of publisher and aggregator websites, there is no central database of download statistics available for comparative analysis; instead, Elsevier’s full-text journal article platform ScienceDirect (representing some 16% of the articles indexed in Scopus) was used, under the assumption that downloading behaviour across countries does not systematically differ between online platforms. However, there is an important difference between FWCI and FWDI in this respect: the calculation of FWCI relates to all target articles published in Scopus-covered journals, whereas FWDI relates to target articles published in Elsevier journals only. The effect of such differences will be tested in upcoming research. In the current approach, a download is defined as the event where a user views the full-text HTML of an article or downloads the full-text PDF of an article from ScienceDirect; views of an article abstract alone, and multiple full-text HTML views or PDF downloads of the same article during the same user session, are not included, in accordance with the COUNTER Code of Practice.
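
To make the field-weighting principle concrete, here is a minimal sketch (hypothetical records and counts, not Elsevier’s actual implementation): each article’s citation or download count is divided by the average count for articles in the same field and publication year, so that a value of 1.0 represents the world average.

    from collections import defaultdict

    # Hypothetical records: (article_id, field, year, count), where count is
    # either a citation count (for FWCI) or a COUNTER-compliant download
    # count (for FWDI). All values are illustrative.
    articles = [
        ("a1", "Chemistry", 2012, 40),
        ("a2", "Chemistry", 2012, 10),
        ("a3", "Mathematics", 2012, 6),
        ("a4", "Mathematics", 2012, 2),
    ]

    # Expected count = average over all articles in the same field and year.
    sums = defaultdict(lambda: [0, 0])            # (field, year) -> [total, n]
    for _, field, year, count in articles:
        sums[(field, year)][0] += count
        sums[(field, year)][1] += 1

    # Field-weighted impact = actual / expected; 1.0 equals the world average.
    for article_id, field, year, count in articles:
        total, n = sums[(field, year)]
        print(article_id, round(count / (total / n), 2))

Published descriptions of FWCI also normalise by document type and use a fixed citation window; presumably the same refinements apply to FWDI, but the ratio-to-expected structure shown above is the core of both indicators.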

A comparison of the FWCI (derived from Scopus data) and FWDI in 2012 across 10 major research fields for selected countries is shown in Figure 1. The first point of note is that FWDI is typically more consistent across fields and between countries. This observation may reflect an underlying convergence of FWDI between fields and across countries, owing to a greater degree of universality in download behaviour (i.e. reader interest, or an intention to read an article, as expressed by article downloads) than in citation behaviour; however, this cannot be discerned from the indicators themselves and remains untested.

Nonetheless, FWDI does appear to offer an interesting supplementary view of a country’s research impact. For example, the relatively rounded and consistent FWCI and FWDI values across fields for established research powerhouses such as the UK, USA, Japan, Italy, France, Germany and Canada contrast with the much less uniform patterns for the emergent research nations of Brazil, Russia, India and China, for which field-weighted citation impact is typically lower and more variable across research fields than field-weighted download impact. This observation suggests that for these countries reader interest expressed through article downloads is not converted to citations at a very high rate. Again, this points to the idea that users download (and by implication, read) widely across the literature but cite more selectively, and may reflect differences in the ease (and meaning) of downloading versus citing. Another possible explanation is that, depending on the country, there may be weaker or stronger overlap between the reading and the citing communities. A third, possibly relevant, aspect is regional coverage: publications from countries with a weak link between downloads and citations may be preferentially downloaded by authors from those same countries, only to be cited afterwards in local journals that are not as extensively covered in Scopus as English-language journals.

[Figure 1 panels: UK, USA, Russia, Japan, Italy, India, France, Germany, China, Canada, Brazil]

Figure 1 — Field-weighted citation impact (FWCI) and field-weighted download impact (FWDI) for selected countries across ten research fields in 2012. For all research fields, a field-weighted citation or download impact of 1.0 equals the world average in that particular research field. Note that the axis maximum is increased for Italy (to 2.5). Source: Scopus and ScienceDirect.

 

Examining authorship and article download activity by corporate and academic authors and users as a novel indicator of cross-sector knowledge exchange

Knowledge exchange is a two-way transfer of ideas and information; in research policy terms the focus is typically on academic-industry knowledge exchange as a conduit between public sector investment in research and its private sector commercialisation, ultimately leading to economic growth. Knowledge exchange is a complex and multi-dimensional phenomenon whose essence cannot be wholly captured with indicator-based approaches, and since knowledge resides with people and not in documents, much knowledge is tacit or difficult to articulate. Despite this, meaningful indicators of knowledge exchange are still required to inform evidence-led policy. To that end, a unique view of knowledge exchange between authors and readers in the academic and corporate sectors can be derived by analysing which sectors download articles with at least one corporate author, and which sectors authored the articles downloaded by corporate users.

Given the context of the ‘International Comparative Performance of the UK Research Base: 2013’ report, this was done on the basis of UK corporate-authored articles and UK-based corporate users. Again, ScienceDirect data were used under the assumption that downloading behaviour across sectors (academic and corporate in this analysis) does not systematically differ between online platforms.

A view of the share of downloads of articles with at least one author with a corporate affiliation (derived from Scopus) by downloading sector (as defined within ScienceDirect) in two consecutive and non-overlapping time periods is shown in Figure 2. Downloading of UK articles with one or more corporate-affiliated authors by users in other UK sectors indicates strong cross-sector knowledge flows within the country. Users in the academic sector accounted for 61.7% of all downloads of corporate-authored articles in the period 2008-12 (see Figure 2), an increase of 1.1 percentage points over the equivalent share of 60.6% for the period 2003-07. Users in the corporate sector themselves accounted for 35.2% of downloads of corporate-authored articles in the period 2008-12, a decrease of 1.0 percentage point from the 36.2% share in the period 2003-07. Taken together, these results indicate high and increasing usage of corporate-authored research by the academic sector.

[Figure 2]

Figure 2 — Share of downloads of articles with at least one corporate author by downloading sector, 2003-07 and 2008-12. Source: Scopus and ScienceDirect.

 

A view of the share of downloads of articles by users in the corporate sector (as defined within ScienceDirect) by author affiliation (derived from Scopus) in the same two time periods is shown in Figure 3. Downloading of UK articles by users in the UK corporate sector also suggests increasing cross-sector knowledge flows within the country. Some 52.6% of all downloads by corporate users in the period 2008-12 were of articles with one or more academically affiliated authors, and 32.5% were of articles with one or more corporate authors (see Figure 3). Both of these shares have increased (by 1.3 and 2.1 percentage points, respectively) over the equivalent shares for the period 2003-07, while the share of articles with at least one medically affiliated author downloaded by corporate users has decreased from one period to the next. Taken together, these results indicate high and increasing usage of UK academic-authored research by the UK corporate sector.

[Figure 3]

Figure 3 — Share of downloads by users in the corporate sector, by authoring sector of the downloaded articles, 2003-07 and 2008-12. Shares sum to 100% despite co-authorship of some articles between sectors, owing to the derivation of shares from the duplicated total download count across all sectors. Source: Scopus and ScienceDirect.
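
The caption’s point about shares summing to 100% can be illustrated with a small sketch (hypothetical records): each download is credited once to every sector appearing among the downloaded article’s author affiliations, and shares are computed over that duplicated credit total rather than over the raw number of downloads.

    from collections import Counter

    # Hypothetical downloads by corporate users; each record lists the
    # author-affiliation sectors of the downloaded article.
    downloads = [
        {"academic"},
        {"academic", "corporate"},   # co-authored article: credited to both
        {"corporate"},
        {"academic", "medical"},
    ]

    credits = Counter()
    for sectors in downloads:
        credits.update(sectors)      # one credit per authoring sector

    duplicated_total = sum(credits.values())   # 6 credits from 4 downloads
    for sector, n in credits.most_common():
        print(sector, f"{100 * n / duplicated_total:.1f}%")  # sums to 100%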

 

Article downloads as a novel indicator: conclusion

In the ‘International Comparative Performance of the UK Research Base: 2013’ report, download data were used alongside citation data in international comparisons to help uncover fresh insights into the performance of the UK as a national research system in an international context.

Nevertheless, some methodological questions remain to be answered. Clearly, the assumption that download behaviours do not differ across platforms needs to be put to the test in future research. The analysis of the relationship between FWCI and FWDI showed how this relationship differs from one country to another. The examples provided for cross-sector downloading focus solely on the UK, and should be complemented with views on other countries.

We envisage that the approaches outlined in this article, now quite novel, will one day become commonplace in the toolkits of those involved in research performance assessments globally, to the benefit of research, researchers, and society.

 

References

(1) Moed, H.F. (2005a) “Statistical relationships between downloads and citations at the level of individual documents within a single journal” Journal of the American Society for Information Science and Technology 56 (10), pp. 1088-1097.
(2) Schloegl, C. & Gorraiz, J. (2010) “Comparison of citation and usage indicators: The case of oncology journals” Scientometrics 82 (3), pp. 567-580.
(3) Schloegl, C. & Gorraiz, J. (2011) “Global usage versus global citation metrics: The case of pharmacology journals” Journal of the American Society for Information Science and Technology 62 (1), pp. 161-170.
(4) Moed, H.F. (2005b) Citation Analysis in Research Evaluation. Dordrecht: Springer, p. 81.
(5) Cronin, B. (2005) “A hundred million acts of whimsy?” Current Science 89 (9), pp. 1505-1509.
(6) Bornmann, L. & Daniel, H. (2008) “What do citation counts measure? A review of studies on citing behavior” Journal of Documentation 64 (1), pp. 45-80.
(7) Kurtz, M.J. & Bollen, J. (2010) “Usage Bibliometrics” Annual Review of Information Science and Technology 44 (1), pp. 3-64.
(8) See http://www.snowballmetrics.com/.
 


 


Research evaluation in Israel: Interview with Dr. Daphne Getz

For this issue Dr. Gali Halevi interviewed Dr. Daphne Getz to hear more about the way research evaluation is carried out in Israel, and in particular about the recent report on Israeli scientific output published by the Samuel Neaman Institute, for which she was the lead investigator.

Read more >


In 2013, the Samuel Neaman Institute published a report covering the Israeli scientific output from 1990 to 2011, identifying the country’s leading scientific disciplines and comparing them to countries around the world. With its unique geographical location and demographic composition, Israel presents an interesting case of scientific capabilities and output as well as collaborative trends. For this issue, we interviewed Daphne Getz, the lead investigator of this report.

Dr Daphne Getz

Dr. Daphne Getz is the head of CESTIP (the Center of Excellence in Science, Technology and Innovation Policies), and has been a senior research fellow at the Samuel Neaman Institute (SNI) since 1996. Dr. Getz is a specialist in R&D policy, technology and innovation, policies on new and emerging technologies, and relationships between academia, industry and government, among other areas. She has represented academia and the Technion (Israel Institute of Technology) in the MAGNET R&D Consortia and also represents Israeli academia in several EU and UN projects. She has a D.Sc. in Physical Chemistry from the Technion and has served in several positions related to R&D management in industry. Over the years, Dr. Getz has initiated numerous projects, including Israeli indicators for Science, Technology and Innovation, evaluation of R&D programs, and the evaluation of Israeli R&D outputs using bibliometrics.

 

Could you briefly describe SNI (Samuel Neaman Institute), its core activities and role in informing science policy in Israel?

Samuel Neaman Institute (SNI) is an Israeli organization established in 1978 at the Technion (the Israel Institute of Technology). Its main objective is to conduct independent multi-disciplinary research and provide insights into Israel’s Science, Technology & Innovation (STI), education, economy and industry as well as infrastructure and social development for policy makers. The institute has a key role in outlining Israel’s national policies in science, technology and higher education and serves decision makers through its research projects and surveys. The institute operates within the framework of a budget funded by Mr. Samuel Neaman and external research grants from the Ministry of Science, Technology and Space, the Office of the Chief Scientist in the Ministry of Economy, the Ministry for Environmental Protection, the European Commission’s Seventh Framework Programme grants, and more. SNI employees are highly professional analysts chosen because of their level of expertise in different disciplines. Each year, the institute conducts many projects and publishes numerous reports covering a variety of topics related to Israel’s technological, economic and social capabilities.


What types of evaluation programs does SNI develop and conduct?

The institute is often called upon to provide evaluations of specific programs or institutions in Israel. Some examples of such evaluative research are:

1. Program evaluation:
In some cases, SNI is requested to evaluate specific scientific programs, for example, the Scientific Infrastructure Program of the Ministry of Science and Technology, which was launched in 1995 in an attempt to bridge the gap between basic and applied research. SNI was called to methodologically evaluate how and to what extent this program benefitted the Israeli economy and society. In addition, the institute set out to study the effectiveness of the program, its actual successes and failures, and to help decision makers set priorities in R&D policies and investments.

2. Evaluation of R&D programs supported by the Office of the Chief Scientist (OCS):
The OCS supports several scientific programs aimed at supporting technology transfer between academia, research institutions and industry. SNI was called to evaluate some of these programs and analyze their effectiveness, success and future development to ensure well-constructed processes for technology transfer to industry.

3. Evaluation of individual institutions:
From time to time, SNI is called upon to evaluate specific institutions within academia. In such cases SNI uses quantitative and qualitative methodologies to evaluate their performance in terms of output, influence and contribution to science, economy and society.

4. Evaluation of the Israeli research output:
Since 2003, the institute has used advanced bibliometric methodologies to conduct in-depth studies on the quality and quantity of Israeli research outputs (especially scientific publications and patent analysis). Specific fields such as Nanoscience and Nanotechnology, Aerospace Engineering, Energy, Environment, and Stem Cells are analyzed and benchmarked against the rest of the world.

 

What data does the institute collect and analyze in order to produce reports on Israel’s STI capabilities?

SNI uses a variety of data sources in order to conduct its research and produce its reports, including intellectual property (such as patents and trademarks), human resources and demographics, as well as infrastructure and economic indicators. In addition, SNI established a bibliometrics department, which focuses on analyzing publication data (numbers of journal articles, citations, conference papers, etc.) as well as scientific collaborations with the international community.

 

Which indicators did the institute develop in order to be able to benchmark Israel’s STI?

SNI developed and maintains a large and diverse database of indicators relating to the monitoring and evaluation of R&D activities, scientific capabilities and technological infrastructures and to the funding of such activities in Israel. This database has become the most reliable and trusted source for STI evaluation in the country. In 2013, SNI published the fourth edition of “Indices of Science, Technology and Innovation in Israel: An International Comparison”. It contains key data on Israel's Science and Technology input and output and covers more than a decade of international comparisons, as well as many other indices, including position indicators. In the framework of patent research, SNI developed the "distinct invention" indicator. This indicator is based on patent family data and is aimed at neutralizing double counting of identical patent applications (inventions) as a result of their filing in numerous patent offices around the world.
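
The logic behind the “distinct invention” indicator can be sketched as follows (hypothetical records; not SNI’s actual implementation): filings of the same invention in different patent offices share a family identifier, so counting unique families neutralises the double counting.

    # Hypothetical filings: (application_id, patent_office, family_id), where
    # a patent family groups filings of one invention across offices.
    applications = [
        ("US-123", "USPTO", "FAM-1"),
        ("EP-456", "EPO", "FAM-1"),    # same invention, also filed in Europe
        ("IL-789", "ILPO", "FAM-2"),
    ]

    raw_count = len(applications)                                   # 3 filings
    distinct_inventions = len({fam for _, _, fam in applications})  # 2 families
    print(raw_count, distinct_inventions)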

 

Please list some of the main findings of the latest report on Israel’s STI in the following areas:

1. Leading disciplines by quality:
According to the latest report, Israel’s leading scientific disciplines are Space Science, Material Sciences, Molecular Biology & Genetics, and Biology & Biochemistry. Leading sub-disciplines are Cell & Tissue Engineering, Biomaterials, Biophysics, Biochemistry & Molecular Biology, Biomedical Engineering, Composite Materials, and Nanotechnology. Significant growth in quantity was seen in disciplines such as Economics and the Social Sciences.

2. Developing disciplines:
Some of the leading trends found, based on both quantitative and qualitative measures, are Tissue Engineering, Physics (Particles & Fields), Astronomy & Astrophysics, Cell Biology, and Biochemistry & Molecular Biology. In some of the sub-disciplines within these areas of research, Israel has a leading global role.

3. Main collaboration trends worldwide:
Overall, 46% of Israel’s scientific publications in 2011 were the result of international collaboration (40% in 2007). The main countries with which Israeli scientists collaborate are the USA, Germany and France. In addition to these, we found significant growth in collaborations with South East Asian countries such as Singapore. An analysis of USPTO patent data relating to the 1999-2008 period revealed that 83% of the cooperation in inventive activity was conducted with American inventors (highly influenced by the scope of US multinational firms’ activities in Israel), 10% with inventors from EU-27 countries (mainly Germany, France and the UK) and 7% with inventors from the rest of the world.

4. Main challenges in the current state of Israel’s STI and your recommendations:
An appropriate distribution of funding is always a challenge for decision makers. In our report we demonstrated that although highly funded disciplines such as Neuroscience did perform well, other, less funded, areas such as Space Science and Cell & Tissue Engineering showed significant growth and development. This enabled us to highlight areas that will need policy and funding attention in the coming years.

 

SNI produces numerous studies on Israel’s STI; could you please mention one or two such studies (e.g. environmental conservation, energy) and their main results?

One of the research reports we produced in 2013 was “Science & Technology Education in Israel”, which aimed to provide indicators to inform strategy makers in education, and to help prepare them for a possible shortage of Science and Technology teachers in high schools. A unique report titled “Success stories” features 78 success stories of ultra-orthodox individuals in Israel, both men and women, who have successfully integrated into the worlds of academic education, employment and the military. Another “hot” topic is Energy: we have an ongoing project named “Energy Master Plan”, responsible for evaluating the environmental impacts of the different potential energy scenarios as well as defining environmental indicators for the energy market. The Energy Forum Meetings aim to provide a platform where professionals can discuss specific energy-related topics. At the same time, the forum allows multilateral discussions encouraging projects in the fields of renewable energy and energy conservation. The forum meetings serve as a platform for defining professional, applicable positions to be used by relevant decision makers. Other reports and findings can be found on our website: http://www.neaman.org.il/Publications.

 

Given the variable delays and uncertain linkages between R&D inputs and outputs (and ultimately, economic development), how do you draw conclusions (if indeed you do) on the impact of STI activities on the Israeli economy?

The question of causation or causality between R&D inputs and economic outputs is a well-known and researched problem in the R&D economic literature. The main criticism is that a large number of models dealing with the relationship between technological change and economic growth probe the linkage directly by simply looking at the inputs (e.g. scientific publications, patents) and outputs (e.g. firm sales, GDP), without analyzing or understanding the process binding them.

In the process of our work at SNI, we place great emphasis on qualitative methodologies (interviews, surveys and unstructured questionnaires using open-ended questions) that, to the best of our knowledge, are better suited to understanding and probing the mechanism (the “black box”) linking scientific inputs and economic performance.

A number of quantitative studies dealing with the relationship between R&D investments and economic growth were conducted at SNI (see “R&D Outputs in Israel – A Comparative Analysis of PCT Applications and Distinct Israeli Inventions”; “Investments in Higher Education and the Economic Performance of OECD Countries: Israel in an International Perspective”). In both of these studies we addressed the question of causality by developing a two-stage model of scientific and technological innovation. In this model, R&D investments generate scientific and technological outputs (e.g. patents), and these outputs in turn become inputs that explain economic performance. In the process of this work much emphasis was placed on the quality of the R&D indicators. For example, we extracted patent application data by priority date (the earliest filing date of the patent application anywhere in the world), as opposed to application or grant date, in order to more accurately represent the time of invention. At the same time, the use of a time lag between R&D inputs and economic outputs is essential to correctly represent the real-world relationship, and sequence, between stimulus and response.
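
As a minimal sketch of such a two-stage, lagged model (synthetic series and an assumed two-year lag; the studies’ actual specifications are not reproduced here), stage one fits patent output against earlier R&D investment, and stage two fits economic performance against earlier patent output.

    import numpy as np

    # Synthetic annual series (illustrative only).
    rd_invest = np.array([1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 3.7])  # R&D inputs
    patents = np.array([2.0, 2.1, 2.4, 2.6, 3.1, 3.8, 4.7])    # S&T outputs
    gdp = np.array([4.0, 4.2, 4.3, 4.6, 5.0, 5.6, 6.4])        # performance

    LAG = 2  # assumed response lag, in years

    # Stage 1: patents in year t explained by R&D investment in year t - LAG.
    slope1, intercept1 = np.polyfit(rd_invest[:-LAG], patents[LAG:], 1)

    # Stage 2: the stage-one output becomes the input explaining later GDP.
    slope2, intercept2 = np.polyfit(patents[:-LAG], gdp[LAG:], 1)

    print(round(slope1, 2), round(slope2, 2))
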
Currently, the institute’s investigators are working on several reports focusing on technology transfer and collaboration between industry and academia, international scientific collaborations, and energy sources.

For more information please visit http://www.neaman.org.il/Science-and-technology

VN:F [1.9.22_1171]
Rating: 0.0/10 (0 votes cast)

In 2013, the Samuel Neaman Institute published a report covering the Israeli scientific output from 1990 to 2011, identifying the country’s leading scientific disciplines and comparing them to countries around the world. With its unique geographical location and demographic composition, Israel presents an interesting case of scientific capabilities and output as well as collaborative trends. For this issue, we interviewed Daphne Getz, the lead investigator of this report.

Dr. Daphne Getz

Dr. Daphne Getz is the head of CESTIP (the Center of Excellence in Science, Technology and Innovation Policies) and has been a senior research fellow at the Samuel Neaman Institute (SNI) since 1996. She specializes in R&D, technology and innovation policy, policies on new and emerging technologies, and relationships between academia, industry and government, among other areas. She has represented academia and the Technion (Israel Institute of Technology) in the MAGNET R&D Consortia, and represents Israeli academia in several EU and UN projects. She holds a D.Sc. in Physical Chemistry from the Technion and has served in several positions related to R&D management in industry. Over the years, Dr. Getz has initiated numerous projects, including Israeli indicators for Science, Technology and Innovation, evaluation of R&D programs, and the evaluation of Israeli R&D outputs using bibliometrics.

 

Could you briefly describe SNI (Samuel Neaman Institute), its core activities and role in informing science policy in Israel?

The Samuel Neaman Institute (SNI) is an Israeli organization established in 1978 at the Technion (the Israel Institute of Technology). Its main objective is to conduct independent multi-disciplinary research and provide policy makers with insights into Israel’s Science, Technology & Innovation (STI), education, economy and industry, as well as infrastructure and social development. The institute plays a key role in outlining Israel’s national policies in science, technology and higher education, and serves decision makers through its research projects and surveys. It operates on a budget funded by Mr. Samuel Neaman, supplemented by external research grants from the Ministry of Science, Technology and Space, the Office of the Chief Scientist in the Ministry of Economy, the Ministry for Environmental Protection, the European Commission’s Seventh Framework Programme, and others. SNI employees are professional analysts chosen for their expertise in different disciplines. Each year, the institute conducts many projects and publishes numerous reports covering a variety of topics related to Israel’s technological, economic and social capabilities.


What types of evaluation programs does SNI develop and conduct?

The institute is often called upon to provide evaluations of specific programs or institutions in Israel. Some examples of such evaluative research are:

1. Program evaluation:
In some cases, SNI is asked to evaluate specific scientific programs, such as the Scientific Infrastructure Program of the Ministry of Science and Technology, which was launched in 1995 in an attempt to bridge the gap between basic and applied research. SNI was called upon to systematically evaluate how and to what extent this program benefitted the Israeli economy and society. In addition, the institute set out to study the effectiveness of the program and its actual successes and failures, and to help decision makers set priorities in R&D policies and investments.

2. Evaluation of R&D programs supported by the Office of the Chief Scientist (OCS):
The OCS supports several scientific programs aimed at fostering technology transfer between academia and research institutions and industry. SNI was called upon to evaluate some of these programs and analyze their effectiveness, success and future development, to ensure well-constructed processes for technology transfer to industry.

3. Evaluation of individual institutions:
From time to time, SNI is called upon to evaluate specific institutions within academia. In such cases SNI uses quantitative and qualitative methodologies to evaluate their performance in terms of output, influence and contribution to science, economy and society.

4. Evaluation of the Israeli research output:
Since 2003, the institute has used advanced bibliometric methodologies to conduct in-depth studies on the quality and quantity of Israeli research outputs (especially scientific publications and patent analysis). Specific fields such as Nanoscience and Nanotechnology, Aerospace Engineering, Energy, Environment, and Stem Cells are analyzed and benchmarked against the rest of the world.

 

What data does the institute collect and analyze in order to produce reports on Israel’s STI capabilities?

SNI uses a variety of data sources to conduct its research and produce its reports, including intellectual property data (such as patents and trademarks), human resources and demographics, and infrastructure and economic indicators. In addition, SNI established a Bibliometric department, which focuses on analyzing publication data such as numbers of journal articles, citations and conference papers, as well as scientific collaborations with the international community.

 

Which indicators did the institute develop in order to be able to benchmark Israel’s STI?

SNI developed and maintains a large and diverse database of indicators relating to the monitoring and evaluation of R&D activities, scientific capabilities and technological infrastructures, and to the funding of such activities in Israel. This database has become the most reliable and trusted source for STI evaluation in the country. In 2013, SNI published the fourth edition of “Indices of Science, Technology and Innovation in Israel: An International Comparison”. It contains key data on Israel's Science and Technology inputs and outputs and covers more than a decade of international comparisons, as well as many other indices, including position indicators. In the framework of its patent research, SNI developed the "distinct invention" indicator. This indicator is based on patent family data and is aimed at neutralizing the double counting of identical patent applications (inventions) that results from their filing in numerous patent offices around the world.
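
To make the idea concrete, here is a minimal sketch of family-based deduplication: hypothetical application records are grouped by a patent-family identifier, and each distinct invention is dated by its earliest priority date. The field names and records are illustrative assumptions, not SNI's actual data schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical application records: (application_id, family_id, office, priority_date).
applications = [
    ("US-001", "FAM-1", "USPTO", date(2005, 3, 1)),
    ("EP-017", "FAM-1", "EPO",   date(2005, 3, 1)),   # same invention, filed in Europe
    ("IL-042", "FAM-2", "ILPO",  date(2006, 7, 15)),
]

# Group applications by patent family so each invention is counted once,
# no matter how many offices it was filed in.
families = defaultdict(list)
for app_id, family_id, office, priority in applications:
    families[family_id].append(priority)

# One "distinct invention" per family, dated by its earliest priority date
# (a proxy for the time of invention, as described above).
distinct_inventions = {fam: min(dates) for fam, dates in families.items()}

print(len(distinct_inventions))      # 2 distinct inventions, not 3 filings
print(distinct_inventions["FAM-1"])  # 2005-03-01
```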

 

Please list some of the main findings of the latest report on Israel’s STI regarding the following:

1. Leading disciplines by quality:
According to the latest report, Israel’s leading scientific disciplines are Space Science, Material Sciences, Molecular Biology & Genetics, and Biology & Biochemistry. Leading sub-disciplines are Cell & Tissue Engineering, Biomaterials, Biophysics, Biochemistry & Molecular Biology, Biomedical Engineering, Composite Materials, and Nanotechnology. Significant growth in quantity was seen in disciplines such as Economics and the Social Sciences.

2. Developing disciplines:
Some of the leading trends found, based on both quantitative and qualitative measures, are Tissue Engineering, Physics (Particles & Fields), Astronomy & Astrophysics, Cell Biology, and Biochemistry & Molecular Biology. In some of the sub-disciplines within these areas of research, Israel has a leading global role.

3. Main collaboration trends worldwide:
Overall, 46% of Israel’s scientific publications in 2011 were the result of international collaboration (up from 40% in 2007). The main countries with which Israeli scientists collaborate are the USA, Germany and France. In addition, we found significant growth in collaborations with South East Asian countries such as Singapore. An analysis of USPTO patent data for the 1999-2008 period revealed that 83% of the cooperation in inventive activity was conducted with American inventors (strongly influenced by the scope of US multinational firms’ activities in Israel), 10% with inventors from EU-27 countries (mainly Germany, France and the UK), and 7% with inventors from the rest of the world.

4. Main challenges in the current state of Israel’s STI and your recommendations:
An appropriate distribution of funding is always a challenge for decision makers. In our report we demonstrated that although highly funded disciplines such as Neuroscience performed well, other, less funded areas such as Space Science and Cell & Tissue Engineering showed significant growth and development. This enabled us to highlight areas that will need policy and funding attention in the coming years.

 

SNI produces numerous studies on Israel’s STI; could you please mention one or two of such studies (e.g. environmental conservation, energy) and their main results?

One of the research reports we produced in 2013 was “Science & Technology Education in Israel”, which aimed to provide indicators to inform strategy makers in education and to help them prepare for a possible shortage of Science and Technology teachers in high schools. A unique report titled “Success stories” features 78 success stories of ultra-orthodox individuals in Israel, both men and women, who have successfully integrated into the worlds of academic education, employment and the military. Another "hot" topic is Energy: we have an ongoing project named the "Energy Master Plan", responsible for evaluating the environmental impacts of the different potential energy scenarios, as well as defining environmental indicators for the energy market. The Energy Forum Meetings aim to provide a platform where professionals can discuss specific energy-related topics. The forum also allows multilateral discussions that encourage projects in the fields of renewable energy and energy conservation, and its meetings serve as a platform for defining professional, applicable positions to be used by relevant decision makers. Other reports and findings can be found on our website: http://www.neaman.org.il/Publications.

 

Given the variable delays and uncertain linkages between R&D inputs and outputs (and ultimately, economic development), how do you draw conclusions (if indeed you do) on the impact of STI activities on the Israeli economy?

The question of causality between R&D inputs and economic outputs is a well-known and much-researched problem in the R&D economics literature. The main criticism is that many models dealing with the relationship between technological change and economic growth probe the linkage directly, simply looking at the inputs (e.g. scientific publications, patents) and outputs (e.g. firm sales, GDP) without analyzing or understanding the process binding them.

In our work at SNI, we place great emphasis on qualitative methodologies (interviews, surveys and unstructured questionnaires using open-ended questions), which, to the best of our knowledge, are better suited to understanding and probing the mechanism (the "black box") linking scientific inputs and economic performance.

A number of quantitative studies dealing with the relationship between R&D investments and economic growth have been conducted at SNI (see “R&D Outputs in Israel – A Comparative Analysis of PCT Applications and Distinct Israeli Inventions” and “Investments in Higher Education and the Economic Performance of OECD Countries: Israel in an International Perspective”). In both of these studies we addressed the question of causality by developing a two-stage model of scientific and technological innovation. In this model, R&D investments generate scientific and technological outputs (e.g. patents), and these technological outputs in turn become inputs that explain economic performance. In this work much emphasis was placed on the quality of the R&D indicators. For example, we extracted patent application data by priority date (the earliest filing date of the patent application anywhere in the world), rather than by application or grant date, in order to more accurately represent the time of invention. Likewise, allowing for a time lag between R&D inputs and economic outputs is essential to correctly represent the real-world relationship and sequence between stimulus and response.
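
As a rough illustration of such a two-stage, lagged specification, the sketch below fits both stages by ordinary least squares on synthetic annual series. The three-year lag, the variable names and the linear functional form are illustrative assumptions, not the institute's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
years, lag = 30, 3  # assumed lag between R&D investment and patent output

# Synthetic annual series: R&D investment drives patent output with a lag,
# and patent output in turn drives economic performance.
rd = rng.uniform(1.0, 2.0, years)
patents = np.empty(years)
patents[:lag] = rng.uniform(5.0, 6.0, lag)
patents[lag:] = 4.0 + 2.5 * rd[:-lag] + rng.normal(0, 0.1, years - lag)
gdp = 10.0 + 1.2 * patents + rng.normal(0, 0.2, years)

def ols(x, y):
    """Intercept and slope of a simple least-squares fit of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Stage 1: lagged R&D investment -> scientific/technological output (patents).
b0, b1 = ols(rd[:-lag], patents[lag:])

# Stage 2: fitted patent output becomes the input explaining economic performance.
patents_hat = b0 + b1 * rd[:-lag]
c0, c1 = ols(patents_hat, gdp[lag:])

print(f"stage 1 (R&D -> patents): intercept={b0:.2f}, slope={b1:.2f}")
print(f"stage 2 (patents -> GDP): intercept={c0:.2f}, slope={c1:.2f}")
```
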
Currently, the institute’s investigators are working on several reports focusing on technology transfer and collaboration between industry and academia, international scientific collaborations, and energy sources.

For more information please visit http://www.neaman.org.il/Science-and-technology
