Articles

Research Trends is an online magazine providing objective insights into scientific trends based on bibliometric analyses.

Graphene: ten years of the ‘gold rush’

In this article, Andrew Plume investigates whether a new approach to assigning ‘credit’ for article authorship can answer the question: “Who are the authors of high-impact graphene research?”



Since the publication of the famous paper (1) on the ‘sticky-tape method’ for preparing graphene in October 2004 (which helped win its authors, Andre Geim and Konstantin Novoselov, the 2010 Nobel Prize in Physics), the field of graphene research has seen phenomenal growth in published research articles, likened to a ‘gold rush’ (2). With the entrance of so many new and established researchers into the field, we investigate whether a new approach to assigning ‘credit’ for article authorship can answer the question: “Who are the authors of high-impact graphene research?”

Graphene is a material comprising carbon atoms packed together in a two-dimensional sheet just one atom thick, and may be the thinnest material in the universe. This unique structure gives graphene some very surprising physical properties – it is some 100 times stronger than steel and conducts heat and electricity at high efficiency. Prior to its isolation by Geim and Novoselov in 2004, it existed only in theoretical models; as such, the field of graphene research can be considered to have appeared almost overnight.

Figure 1 shows the exponential increase in the number of research articles published on graphene in the decade between 2004 and 2013. Using this corpus of literature as a self-defining research field, we have applied a recently published method for assigning authorship credit to understand who the high-impact authors in graphene research are. Most current approaches to identifying and ranking high-impact authors fail to account for the invisible credit structures which operate in author bylines in most fields of research. Instead, most analyses assume that each author has a full and equal stake in the creation of a research article, and that the credit for the article should be assigned accordingly. While much previous work has examined the intricacies of fractional assignment of credit to authors (e.g. Moed (3) and Stallings et al. (4)), there has recently been renewed interest in algorithmic methods to fractionally assign authorship credit in a way that recognises these unstated community norms. Some of the most recent work along these lines has been published by Nils T. Hagen at the University of Nordland, Norway, and it is this approach which serves as the inspiration for the present study (5).

Graphene fig1

Figure 1 - Scholarly output (articles only) published in the period 2004-13 from a search for “graphene” in the titles, abstracts or keywords. Source: SciVal.

The present study compares three methods of assigning authorship credit to the authors of the corpus of research articles on graphene defined above, and examines the differences in the resulting lists of high-impact researchers. The first method is the standard ‘full count’ method: each author receives a full count for each article they appear on, along with the full citation credit. The second method is ‘fractional’, where the credit is shared equally among all co-authors; an author on a single-author paper gets 1, while one on a 4-author paper gets 0.25; citation credit is assigned in the same way. For an examination of the rise of fractional authorship over time, see “Publish or perish? The rise of the fractional author…”, also in this issue (6). Finally, the ‘harmonic’ method (as developed by Hagen (5)) assigns additional weight to the first and last authors and diminishing weights to each additional author in the middle, with citation credit assigned in the same way. As a vital research front, graphene research is typically published in well-known peer-reviewed journals, and as such we have assumed that all of the most important research (and researchers) in this topic are represented in the Scopus database.
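The three counting schemes can be sketched as weighting functions over an article's byline. This is an illustrative sketch rather than the study's actual code; in particular, ranking the last author immediately after the first is our assumption for the first-and-last-author emphasis described above (Hagen's plain harmonic formula simply gives the i-th author a weight proportional to 1/i, normalised so the weights sum to 1).

```python
def full_count(n_authors):
    """Full counting: every co-author receives a whole credit."""
    return [1.0] * n_authors

def fractional_count(n_authors):
    """Fractional counting: one credit shared equally, so each
    co-author receives 1/N (e.g. 0.25 on a 4-author paper)."""
    return [1.0 / n_authors] * n_authors

def harmonic_count(n_authors):
    """Harmonic counting after Hagen: weights proportional to 1/rank,
    normalised to sum to 1. ASSUMPTION: to mirror the first-and-last
    author emphasis described in the text, the last byline position
    is ranked 2nd and the middle authors are ranked 3rd onwards."""
    if n_authors == 1:
        return [1.0]
    ranks = [1] + list(range(3, n_authors + 1)) + [2]
    total = sum(1.0 / r for r in ranks)
    return [(1.0 / r) / total for r in ranks]
```

Citation credit follows the same weights: an article's citations are simply multiplied by each author's share.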

Citations in this analysis are counted over a three-year window: citations to each article are counted in the year of publication plus the two following years, so that 2011 papers have their citations counted over 2011-2013. Since the field is self-defining, it is not necessary to field-weight the citation data, as we may assume that citation practices within graphene research are reasonably homogeneous. Because of this three-year citation window, the analysis considers only articles published from 2005 to 2011, focussing on the period of expansion of the field in the wake of Geim and Novoselov’s landmark 2004 publication (1). Importantly, since the corpus is defined as research articles containing the word “graphene” in the title, abstract or keywords, it ignores all other articles on non-graphene topics published by the same authors; by design, these results answer the very specific question “who are the authors of high-impact graphene research?”, and not “who are the high-impact authors working on graphene?”
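The fixed citation window can be expressed as a one-line filter; `citations_in_window` is a hypothetical helper of our own, not taken from the study:

```python
def citations_in_window(pub_year, citing_years, window=3):
    """Count only citations received in the publication year plus the
    following (window - 1) years, e.g. a 2011 paper counts citations
    from 2011, 2012 and 2013 only."""
    return sum(1 for y in citing_years if pub_year <= y < pub_year + window)
```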

If every author on every paper is included in this analysis, then when the lists are sorted by citations per article, many of those appearing at the top are authors of single well-cited papers who may not (yet) be career researchers. To account for this, a productivity threshold was applied. In Figure 2 this was set at a relatively ‘relaxed’ level that still allows authors with comparatively low productivity in graphene research to appear: a minimum of 7 articles in the 7-year period 2005-11 (i.e. on average, 1 article per author per year) for the full count method, and 2 authorship credits for the fractional and harmonic methods (i.e. on average, less than 0.3 article credits per author per year).
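Putting the threshold and the ranking together, the list-building step might look like the following sketch; `rank_authors` and its record format are our own illustration, not the study's code:

```python
from collections import defaultdict

def rank_authors(records, min_credit, top=25):
    """records: (author, article_credit, citation_credit) tuples, one
    per author per paper, with credits taken from whichever counting
    scheme is in use. Authors below the productivity threshold are
    dropped; the rest are ranked by citations per unit of article
    credit."""
    articles = defaultdict(float)
    citations = defaultdict(float)
    for author, a_credit, c_credit in records:
        articles[author] += a_credit
        citations[author] += c_credit
    eligible = [a for a in articles if articles[a] >= min_credit]
    eligible.sort(key=lambda a: citations[a] / articles[a], reverse=True)
    return eligible[:top]
```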

Graphene fig2

Figure 2 - Top 25 authors of graphene articles 2005-11: ‘relaxed’ productivity threshold. Source: Scopus.

It is clear at a glance that while the three methods have a few authors in common, where the same author appears in more than one list their rankings vary considerably (see, for instance, the variable rankings of the two Nobelists Geim and Novoselov across the lists). It is also clear that at this ‘relaxed’ productivity threshold, authors newer to the field are likely to appear but may not yet be recognised as leading figures by other graphene researchers.

In Figure 3, the productivity threshold was increased to focus only on authors with relatively high productivity in graphene research; this ‘stringent’ threshold was set at a minimum of 28 articles in the 7-year period 2005-11 (i.e. on average, 4 articles per author per year) for the full count method, and at 7 authorship credits for the fractional and harmonic methods (i.e. on average, 1 article credit per author per year). In these lists there is somewhat greater agreement overall than in the ‘relaxed’ threshold lists, especially for the very top names (the two Nobelists head all three lists, for example); below these, the three lists begin to differ, and names in one or two lists are absent from the other(s).

Graphene fig3

Figure 3 - Top 25 authors of graphene articles 2005-11: ‘stringent’ productivity threshold. Source: Scopus.

It is difficult for anyone not working directly in a field of research to know who the ‘best’ researchers in that field are, and recognising this we have not sought to make a value judgement here on which list correlates most closely with peer esteem. Instead, the question remains open to those working on graphene: which researchers are recognised as the ‘highest impact’ in the field, and which list reflects this most closely?

As early as 2008, Andre Geim himself noted the tendency for graphene to attract large numbers of researchers: “With graphene, each year brings a new result, a new sub-area of research that opens up and sparks a gold rush” (7). Here we have applied a fresh approach to assigning author credit for published research articles to the field of graphene, as one way of demonstrating who has made their fortune on the research frontier. It is important to note, however, that owing to the inherent complexity of the research enterprise (especially at the frontier of knowledge), simplistic interpretations of author rankings may be dangerous insofar as they may reinforce the status quo and lead to a form of consensus-reaching which may ultimately limit the expansion of knowledge. Instead, as always, metrics informed by expert opinion are preferable.

 

References

(1) Novoselov, K.S., Geim, A.K., Morozov, S.V., Jiang, D., Zhang, Y., Dubonos, S.V., Grigorieva, I.V., Firsov, A.A. (2004) “Electric field effect in atomically thin carbon films”, Science, vol. 306, issue 5696, pp. 666-669.
(2) Plume, A., (2010) Buckyballs, nanotubes and graphene: On the hunt for the next big thing, Research Trends issue 18, July 2010, https://www.researchtrends.com/issue18-july-2010/research-trends-12/.
(3) Moed, H.F. (2000) “Bibliometric Indicators Reflect Publication and Management Strategies” Scientometrics 47(2) pp. 323-346
(4) Stallings, J., Vance, E., Yang, J., Vannier, M.W., Liang, J., Pang, L., Dai, L., Ye, I. and Wang, G. (2013) “Determining scientific impact using a collaboration index”, Proceedings of the National Academy of Sciences (doi:10.1073/pnas.1220184110).
(5) Hagen, N.T. (2014) “Counting and comparing publication output with and without equalizing and inflationary bias” Journal of Informetrics 8(2) pp. 310-317.
(6) Plume, A. & Van Weijen, D. (2014) Publish or perish? The rise of the fractional author…, Research Trends Issue 38, September 2014. 
(7) http://sciencewatch.com/articles/andre-k-geim-interview

A decade’s trends in virology research

Matthew Richardson illustrates the trends that have influenced the field of Virology, the study of viruses, over the past 10 years, using bibliometric analysis and visualization techniques.



One advantage bibliometric analysis brings is the ability to put a large quantity of research into perspective. Papers can of course be read individually, and the use of cited references in the literature allows an interested reader to get a wider background on the specific concepts found within, and how the understanding of these has changed over time. However, the sheer scale of work produced in a given field means that the only way to illustrate the broadest trends affecting an entire field is through analyzing the bibliographic data of these papers in bulk. In this article we illustrate the trends that have influenced the field of Virology, the study of viruses, over the past 10 years.

Visualizing the topics in Virology

In an earlier issue of Research Trends we introduced term maps as a method for exploring the topics published in a group of journals (1). These maps, developed in collaboration with the CWTS research group, present a two-dimensional view of the topical terms used in the titles and abstracts of publications. When aggregated across a journal, or a large group of journals, they exploit the fact that related terms are more likely to appear in the same paper, grouping together those terms which are most highly related. Using all of the textual data available in titles and abstracts, this produces a thorough view of which topics are researched and how they interact with one another to form the broader structure of a field.
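The raw input to such a map is a document-level co-occurrence count over the extracted terms. A minimal sketch follows; the actual maps rely on far more sophisticated term extraction, normalization and layout than this:

```python
from collections import Counter
from itertools import combinations

def term_cooccurrence(documents):
    """documents: one set of extracted terms per title+abstract.
    Returns a Counter mapping each unordered term pair to the number
    of documents in which both terms appear together."""
    pairs = Counter()
    for terms in documents:
        # each unique pair of terms in a document co-occurs once
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs
```

Terms with high pairwise counts are then placed close together in the two-dimensional layout, and clusters are derived from the same relatedness structure.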

In the term maps following, we use all journals that are categorized in Scopus within the Virology subject category. Although it is still possible that virology-related content is published outside these journals, for instance in a broad-based Medicine or Microbiology journal, this analysis catches the great majority of relevant research across a wider range of journals than a small selection would allow.

As we wish to compare the field across a 10-year gap, we have used the two time periods 2000–02 and 2010–12. Using three consecutive years of publications in each map gives a more thorough view of what is being published, and therefore more accurate co-occurrence relationships between the terms in the maps.

The term maps and selection of topics

Figure 1 shows a term map for Virology content published in the years 2000–02, covering 14,158 articles, reviews and conference papers. This is a co-occurrence cluster map, showing both the position of each term (the relative location is determined by co-occurrence in titles and abstracts, so that the closer two terms are positioned, the more often they tend to co-occur) and the main cluster each belongs to (distinguished by one of four colors). The final element of the data shown is the frequency with which a term is found in this field: the larger the term appears, the more papers contain that term within the title or abstract.

Virology fig 1

Figure 1 – Journal term co-occurrence map for the field of Virology, using a set of 14,158 papers published from 2000 to 2002. Colors used to distinguish clusters of related terms. Data source: Scopus

This map forms a circular structure which is common to many such networks, and is composed of four main groupings of topics. The most common terms are those relating to primary care and clinical research in the green cluster (‘patient’, ‘case’, ‘therapy’); epidemiology, outbreak investigation and phylogenetics in the blue cluster (‘isolate’, ‘genotype’, ‘phylogenetic analysis’, ‘outbreak’); molecular biology and genetics in the red cluster (‘transcription’, ‘open reading frame’, ‘nucleotide’), and cell biology of disease in the yellow cluster (‘T cell’, ‘IFN’, ‘CD4’).

Figure 2 shows a term map based on the same selection of journals, 10 years later: this includes 24,691 Virology papers published in 2010–12. This represents a huge increase in content over the earlier time period, with more than 10,000 additional papers. As might be expected, similar phrases appear as common terms: for instance, ‘patient’, ‘domain’, ‘case’, ‘isolate’.  More interesting are the broader changes in the structure of the field, and changing trends in the less frequent, more specific topics. Topics such as HCV (hepatitis C virus) and HPV (human papillomavirus) are far more visible in the center of the map, pointing to the increasing quantity but also interdisciplinarity of this research.

While the main clusters remain present and intact in this later map, the circular structure is not as contained; the green cluster relating to primary care and clinical research, and the yellow cluster relating to cell biology of disease, no longer link together quite so closely as in the 2000–02 period. This finding is surprising, given that in recent years we have seen a strong focus on interdisciplinary research, translational medicine and closing the loop between ‘bench’ research and ‘bedside’ care.

Virology fig 2

Figure 2 – Journal term co-occurrence map for the field of Virology, using a set of 24,691 papers published from 2010 to 2012. Colors used to distinguish clusters of related terms. Data source: Scopus

 

In Figure 3, selected virus-related terms have been identified and annotated on the 2010–12 Virology map. Rather than being confined to any particular cluster, these virus topics are scattered throughout the map according to the types of papers in which they occur most frequently. This illustrates that different virus families predominantly feature in very different kinds of studies, corresponding to the different clusters of the map. Related terms appear close to one another, as expected: for instance, hepatitis B and hepatitis C sit close together in the green (clinical) cluster, while influenza A is towards the top of the map along with the subtypes H5N1 and H1N1.

 

Virology fig 3

Figure 3 – Journal term co-occurrence map for the field of Virology, using a set of 24,691 papers published from 2010 to 2012. Colors used to distinguish clusters of related terms and annotations provided for selected virus-related terms. Data source: Scopus

 

As demonstrated here, term maps provide a useful overview of a field and allow you to examine the broader structural changes that affect it over time. In the analysis that follows, by contrast, SciVal is used for more detailed analysis of individual topics with various metrics.

 

Research trends in the past decade

Taking some of the virus terms identified from our term map, it is possible to construct research areas in SciVal based around these topics and then compare them to one another by a variety of measures. One example is provided in Figure 4: here we see trends in scholarly output from 2004 to 2013 for five different research areas, covering research on hepatitis B and C, human papillomavirus, the H1N1 strain of influenza A, and coronavirus. The first three were included as they show not only high quantities of research but also extremely strong growth throughout the decade. H1N1, on the other hand, starts with minimal activity but then grows quickly to a peak of 568 papers in 2011; this growth in activity follows the 2009-10 H1N1 (swine flu) pandemic (2). Coronavirus research follows a different trend: while it starts relatively high in 2004 with more than 600 papers, it then declines steadily until fewer than 300 papers were published in 2011. After this point there is another increase in activity, with 395 papers in 2013. The two periods of higher interest in coronaviruses seem likely to relate to two distinct viruses: first SARS-CoV, which caused a global epidemic in 2002–03; and towards the end of the period MERS-CoV, which was first identified in 2012 (3, 4).

 

Virology fig 4

Figure 4 – Trends in scholarly output for a selection of virus-related topics, counting articles, reviews and conference papers published per year. Source: SciVal

 

Field-weighted citation impact (FWCI) is a citation metric showing the citation activity around a group of papers, taking into account subject field, article type and year of publication, and so offering a robust comparison to the expected level of citation impact (which is assigned a level of 1.0). Looking across the full set of virus topics highlighted in Figure 3, three in particular stand out as having extremely strong spikes of citation impact in the past decade: the influenza A subtypes H5N1 and H1N1, and coronavirus. These times of activity coincide with the timing of public outbreaks even more closely than the publication trends shown in Figure 4. The year 2004, in which H5N1 research has an FWCI of over 10 times the expected level, saw major outbreaks of the virus strain across Asia (5, 6);  2009, in which H1N1 research reached an FWCI of 9.33 times the expected value, saw cases of the virus affecting people in the US and around the world (2); and the coronavirus MERS-CoV was first identified in 2012, coinciding with an upturn in impact continuing into 2013 and 2014 (which shows early signs of a similarly high FWCI but is not shown here due to the incompleteness of the data) (4).
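The logic of the metric can be sketched in a few lines; we assume here, as in one common definition of FWCI, that the value for a set of papers is the mean of the per-paper ratios of actual to expected citations:

```python
def fwci(papers):
    """papers: (actual_citations, expected_citations) pairs, where the
    expected value is the average citation count for papers of the same
    field, article type and publication year. A result of 1.0 means
    citation impact exactly at the expected level."""
    return sum(actual / expected for actual, expected in papers) / len(papers)
```

For example, a paper cited twice as often as expected and a paper cited at half the expected rate average out to an FWCI of 1.25 under this definition.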

 

Virology fig 5

Figure 5 – Trends in field-weighted citation impact for a selection of virus-related topics. Source: SciVal

 

Conclusion

While the publication and citation trends shown for specific virus topics reflect wider public interest at times of virus outbreaks, bibliometric analysis such as that shown in this article allows detailed comparison not only of the amount of research in different areas but also of the way it is carried out. The insights available through term maps are even more difficult to draw from mainstream media or individual scholarly papers; using these visualizations we can view the full structure of a subject area and see how it has changed over time. Virology, a fast-moving field whose topics naturally rise and fall in interest as outbreaks occur, is particularly apt for this kind of illustration of hot topics over time.

 

References

(1) Van Weijen, D. (2013) “Trends in pediatrics: Overview of research trends from 2007–2011”, Research Trends, Issue 34, September 2013. Available at: https://www.researchtrends.com/issue-34-september-2013/trends-in-pediatrics/
(2) http://www.flu.gov/pandemic/history/
(3) http://www.who.int/ith/diseases/sars/en/
(4) http://www.who.int/csr/disease/coronavirus_infections/en/
(5) http://www.who.int/csr/don/2004_02_27/en/
(6) http://www.who.int/csr/don/2004_12_30/en/

 

VN:F [1.9.22_1171]
Rating: 0.0/10 (0 votes cast)

One advantage bibliometric analysis brings is the ability to put a large quantity of research into perspective. Papers can of course be read individually, and the use of cited references in the literature allows an interested reader to get a wider background on the specific concepts found within, and how the understanding of these has changed over time. However, the sheer scale of work produced in a given field means that the only way to illustrate the broadest trends affecting an entire field is through analyzing the bibliographic data of these papers in bulk. In this article we illustrate the trends that have influenced the field of Virology, the study of viruses, over the past 10 years.

Visualizing the topics in Virology

In an earlier issue of Research Trends we introduced term maps as a method for exploring the topics published in a group of journals (1). These maps, developed in collaboration with the CWTS research group, present a two-dimensional view of the topical terms used in the titles and abstracts of a publication; when aggregated across a journal, or a large group of journals, you can then make use of the fact that a term is more likely to appear in the same paper as a related term to group together those which are most highly related. Using all of the textual data available in titles and abstracts, this allows you to produce a thorough view of which topics are researched and how they interact with one another to form the broader structure of a field.

In the term maps following, we use all journals that are categorized in Scopus within the Virology subject category. Although it is still possible that virology-related content is published outside these journals, for instance in a broad-based Medicine or Microbiology journal, this analysis catches the great majority of relevant research across a wider range of journals than a small selection would allow.

As we wish to compare the field across a gap of 10 years, we have used the two time periods 2000–02 and 2010–12. Using three consecutive years of publications in each map gives a more thorough view of what is being published, and therefore more accurate co-occurrence relationships between the terms in the maps.

The term maps and selection of topics

Figure 1 shows a term map for Virology content published in the years 2000–02. This covers 14,158 articles, reviews and conference papers. This map is a co-occurrence cluster map, showing both the position of each term (relative locations are determined by co-occurrence in titles and abstracts, so that the more often two terms tend to co-occur, the closer together they are positioned) and the main cluster each belongs to (distinguished by one of four colors). The final element of the data shown is the frequency with which a term is found in this field: the larger the term appears, the more papers contain that term within the title or abstract.

Virology fig 1

Figure 1 – Journal term co-occurrence map for the field of Virology, using a set of 14,158 papers published from 2000 to 2002. Colors used to distinguish clusters of related terms. Data source: Scopus

This map forms a circular structure which is common to many such networks, and is composed of four main groupings of topics. The most common terms are those relating to primary care and clinical research in the green cluster (‘patient’, ‘case’, ‘therapy’); epidemiology, outbreak investigation and phylogenetics in the blue cluster (‘isolate’, ‘genotype’, ‘phylogenetic analysis’, ‘outbreak’); molecular biology and genetics in the red cluster (‘transcription’, ‘open reading frame’, ‘nucleotide’), and cell biology of disease in the yellow cluster (‘T cell’, ‘IFN’, ‘CD4’).

Figure 2 shows a term map based on the same selection of journals, 10 years later: this includes 24,691 Virology papers published in 2010–12. This represents a huge increase in content over the earlier time period, with more than 10,000 additional papers. As might be expected, similar phrases appear as common terms: for instance, ‘patient’, ‘domain’, ‘case’, ‘isolate’. More interesting are the broader changes in the structure of the field, and changing trends in the less frequent, more specific topics. Topics such as HCV (hepatitis C virus) and HPV (human papillomavirus) are far more visible in the center of the map, pointing to the increasing quantity, and also the increasing interdisciplinarity, of this research.

While the main clusters remain present and intact in this later map, the circular structure is not as contained; the green cluster relating to primary care and clinical research, and the yellow cluster relating to cell biology of disease, no longer link together quite so closely as in the 2000–02 period. This finding is surprising, given that in recent years we have seen a strong focus on interdisciplinary research, translational medicine and closing the loop between ‘bench’ research and ‘bedside’ care.

Virology fig 2

Figure 2 – Journal term co-occurrence map for the field of Virology, using a set of 24,691 papers published from 2010 to 2012. Colors used to distinguish clusters of related terms. Data source: Scopus

 

In Figure 3, selected virus-related terms have been identified and annotated on the 2010–12 Virology map. Rather than being confined to any particular cluster, these virus topics are scattered throughout the map according to the types of papers they occur in most frequently. This finding illustrates the fact that different virus families are predominantly used in very different kinds of studies, relating to the different clusters of the map. Related terms appear close to one another, as expected: for instance, hepatitis B and hepatitis C are close to one another, in the green (clinical) cluster, while influenza A is towards the top of the map along with the subtypes H5N1 and H1N1.

 

Virology fig 3

Figure 3 – Journal term co-occurrence map for the field of Virology, using a set of 24,691 papers published from 2010 to 2012. Colors used to distinguish clusters of related terms and annotations provided for selected virus-related terms. Data source: Scopus

 

As demonstrated here, term maps provide a useful overview of a field and allow the broader structural changes that affect it to be examined over time. In the analysis that follows, by contrast, SciVal is used for more detailed analysis of individual topics with various metrics.

 

Research trends in the past decade

Taking some of the virus terms identified from our term map, it is possible to construct research areas in SciVal based around these topics and then compare them to one another by a variety of measures. One example is provided in Figure 4: here we see trends in scholarly output from 2004 to 2013 for five different research areas, covering research on hepatitis B and C, human papillomavirus, the H1N1 strain of influenza A, and coronavirus. The first three were included as they show high quantities of research but also extremely strong growth throughout the decade. H1N1, on the other hand, starts with minimal activity but then grows quickly to a peak of 568 papers in 2011. This growth in activity follows the 2009–10 H1N1 (swine flu) pandemic (2). Coronavirus research follows a different trend: while it starts relatively high in 2004 with more than 600 papers, it then declines steadily to fewer than 300 papers published in 2011. After this point there is another increase in activity, with 395 papers in 2013. The two periods of higher interest in coronaviruses seem likely to be related to two distinct viruses: first SARS-CoV, which caused a global epidemic in 2002–03; and towards the end of the period MERS-CoV, which was first identified in 2012 (3, 4).

 

Virology fig 4

Figure 4 – Trends in scholarly output for a selection of virus-related topics, counting articles, reviews and conference papers published per year. Source: SciVal

 

Field-weighted citation impact (FWCI) is a citation metric that measures the citation activity around a group of papers while taking into account subject field, article type and year of publication, and so offers a robust comparison against the expected level of citation impact (defined as 1.0). Looking across the full set of virus topics highlighted in Figure 3, three in particular stand out as having extremely strong spikes of citation impact in the past decade: the influenza A subtypes H5N1 and H1N1, and coronavirus. These spikes coincide with the timing of public outbreaks even more closely than the publication trends shown in Figure 4. The year 2004, in which H5N1 research has an FWCI of over 10 times the expected level, saw major outbreaks of the virus strain across Asia (5, 6); 2009, in which H1N1 research reached an FWCI of 9.33 times the expected value, saw cases of the virus affecting people in the US and around the world (2); and the coronavirus MERS-CoV was first identified in 2012, coinciding with an upturn in impact continuing into 2013 and 2014 (the latter shows early signs of a similarly high FWCI but is not shown here due to the incompleteness of the data) (4).
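As a rough illustration of how such a field-weighted metric works, the ratio of actual to expected citations can be computed as below. The citation counts here are hypothetical stand-ins, not the Scopus data behind Figure 5, and the averaging shown is only one simple way of aggregating per-paper ratios.

```python
def fwci(citations, expected):
    """Field-weighted citation impact for one paper: actual citations
    divided by the citations expected for papers of the same subject
    field, article type and publication year (1.0 = expected level)."""
    return citations / expected

def group_fwci(papers):
    """FWCI for a group of papers: the mean of the per-paper ratios."""
    return sum(fwci(c, e) for c, e in papers) / len(papers)

# Hypothetical (citations, expected-citations) pairs for three papers
papers = [(28, 4.0), (45, 5.0), (9, 3.0)]
print(round(group_fwci(papers), 2))  # well above the expected level of 1.0
```

A group FWCI of, say, 9.33 therefore means the papers attracted more than nine times the citations expected for comparable publications.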

 

Virology fig 5

Figure 5 – Trends in field-weighted citation impact for a selection of virus-related topics. Source: SciVal

 

Conclusion

While the publication and citation trends shown for specific virus topics reflect wider public interest at times of virus outbreaks, bibliometric analysis of the kind shown in this article allows detailed comparison not only of the amount of research in different areas but also of the way it is carried out. The insights available through term maps are even harder to draw from mainstream media or individual scholarly papers; using these visualizations we can view the full structure of a subject area and see how it has changed over time. Virology, a fast-moving field in which topics naturally rise and fall in interest as outbreaks occur, is particularly apt for this kind of illustration of hot topics over time.

 

References

(1) Van Weijen, D. (2013) “Trends in pediatrics: Overview of research trends from 2007–2011”, Research Trends, Issue 34, September 2013. Available at: https://www.researchtrends.com/issue-34-september-2013/trends-in-pediatrics/
(2) http://www.flu.gov/pandemic/history/
(3) http://www.who.int/ith/diseases/sars/en/
(4) http://www.who.int/csr/disease/coronavirus_infections/en/
(5) http://www.who.int/csr/don/2004_02_27/en/
(6) http://www.who.int/csr/don/2004_12_30/en/

 


10 years of research impact: top cited papers in Scopus 2001-2011

Gali Halevi and Henk Moed investigate what the most frequently cited articles were in Scopus from 2001-2011, in eight main research areas, and give their authors the chance to comment on their achievements.



Scopus is celebrating 10 years since its launch. As the largest abstract and citation database of peer-reviewed literature available today, Scopus boasts 53 million records and 21,915 titles from 5,000 publishers. In this paper we identify some of the top cited papers indexed in Scopus across various disciplines between 2001 and 2011. In addition, we contacted the authors of these papers to seek their insights into why they think their papers are as highly cited as they are.

In order to achieve this, we conducted a comprehensive search of all Scopus data, limiting the results to articles published between 2001 and 2011. The initial search yielded more than 13 million records (as of June 11, 2014). This set was further refined to include only full research articles, excluding reviews, editorials and book chapters. The results were then limited to one of Scopus' 26 subject categories at a time (see Table 1 for the full list). Each set of articles under a subject category was sorted by “cited by” counts (i.e. citations), enabling the most highly cited articles to be identified.
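The refinement and ranking steps just described can be sketched as follows. The records and field names here are hypothetical stand-ins for illustration; the actual analysis was run against the full Scopus index.

```python
# Hypothetical bibliographic records standing in for Scopus results
records = [
    {"title": "A", "type": "Article", "year": 2005, "cited_by": 1200},
    {"title": "B", "type": "Review",  "year": 2003, "cited_by": 5000},
    {"title": "C", "type": "Article", "year": 2010, "cited_by": 300},
    {"title": "D", "type": "Article", "year": 1999, "cited_by": 9000},
]

def top_cited(records, start=2001, end=2011):
    """Keep only full research articles published within the window,
    then sort by citation count, highest first."""
    eligible = [r for r in records
                if r["type"] == "Article" and start <= r["year"] <= end]
    return sorted(eligible, key=lambda r: r["cited_by"], reverse=True)

print([r["title"] for r in top_cited(records)])
```

Note that both filters matter: the heavily cited review ("B") and the pre-2001 article ("D") are excluded before ranking, just as reviews, editorials and out-of-window papers were excluded from the Scopus result set.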

In this paper we review the following 8 subject areas and their top cited articles:

  • Agricultural and Biological Sciences
  • Arts and Humanities
  • Computer Science
  • Chemical Engineering
  • Energy
  • Engineering
  • Environmental Science
  • Medicine

 

Agricultural and Biological Sciences

The top cited article between 2001 and 2011 in Agricultural and Biological Sciences is:
Tamura, K., Dudley, J., Nei, M., Kumar, S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0 (2007) Molecular Biology and Evolution, Vol. 24, No. 8, pp. 1596-1599.
Cited 17,359 times (as of June 2014).

MEGA [Molecular Evolutionary Genetics Analysis] is a freely available software tool for conducting statistical analysis of molecular evolution and for constructing phylogenetic trees. MEGA is used by biologists in a large number of laboratories for reconstructing the evolutionary histories of species and inferring the extent and nature of the selective forces shaping the evolution of genes and species (1). The software was first developed by Sudhir Kumar and Koichiro Tamura in the laboratory of Dr. Masatoshi Nei (2), with the first version released in 1993. As expected, the main disciplines citing this article are Agricultural and Biological Sciences, Biochemistry, Genetics and Molecular Biology, Immunology, Medicine and Veterinary Sciences. However, the article is also cited by several disciplines, including Social Sciences, Arts and Humanities and Business, which may not seem directly related to its core research field. A closer look at these citing disciplines reveals that the software has been used to analyze ancient DNA in Anthropology and Archeology, shedding light on the make-up of past populations (3, 4), as well as to study the phenomenon of the emergence and extinction of languages (5).

Comments from Prof. Kumar:

"This article described a useful software tool that enables comparative analysis of DNA and protein sequences from different individuals, strains, and species. Such analyses are becoming very important in this age of genomics, and increasingly larger numbers of scientists are using MEGA software to analyze their data."

Comments from Prof. Nei:

"MEGA4 is the fourth version of the MEGA, and in this version a new Maximum Composite Likelihood method of estimating evolutionary distances and other evolutionary parameters have been introduced. It has also been made usable in Linux and Intel-based Macintosh computers. Because of these new features, the MEGA4 article has been cited a large number of times. This improvement of the software was done primarily by Koichiro Tamura and Sudhir Kumar. Further improvement of the software was published later in the MEGA5 (2011) and MEGA6 (2013) articles."

 


Arts & Humanities

The top cited article between 2001 and 2011 in Arts & Humanities is:
McCall, L. The complexity of intersectionality (2005) Signs, Vol. 30, No. 3, pp. 1771-1800.
This article was cited 640 times (as of July 2014).

This article discusses the complexity of studying the issue of intersectionality and offers different methods to do so. Intersectionality (or intersectionalism) is the study of intersections between forms or systems of oppression, domination or discrimination (6). The article was written by Leslie McCall, a professor at Northwestern University whose main areas of research include social inequality, economic and political sociology, methods, and social theory. This article is highly cited by research papers in Arts & Humanities and Social Sciences in the context of gender-related psychology, ethnic identity and feminism. Yet it is also cited by Business and Management research focusing on women’s careers in business (7), workplace diversity (8) and women’s leadership skills development (9). Another interesting discipline citing this paper is Environmental Science, which refers to it in the context of gender-related climate change adaptation (10) and gendered migration patterns (11), to name two examples.

 Comments from Dr. McCall:

"I believe [the high citation count] has to do with interdisciplinary interest in the issue of intersectionality across a wide range of fields. I try to extend the usefulness of the concept for quantitative as well as qualitative research. The latter tends to dominate the study of intersectionality so this article has helped justify research in more quantitatively oriented fields."

 


Energy

The top cited article between 2001 and 2011 in the field of Energy is:
Allison, J., et al. Geant4 developments and applications (2006) IEEE Transactions on Nuclear Science, Vol. 53, No. 1, pp. 270-278.
This article was cited 1,450 times (as of July 2014).

Geant4 is a software tool developed by scientists from all over the world: the article boasts 44 authors from various countries including the UK, USA, Japan, Switzerland, Italy, Spain and Russia, to name a few. Geant4 is a software toolkit for the simulation of the passage of particles through matter. It is used for a large number of experiments and projects in a variety of application domains, including high energy physics, astrophysics and space science, medical physics and radiation protection (22). The article is mostly cited by articles in the fields of Physics and Astronomy and Engineering. In addition, a large number of citations come from the field of Medicine, where the toolkit is used to model the effects of particles passing through the human body (23).

Comments from Prof. Asai:

"“Geant4 developments and applications” is our second general publication followed by "Geant4 - A Simulation Toolkit", J.S. Agostinelli et al., Nuclear Instruments and Methods A, Vol. 506 (2003) 250-303. Geant4 is a software toolkit for simulating elementary particle passing through and interacting with matter. Its areas of application include high energy, nuclear and accelerator physics, as well as studies in medical science, space science and material science, which are rapidly expanding."

 

 

Chemical Engineering

The top cited article between 2001 and 2011 in the field of Chemical Engineering is:
Kreuer, K.D. On the development of proton conducting polymer membranes for hydrogen and methanol fuel cells. (2001) Journal of Membrane Science, Vol. 185, No. 1, pp. 29-39.
This article was cited 1,689 times (as of July 2014).

Proton conducting polymer membranes are of general interest because they can be used to conduct protons in fuel cells, which convert, for example, hydrogen or methanol into electrical energy and show promise as low emission power sources. To date, the benchmark membrane material has been Nafion, a sulfonated tetrafluoroethylene based fluoropolymer-copolymer discovered in the late 1960s by Walther Grot of DuPont. Nafion is used not only in fuel cells, but also in other electrochemical devices, chlor-alkali production, metal-ion recovery, water electrolysis, plating, surface treatment of metals, batteries, sensors, Donnan dialysis cells, drug release, gas drying or humidification, and superacid catalysis for the production of fine chemicals (17). The paper reveals structure/property relationships for Nafion and alternative hydrocarbon ionomers, and presents improved proton conducting polymer membranes (also known as polymer electrolyte membranes), along with methods for their manufacture (16). The article even offered visions of membranes conducting protons in the absence of any humidification. Due to the wide range of applications and the need for better membranes, this article is highly cited in Chemistry, Materials Science, Chemical Engineering and Energy.

 

Comment from Prof. Kreuer:

"I am aware of the impact this paper has generated in the community.

This is a pioneering work making, for the first time, a semi-quantitative connection between morphology (microstructure) and transport (proton conductivity, water transport) of fuel cell membranes (hydrocarbon versus PFSA). The disclosed differences provide rationales for explaining many other properties. The materials are highly relevant for fuel cell and other electrochemical applications, and the paper provides clear guidelines for optimizing such materials."

 


Computer Science

The top cited article between 2001 and 2011 in the field of computer science is:
Lowe, D.G. Distinctive image features from scale-invariant keypoints (2004) International Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110.
This article was cited 15,797 times (as of July 2014).

The paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene using an object recognition algorithm. The algorithm was published by David Lowe in 1999. Applications of this algorithm include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving. The algorithm is patented in the US; the owner is the University of British Columbia (18). In addition to being highly cited in related disciplines such as Engineering and Mathematics, this article and the method described are also cited in the Health, Decision and Social Sciences. In Health Sciences the method is used for organ imaging (19), while in Social Sciences it is used to study the processing and interpretation of visual images by humans (20). In Decision Sciences, the method has been used to study decision processing based on visual recognition, such as of street signs (21).

Comments from Prof. Lowe:

"The reasons for the high citations include the fact that it describes a useful algorithm for other researchers in computer vision to match images in a way that wasn't available previously. In addition, the method is very efficient compared to previous approaches, so it is widely used in practice which leads to further citations."

 


Engineering

The top cited article between 2001 and 2011 in the field of Engineering (focusing on Condensed Matter Physics) is:
Geim, A.K., Novoselov, K.S. The rise of graphene (2007) Nature Materials, Vol. 6, No. 3, pp. 183-191.
This article was cited 11,102 times (as of July 2014).

Graphene is pure carbon in the form of a very thin, nearly transparent sheet, one atom thick. It is remarkably strong for its very low weight (some 100 times stronger than steel) and it conducts heat and electricity with great efficiency. It was first produced in the lab in 2004 (24). This article discusses the nature and uses of graphene and the emergence of a new paradigm of 'relativistic' condensed matter physics.

Citing articles come from a wide spectrum of sciences including Materials Science, Chemistry, Energy, Pharmacology and Computer Science, in all of which graphene is used, studied and developed. Graphene is probably a good example of basic research leading to technological innovation. Thus, examining citations to this article in the Social Sciences, one notices that it is cited by papers describing the global graphene research front (25), patenting trends (26) and the use of graphene in technological developments (27), to name a few.

Comment from Prof. Geim:

"This paper should be viewed in combination with our paper “Electric field in atomically thin carbon films” (Science, 2004). Both are equally well cited as laying foundations for graphene research, a Nobel-prize winning subject."

 


Environmental Science

The top cited article between 2001 and 2011 in the field of Environmental Sciences is:
Kolpin, D.W., Furlong, E.T., Meyer, M.T., Thurman, E.M., Zaugg, S.D., Barber, L.B., Buxton, H.T. Pharmaceuticals, hormones, and other organic wastewater contaminants in U.S. streams, 1999-2000: A national reconnaissance (2002) Environmental Science and Technology, Vol. 36, No. 6, pp. 1202-1211.
This article was cited 3,279 times (as of July 2014).

The article was written by US Geological Survey researchers who used five newly developed analytical methods to measure concentrations of 95 OWCs (organic wastewater contaminants) in water samples from a network of 139 streams across 30 states during 1999 and 2000. This study represented the first national-scale investigation of pharmaceuticals and other OWCs in streams of the U.S. The results demonstrate the prevalence of pharmaceuticals and other OWCs in U.S. streams and the importance of obtaining data on metabolites to fully understand not only the fate and transport of OWCs in the hydrologic system, but also their ultimate overall effect on human health and the environment. As it touches on a wide range of environmental issues, this article is cited by articles in Chemistry, Agriculture, Medicine, Earth Sciences and so forth. However, it is worth noting its citations in law and regulation articles, which fall under the Social Sciences (28), as well as in Economics and Business articles examining policy issues related to OWCs (29).

Comments from Mr. Kolpin:

"Yes, I was aware that our ES&T article from 2002 was being highly cited by the scientific community. In fact, this research was noted as the most frequently cited paper in the field of environmental science since 2010 and was prominently used in the article “Top-cited articles in environmental sciences: Merits and demerits of citation analysis” (Khan, M.A. and Ho, Y-S., Sci. Total Environ., v. 431, p. 122-127).

There are probably multiple factors for the number of citations this paper has received, but I think the primary reason is that it has turned out to be a seminal paper on the occurrence of contaminants of emerging concern (CECs) in water resources and was the first national-scale study of such compounds conducted in the United States. If you look at the number of papers published annually on the topic of CECs you can see that since 2002 (the year our paper was published) there has been a continual and dramatic increase in the number of papers being published each year. This increasing trend in CEC papers published annually documents the ever increasing interest by the scientific community in the rapidly evolving topic of CECs. Thus, even though the percentage of papers citing our 2002 ES&T papers may be slowly decreasing with time it is likely being offset by the total number of papers being published on the topic (keeping the number of citations for our 2002 paper at a healthy pace)."

 


Medicine

The top cited article between 2001 and 2011 in the field of Medicine is:
Rossouw, J.E., et al. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: Principal results from the women's health initiative randomized controlled trial (2002) Journal of the American Medical Association, Vol. 288, No. 3, pp. 321-333.
This article was cited 9,723 times (as of July 2014).

The paper assesses the major health benefits and risks of estrogen plus progestin, the most commonly used combined hormone preparation in the United States, and finds that the overall health risks of the preparation exceeded its benefits. The study was conducted by a group of scientists from the Women's Health Initiative at the National Heart, Lung, and Blood Institute in the USA.

This article is cited in disciplines beyond the medicine-related ones, including the Social Sciences and Arts & Humanities. Although the article reports on a specific trial of drug prescription and its effect on women’s health, it evoked a wider discussion, seen in studies relating to health policy, women’s psychology and narratives of menopause (30, 31).

Comments from Prof. Rossouw:

"We are aware that this article was and continues to be highly cited. The findings overturned many decades of conventional wisdom, in particular that hormone therapy would prevent cardiovascular disease and that the benefits would outweigh the risks. As a result of this perception of benefit, menopausal hormone therapy was being prescribed to millions of women for chronic disease prevention in addition to its established role in treatment of vasomotor symptoms. After the contrary findings were published, prescriptions for estrogen plus progestin hormone therapy declined by 75% in the first 18 months and have continued to decline. Nationally, breast cancer rates have declined in parallel with hormone prescriptions. In short, the article had a substantial impact on medical practice and on public health."

 


Observations

It is noticeable that 4 out of the 10 articles featured here describe the development of computer software. The practice of citing the software used in a study contributes to this phenomenon: regardless of subject field, the computational tools developed and written about are highly cited.

Out of the 10 selected articles, 6 are the result of scientific collaboration between two or more researchers. Collaboration is seen across institutions and countries, which may partly reflect shared global concern about damaging environmental phenomena.

The analysis of citing disciplines shows that research, regardless of its disciplinary origin, crosses subject-specific domains and has impact on a wide range of areas, some of which are quite surprising. It is plausible that the growing ability of researchers to be exposed to and read a wider range of literature encourages the transfer of knowledge from one discipline to another.

 

Subject Article Link
Agricultural and Biological Sciences MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0 http://www.scopus.com/inward/record.url?eid=2-s2.0-54049133744&partnerID=40&md5=1d3cc2d08a900cac9195fc5449e6ff36
Arts and Humanities The complexity of intersectionality http://www.scopus.com/record/display.url?eid=2-s2.0-23944514914&origin=resultslist&sort=plf-f&cite=2-s2.0-23944514914&src=s&nlo=&nlr=&nls=&imp=t&sid=0F0EEB08EB8678DE6DA47EF4EB047038.I0QkgbIjGqqLQ4Nw7dqZ4A%3a240&sot=cite&sdt=cl&cluster=scopubyr%2c%222014%22%2ct&sl=0
Biochemistry, Genetics and Molecular Biology Analysis of relative gene expression data using real-time quantitative PCR and the 2-ΔΔCT method http://www.scopus.com/inward/record.url?eid=2-s2.0-0035710746&partnerID=40&md5=1989d15012db1b7616667232e06bbf50
Business, Management and Accounting User acceptance of information technology: Toward a unified view http://www.scopus.com/inward/record.url?eid=2-s2.0-1542382496&partnerID=40&md5=c635d7fd45a06a546dade8aea290c639
Chemical Engineering Processable aqueous dispersions of graphene nanosheets http://www.scopus.com/inward/record.url?eid=2-s2.0-38949108623&partnerID=40&md5=1f43c215908152f166755a05363f233c
Chemistry UCSF Chimera - A visualization system for exploratory research and analysis http://www.scopus.com/inward/record.url?eid=2-s2.0-4444221565&partnerID=40&md5=c9a4f4d426be1828e82f0f8e84537387
Computer Science Distinctive image features from scale-invariant keypoints http://www.scopus.com/inward/record.url?eid=2-s2.0-3042535216&partnerID=40&md5=28d20d21e532843d1243c5120505043a
Decision Sciences To parcel or not to parcel: Exploring the question, weighing the merits http://www.scopus.com/inward/record.url?eid=2-s2.0-0001378820&partnerID=40&md5=50b37bfa7ca10235aa008539bee136fb
Earth and Planetary Sciences First-year Wilkinson Microwave Anisotropy Probe (WMAP) observations: Determination of cosmological parameters http://www.scopus.com/inward/record.url?eid=2-s2.0-17044381941&partnerID=40&md5=36cf9cb4ba795948e7331117aa3096f2
Economics, Econometrics and Finance Evolving to a New Dominant Logic for Marketing http://www.scopus.com/inward/record.url?eid=2-s2.0-1642587247&partnerID=40&md5=12f7d97c9f3f71c84369a18c44c2220e


Scopus is celebrating 10 years since its launch. As the largest abstract and citation database of peer-reviewed literature available today, Scopus holds 53 million records and 21,915 titles from 5,000 publishers. In this article we identify some of the top cited papers indexed in Scopus across various disciplines between 2001 and 2011. In addition, we contacted the authors of these papers to ask why they think their papers are as highly cited as they are.

To achieve this, we conducted a comprehensive search of all Scopus data, limiting the results to articles published between 2001 and 2011. The initial search yielded more than 13 million records (as of June 11, 2014). This set was further refined to include only full research articles, excluding reviews, editorials and book chapters. The results were then limited to one of Scopus' 26 subject categories at a time (see Table 1 for the full list). Finally, each set of articles within a subject category was sorted by its "cited by" count (i.e. citations), allowing the most highly cited articles to be identified.
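As a sketch, the selection criteria above can be expressed as a query for the Scopus Search API. The field codes (DOCTYPE, SUBJAREA, PUBYEAR) follow Scopus query syntax, but the parameter names are illustrative assumptions and no API call is made here; the study itself was run in the Scopus interface.

```python
# Hypothetical sketch: the selection criteria as a Scopus Search API query.
# Field codes (DOCTYPE, SUBJAREA, PUBYEAR) follow Scopus query syntax; the
# surrounding parameter names are illustrative, and no request is sent.

def build_query(subject_area: str) -> dict:
    """Parameters for one subject category: full research articles only,
    published 2001-2011, most-cited first."""
    query = (
        "DOCTYPE(ar) "                    # articles only; excludes reviews, editorials, chapters
        f"AND SUBJAREA({subject_area}) "  # one of Scopus' 26 subject categories, e.g. AGRI
        "AND PUBYEAR > 2000 AND PUBYEAR < 2012"
    )
    return {"query": query, "sort": "-citedby-count", "count": 25}

params = build_query("AGRI")  # Agricultural and Biological Sciences
```

Repeating this for each of the 26 subject categories reproduces the per-discipline rankings discussed below.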

In this paper we review the following 8 subject areas and their top cited articles:

  • Agricultural and Biological Sciences
  • Arts and Humanities
  • Computer Science
  • Chemical Engineering
  • Energy
  • Engineering
  • Environmental Science
  • Medicine

 

Agricultural and Biological Sciences

The top cited article between 2001 and 2011 in Agricultural and Biological Sciences is:
Tamura, K., Dudley, J., Nei, M., & Kumar, S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0 (2007) Molecular Biology and Evolution, Vol. 24, No. 8, pp. 1596-1599.
This article was cited 17,359 times (as of June 2014).

MEGA (Molecular Evolutionary Genetics Analysis) is a freely available software tool for conducting statistical analysis of molecular evolution and for constructing phylogenetic trees. It is used by biologists in a large number of laboratories to reconstruct the evolutionary histories of species and to infer the extent and nature of the selective forces shaping the evolution of genes and species (1). The software was first developed by Sudhir Kumar and Koichiro Tamura in the laboratory of Dr. Masatoshi Nei (2), with the first version released in 1993. As expected, the main disciplines citing this article are Agricultural and Biological Sciences, Biochemistry, Genetics and Molecular Biology, Immunology, Medicine and Veterinary Sciences. However, the article is also cited by several disciplines that may not seem directly related to its core research field, including the Social Sciences, Arts and Humanities and Business. A closer look at these citing disciplines reveals that the software has been used to analyse ancient DNA in anthropology and archaeology (3, 4), as well as to study the emergence and extinction of languages (5).
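As a toy illustration of the kind of calculation MEGA automates, the sketch below estimates an evolutionary distance between two aligned DNA sequences. The simple Jukes-Cantor model is used here for brevity; MEGA4 itself introduced the more elaborate Maximum Composite Likelihood estimator mentioned in Prof. Nei's comments.

```python
# Toy sketch of an evolutionary distance of the kind MEGA computes, using the
# simple Jukes-Cantor model (not MEGA4's Maximum Composite Likelihood method).
import math

def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """d = -(3/4) * ln(1 - 4p/3), where p is the observed proportion of
    mismatching sites between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    return -0.75 * math.log(1 - 4 * p / 3)

# One mismatch in ten aligned sites (p = 0.1) gives d of roughly 0.107
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGAAC")
```

Pairwise distances like this one form the matrix from which programs such as MEGA then build phylogenetic trees.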

Comments from Prof. Kumar:

"This article described a useful software tool that enables comparative analysis of DNA and protein sequences from different individuals, strains, and species. Such analyses are becoming very important in this age of genomics, and increasingly larger numbers of scientists are using MEGA software to analyze their data."

Comments from Prof. Nei:

"MEGA4 is the fourth version of the MEGA, and in this version a new Maximum Composite Likelihood method of estimating evolutionary distances and other evolutionary parameters have been introduced. It has also been made usable in Linux and Intel-based Macintosh computers. Because of these new features, the MEGA4 article has been cited a large number of times. This improvement of the software was done primarily by Koichiro Tamura and Sudhir Kumar. Further improvement of the software was published later in the MEGA5 (2011) and MEGA6 (2013) articles."

 


Arts & Humanities

The top cited article between 2001 and 2011 in Arts & Humanities is:
McCall, L. The complexity of intersectionality (2005) Signs, Vol. 30, No. 3, pp. 1771-1800.
This article was cited 640 times (as of July 2014).

This article discusses the complexity of studying intersectionality and offers different methods for doing so. Intersectionality (or intersectionalism) is the study of intersections between forms or systems of oppression, domination or discrimination (6). The article was written by Leslie McCall, a professor at Northwestern University whose main areas of research include social inequality, economic and political sociology, methods, and social theory. The article is highly cited by research papers in Arts & Humanities and the Social Sciences in the context of gender-related psychology, ethnic identity and feminism. Yet it is also cited by Business and Management research focusing on women's careers in business (7), workplace diversity (8) and women's leadership development (9). Another interesting citing discipline is Environmental Sciences, which refers to it in the context of gender-related climate change adaptation (10) and gendered migration patterns (11), to name two examples.

 Comments from Dr. McCall:

"I believe [the high citation count] has to do with interdisciplinary interest in the issue of intersectionality across a wide range of fields. I try to extend the usefulness of the concept for quantitative as well as qualitative research. The latter tends to dominate the study of intersectionality so this article has helped justify research in more quantitatively oriented fields."

 


Energy

The top cited article between 2001 and 2011 in the field of Energy is:
Allison, J., et al. Geant4 developments and applications (2006) IEEE Transactions on Nuclear Science, Vol. 53, No. 1, pp. 270-278.
This article was cited 1,450 times (as of July 2014).

Geant4 is a software toolkit developed by scientists from all over the world; the article boasts 44 authors from various countries including the UK, USA, Japan, Switzerland, Italy, Spain and Russia. Geant4 simulates the passage of particles through matter and is used for a large number of experiments and projects in a variety of application domains, including high energy physics, astrophysics and space science, medical physics and radiation protection (22). The article is mostly cited by articles in Physics and Astronomy and in Engineering. In addition, a large number of citations come from Medicine, where the toolkit is used to simulate the effects of radiation on the human body (23).
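The principle behind such simulations can be sketched in a few lines: follow individual particles through matter by sampling random interaction points. The geometry and attenuation coefficient below are made-up illustrative numbers; Geant4 itself is a C++ toolkit with full physics models, not this simplification.

```python
# Toy Monte Carlo sketch of particle transport through matter. Numbers are
# illustrative only; this is not Geant4 physics or its API.
import random

def transmitted_fraction(mu: float, thickness: float,
                         n: int = 100_000, seed: int = 1) -> float:
    """Fraction of photons crossing a slab without interacting; the analytic
    answer is exp(-mu * thickness) (the Beer-Lambert law)."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(n):
        free_path = rng.expovariate(mu)  # sampled distance to first interaction
        if free_path > thickness:        # photon crosses the slab uninteracted
            survived += 1
    return survived / n

frac = transmitted_fraction(mu=0.5, thickness=2.0)  # analytic answer: exp(-1), about 0.368
```

Real toolkits extend this idea with detailed interaction physics, 3D geometry and secondary-particle tracking, which is what makes them useful across fields from accelerator physics to medicine.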

Comments from Prof. Asai:

"“Geant4 developments and applications” is our second general publication followed by "Geant4 - A Simulation Toolkit", J.S. Agostinelli et al., Nuclear Instruments and Methods A, Vol. 506 (2003) 250-303. Geant4 is a software toolkit for simulating elementary particle passing through and interacting with matter. Its areas of application include high energy, nuclear and accelerator physics, as well as studies in medical science, space science and material science, which are rapidly expanding."

 

 

Chemical Engineering

The top cited article between 2001 and 2011 in the field of Chemical Engineering is:
Kreuer, K.D. On the development of proton conducting polymer membranes for hydrogen and methanol fuel cells. (2001) Journal of Membrane Science, Vol. 185, No. 1, pp. 29-39.
This article was cited 1,689 times (as of July 2014).

Proton conducting polymer membranes are of general interest because they can be used to conduct protons in fuel cells, which convert hydrogen or methanol, for example, into electrical energy and show promise as low emission power sources. To date, the benchmark membrane material has been Nafion, a sulfonated tetrafluoroethylene-based fluoropolymer-copolymer discovered in the late 1960s by Walther Grot of DuPont. Nafion is used not only in fuel cells, but also in other electrochemical devices, chlor-alkali production, metal-ion recovery, water electrolysis, plating, surface treatment of metals, batteries, sensors, Donnan dialysis cells, drug release, gas drying or humidification, and superacid catalysis for the production of fine chemicals (17). The paper reveals structure/property relationships for Nafion and alternative hydrocarbon ionomers, and presents improved proton conducting polymer membranes (also known as polymer electrolyte membranes) along with methods for their manufacture (16). The article even offered visions of membranes conducting protons in the absence of any humidification. Due to the wide range of applications and the need for better membranes, this article is highly cited in Chemistry, Materials Science, Chemical Engineering and Energy.

 

Comment from Prof. Kreuer:

"I am aware of the impact this paper has generated in the community.

This is a pioneering work making, for the first time, a semi-quantitative connection between morphology (microstructure) and transport (proton conductivity, water transport) of fuel cell membranes (hydrocarbon versus PFSA). The disclosed differences provide rationales for explaining many other properties. The materials are highly relevant for fuel cell and other electrochemical applications, and the paper provides clear guidelines for optimizing such materials."

 


Computer Science

The top cited article between 2001 and 2011 in the field of Computer Science is:
Lowe, D.G. Distinctive image features from scale-invariant keypoints (2004) International Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110.
This article was cited 15,797 times (as of July 2014).

The paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The algorithm, known as the scale-invariant feature transform (SIFT), was first published by David Lowe in 1999. Its applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving. The algorithm is patented in the US, with the University of British Columbia as the owner (18). In addition to being highly cited in related disciplines such as Engineering and Mathematics, this article and the method it describes are also cited in the Health, Decision and Social Sciences. In Health Sciences the method is used for organ imaging (19), while in Social Sciences it is used, for example, to study how humans process and interpret visual images (20). In Decision Sciences, the method has been used to study decision processes based on visual recognition, such as recognising street signs (21).
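One well-known component of Lowe's method can be sketched compactly: the "ratio test", which accepts a descriptor match only when the nearest neighbour is clearly closer than the second-nearest. The 0.8 threshold follows the paper; the toy 2-D vectors below stand in for real 128-dimensional SIFT descriptors.

```python
# Minimal sketch of Lowe's nearest-neighbour "ratio test" for descriptor
# matching. Toy 2-D vectors stand in for 128-dimensional SIFT descriptors.

def match_descriptor(query, candidates, ratio=0.8):
    """Return the index of the best-matching candidate, or None when the two
    closest candidates are too similar for a reliable match."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    ranked = sorted(range(len(candidates)),
                    key=lambda i: dist(query, candidates[i]))
    best, second = ranked[0], ranked[1]
    if dist(query, candidates[best]) < ratio * dist(query, candidates[second]):
        return best
    return None  # ambiguous match, likely background clutter

good = match_descriptor([1.0, 0.0], [[1.0, 0.1], [5.0, 5.0], [-3.0, 2.0]])
bad = match_descriptor([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Rejecting ambiguous matches this way is a large part of why the method matches reliably across viewpoints, which in turn helps explain its heavy reuse and citation.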

Comments from Prof. Lowe:

"The reasons for the high citations include the fact that it describes a useful algorithm for other researchers in computer vision to match images in a way that wasn't available previously. In addition, the method is very efficient compared to previous approaches, so it is widely used in practice which leads to further citations."

 


Engineering

The top cited article between 2001 and 2011 in the field of Engineering (focusing on Condensed Matter Physics) is:
Geim, A.K., & Novoselov, K.S. The rise of graphene (2007) Nature Materials, Vol. 6, No. 3, pp. 183-191.
This article was cited 11,102 times (as of July 2014).

Graphene is pure carbon in the form of a very thin, nearly transparent sheet, one atom thick. It is remarkably strong for its very low weight (100 times stronger than steel) and conducts heat and electricity with great efficiency. It was first produced in the lab in 2004 (24). This article discusses the nature and uses of graphene and the emergence of a new paradigm of 'relativistic' condensed matter physics.

Citing articles come from a wide spectrum of sciences in which graphene is used, studied and developed, including Materials Science, Chemistry, Energy, Pharmacology and Computer Science. Graphene is a good example of basic research leading to technological innovation: examining citations to this article in the Social Sciences, one notices papers describing the global graphene research front (25), patenting trends (26) and the use of graphene in technological developments (27), to name a few.

Comment from Prof. Geim:

"This paper should be viewed in combination with our paper “Electric field effect in atomically thin carbon films” (Science, 2004). Both are equally well cited as laying foundations for graphene research, a Nobel-prize winning subject."

 


Environmental Science

The top cited article between 2001 and 2011 in the field of Environmental Sciences is:
Kolpin, D.W., Furlong, E.T., Meyer, M.T., Thurman, E.M., Zaugg, S.D., Barber, L.B., & Buxton, H.T. Pharmaceuticals, hormones, and other organic wastewater contaminants in U.S. streams, 1999-2000: A national reconnaissance (2002) Environmental Science and Technology, Vol. 36, No. 6, pp. 1202-1211.
This article was cited 3,279 times (as of July 2014).

The article was written by US Geological Survey researchers, who used five newly developed analytical methods to measure concentrations of 95 organic wastewater contaminants (OWCs) in water samples from a network of 139 streams across 30 states during 1999 and 2000. This was the first national-scale investigation of pharmaceuticals and other OWCs in streams of the U.S. The results demonstrate the prevalence of pharmaceuticals and other OWCs in U.S. streams and the importance of obtaining data on metabolites to fully understand not only the fate and transport of OWCs in the hydrologic system, but also their ultimate overall effect on human health and the environment. As it touches on a wide range of environmental issues, the article is cited in Chemistry, Agriculture, Medicine, Earth Sciences and so forth. However, it is also worth noting its citations in law and regulation articles, which fall under the Social Sciences (28), as well as in economics and business articles that examine policy issues related to OWCs (29).

Comments from Mr. Kolpin:

"Yes, I was aware that our ES&T article from 2002 was being highly cited by the scientific community. In fact, this research was noted as the most frequently cited paper in the field of environmental science since 2010 and was prominently used in the article “Top-cited articles in environmental sciences: Merits and demerits of citation analysis” (Khan, M.A. and Ho, Y-S., Sci. Total Environ., v. 431, p. 122-127).

There are probably multiple factors for the number of citations this paper has received, but I think the primary reason is that it has turned out to be a seminal paper on the occurrence of contaminants of emerging concern (CECs) in water resources and was the first national-scale study of such compounds conducted in the United States. If you look at the number of papers published annually on the topic of CECs you can see that since 2002 (the year our paper was published) there has been a continual and dramatic increase in the number of papers being published each year. This increasing trend in CEC papers published annually documents the ever increasing interest by the scientific community in the rapidly evolving topic of CECs. Thus, even though the percentage of papers citing our 2002 ES&T papers may be slowly decreasing with time it is likely being offset by the total number of papers being published on the topic (keeping the number of citations for our 2002 paper at a healthy pace)."

 


Medicine

The top cited article between 2001 and 2011 in the field of Medicine is:
Rossouw, J.E., et al. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: Principal results from the women's health initiative randomized controlled trial (2002) Journal of the American Medical Association, Vol. 288, No. 3, pp. 321-333.
This article was cited 9,723 times (as of July 2014).

The paper assessed the major health benefits and risks of estrogen plus progestin, the most commonly used combined hormone preparation in the United States, and found that the overall health risks exceeded the benefits. The study was conducted by a group of scientists from the Women's Health Initiative at the National Heart, Lung, and Blood Institute in the USA.

This article is cited in disciplines beyond medicine, including the Social Sciences and Arts & Humanities. Although it reports on a specific trial of a prescribed hormone therapy and its effect on women's health, it provoked a wider discussion, seen in studies of health policy, women's psychology and narratives of the menopause (30, 31).

Comments from Prof. Rossouw:

"We are aware that this article was and continues to be highly cited. The findings overturned many decades of conventional wisdom, in particular that hormone therapy would prevent cardiovascular disease and that the benefits would outweigh the risks. As a result of this perception of benefit, menopausal hormone therapy was being prescribed to millions of women for chronic disease prevention in addition to its established role in treatment of vasomotor symptoms. After the contrary findings were published, prescriptions for estrogen plus progestin hormone therapy declined by 75% in the first 18 months and have continued to decline. Nationally, breast cancer rates have declined in parallel with hormone prescriptions. In short, the article had a substantial impact on medical practice and on public health."

 


Observations

It is noticeable that four of the ten selected articles describe the development of computer software. The practice of citing the computer software used in a study contributes to this phenomenon: regardless of subject field, widely used computational tools attract large numbers of citations.

Of the ten selected articles, six are the result of scientific collaboration between two or more researchers. Collaboration is seen across institutions and countries, which may in part reflect shared global concerns such as environmental damage.

The analysis of citing disciplines shows that research, regardless of its disciplinary origin, crosses subject-specific domains and has an impact on a wide range of areas, some of which are quite surprising. It is plausible that researchers' growing ability to discover and read a wider range of literature encourages the transfer of knowledge from one discipline to another.

 

Subject Article Link
Agricultural and Biological Sciences MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0 http://www.scopus.com/inward/record.url?eid=2-s2.0-54049133744&partnerID=40&md5=1d3cc2d08a900cac9195fc5449e6ff36
Arts and Humanities The complexity of intersectionality http://www.scopus.com/record/display.url?eid=2-s2.0-23944514914
Biochemistry, Genetics and Molecular Biology Analysis of relative gene expression data using real-time quantitative PCR and the 2-ΔΔCT method http://www.scopus.com/inward/record.url?eid=2-s2.0-0035710746&partnerID=40&md5=1989d15012db1b7616667232e06bbf50
Business, Management and Accounting User acceptance of information technology: Toward a unified view http://www.scopus.com/inward/record.url?eid=2-s2.0-1542382496&partnerID=40&md5=c635d7fd45a06a546dade8aea290c639
Chemical Engineering Processable aqueous dispersions of graphene nanosheets http://www.scopus.com/inward/record.url?eid=2-s2.0-38949108623&partnerID=40&md5=1f43c215908152f166755a05363f233c
Chemistry UCSF Chimera - A visualization system for exploratory research and analysis http://www.scopus.com/inward/record.url?eid=2-s2.0-4444221565&partnerID=40&md5=c9a4f4d426be1828e82f0f8e84537387
Computer Science Distinctive image features from scale-invariant keypoints http://www.scopus.com/inward/record.url?eid=2-s2.0-3042535216&partnerID=40&md5=28d20d21e532843d1243c5120505043a
Decision Sciences To parcel or not to parcel: Exploring the question, weighing the merits http://www.scopus.com/inward/record.url?eid=2-s2.0-0001378820&partnerID=40&md5=50b37bfa7ca10235aa008539bee136fb
Earth and Planetary Sciences First-year Wilkinson Microwave Anisotropy Probe (WMAP) observations: Determination of cosmological parameters http://www.scopus.com/inward/record.url?eid=2-s2.0-17044381941&partnerID=40&md5=36cf9cb4ba795948e7331117aa3096f2
Economics, Econometrics and Finance Evolving to a New Dominant Logic for Marketing http://www.scopus.com/inward/record.url?eid=2-s2.0-1642587247&partnerID=40&md5=12f7d97c9f3f71c84369a18c44c2220e
Energy Geant4 developments and applications http://www.scopus.com/inward/record.url?eid=2-s2.0-33645696556&partnerID=40&md5=a5da91aed48b47270d579a3170e32b4c
Engineering The rise of graphene http://www.scopus.com/inward/record.url?eid=2-s2.0-33847690144&partnerID=40&md5=e7a10d1aae647a18ece362fa0c639319
Environmental Science Pharmaceuticals, hormones, and other organic wastewater contaminants in U.S. streams, 1999-2000: A national reconnaissance http://www.scopus.com/inward/record.url?eid=2-s2.0-0037085574&partnerID=40&md5=f0076a6d031995fc6468f66c7f172916
Immunology and Microbiology Improved prediction of signal peptides: SignalP 3.0 http://www.scopus.com/inward/record.url?eid=2-s2.0-3042521098&partnerID=40&md5=3e66f800ebc7630ff24f0b95467be33c
Materials Science The SIESTA method for ab initio order-N materials simulation http://www.scopus.com/inward/record.url?eid=2-s2.0-0037171091&partnerID=40&md5=521af3b42a3e8b8fc508c10c473d609b
Mathematics A fast and elitist multiobjective genetic algorithm: NSGA-II http://www.scopus.com/inward/record.url?eid=2-s2.0-0036530772&partnerID=40&md5=174c7328a283b2aaa5c3f7c2b7b900ae
Medicine Risks and benefits of estrogen plus progestin in healthy postmenopausal women: Principal results from the women's health initiative randomized controlled trial http://www.scopus.com/inward/record.url?eid=2-s2.0-0037125379&partnerID=40&md5=b20cf8258a09c26d78c48fc72cee6097
Neuroscience Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain http://www.scopus.com/inward/record.url?eid=2-s2.0-0036322886&partnerID=40&md5=e0c279770e722b228efd25fbcd86edbf
Pharmacology, Toxicology and Pharmaceutics Minimal criteria for defining multipotent mesenchymal stromal cells. The International Society for Cellular Therapy position statement http://www.scopus.com/inward/record.url?eid=2-s2.0-33747713246&partnerID=40&md5=931d063ca5e127676440830aedb7c972
Physics and Astronomy Statistical mechanics of complex networks http://www.scopus.com/inward/record.url?eid=2-s2.0-0036013593&partnerID=40&md5=19a1f060a576b614317e1f93740253d5
Psychology Using thematic analysis in psychology http://www.scopus.com/inward/record.url?eid=2-s2.0-33750505977&partnerID=40&md5=949c9a8170016855a4e4f5179927fd43
Social Sciences User acceptance of information technology: Toward a unified view http://www.scopus.com/inward/record.url?eid=2-s2.0-1542382496&partnerID=40&md5=c635d7fd45a06a546dade8aea290c639
Veterinary Reproductive Loss in high-producing dairy cattle: Where will it end (ADSA foundation scholar award) http://www.scopus.com/inward/record.url?eid=2-s2.0-0035379705&partnerID=40&md5=a312e535ebe24cd87f608f5606ba4230
Dentistry Stem cell properties of human dental pulp stem cells http://www.scopus.com/inward/record.url?eid=2-s2.0-0036704390&partnerID=40&md5=06d9d6cefdf5303e46583a04134c30e0

Table 1 - Full List of Top Cited Articles in Scopus (Data Collected July 2014)

 


Evaluating the individual researcher – adding an altmetric perspective

Judit Bar-Ilan presents a brief introduction to researcher evaluation using portfolios and discusses how altmetrics can be used within them.

Read more >


Introduction

ACUMEN was an EU-funded research project aimed at “understanding the ways in which researchers are evaluated by their peers and by institutions, and at assessing how the science system can be improved and enhanced” (1). The project was formed to answer an FP7 call that requested “studying and proposing alternative and broader ways of measuring the productivity and performance of individual researchers including new and improved bibliometric indicators and evaluation criteria for research careers, project evaluations, and scientific publications” (2). FP7 was the Seventh Framework Programme of the European Union for the funding of research and technological development in Europe. The ACUMEN consortium comprised nine institutions. The main outputs of the project are the ACUMEN portfolio and the Guidelines for Good Evaluation Practices (both available from the project website (1)).

In the following article we will provide a brief introduction to the portfolio concept and then concentrate on how altmetrics are utilized in the portfolio.

 

The ACUMEN Portfolio

The ACUMEN portfolio allows researchers to present themselves through a brief narrative highlighting their past achievements and future goals. This narrative is backed up by structured information in three sub-portfolios: expertise, output and influence. For each factor in the sub-portfolios, evidence is provided to support the claims; for example, researchers claiming specific methodological expertise back up this claim with references to works in which the method was applied.

A more detailed description of the three sub-portfolios:

  • In the expertise sub-portfolio there are factors for scientific/scholarly expertise, technological expertise, teaching expertise, knowledge transfer, communication skills and organizational expertise.
  • The output sub-portfolio comprises factors for scholarly outputs, teaching outputs, outputs communicated to the general public including online presence and online contributions, datasets, software and tools created by the researcher, patents and grants received.
  • The influence sub-portfolio provides information on citations and various citation-based indicators, scholarly prizes, prizes for teaching, membership in program committees and editorial boards, invited talks, advice given based on subject expertise, economic influence in terms of income, spin-offs, consultancies and patents, textbook sales, download counts of publications and datasets, followers on various social media platforms, Mendeley readership counts, tweets and blog posts about the researcher’s work, views of online presentations, online syllabi mentions and popular articles written about the portfolio owner.

Thus the portfolio provides a holistic view of the researcher’s achievements, expertise and influence. Most of the factors have detailed sub-factors, and information that the portfolio owner wishes to convey that does not match any of the above-mentioned factors can be provided in the “other” factor of each sub-portfolio. Since time spent in academia is crucial for fair evaluations, the ACUMEN project introduced the notion of “academic age”: the time elapsed since the PhD was awarded, with allowances for having children, illness and part-time work.
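The “academic age” idea above can be operationalised in a few lines. The sketch below is one possible reading, not ACUMEN’s official formula: it assumes allowances are expressed in months and simply subtracted from the elapsed time.

```python
from datetime import date

def academic_age(phd_awarded: date, today: date,
                 allowance_months: int = 0) -> float:
    """Years elapsed since the PhD was awarded, minus career-break
    allowances (parental leave, illness, part-time work) in months."""
    elapsed_years = (today - phd_awarded).days / 365.25
    return max(elapsed_years - allowance_months / 12, 0.0)

# Hypothetical case: PhD awarded mid-2005, evaluated mid-2014,
# with 18 months of allowances -> an academic age of about 7.5 years
print(round(academic_age(date(2005, 7, 1), date(2014, 7, 1), 18), 1))
```

Under this reading, two researchers with the same PhD date but different career breaks get different academic ages, which is exactly the fairness correction ACUMEN is after.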

As noted above, evidence is provided to back up the claims for each factor and sub-factor. The evidence is not everything that could possibly be listed, but only the “best” evidence for each factor, with no more than three items. What counts as “best” is decided subjectively by the researcher creating the portfolio, and separately for each factor: in the output sub-portfolio, for example, the portfolio owner is asked to list their top three journal papers, and in the influence sub-portfolio their top three most cited papers. The two sets may differ, for instance if the owner considers a recent work that has not yet accrued citations, or a less cited work, to be among their best contributions.

 

Altmetrics in the Portfolio

As can be seen from the description of the sub-portfolios, online and social media presence and altmetrics are well represented. In the portfolio, online presence is viewed as an output; researchers are asked to list accounts in social media used for academic purposes, academic network accounts, digital repository accounts and websites that were created or used for dissemination. These include academic social media sites such as ResearchGate and Academia.edu, sites where research outputs can be published such as SlideShare, figshare, YouTube or Vimeo, and also blogs and Twitter accounts. They are also asked to indicate their activity level (e.g. average number of posts per year or month) on these sites.

Altmetrics are even more emphasized in the influence sub-portfolio. The researcher is asked for the number of followers on social media sites where scholarly information is published or discussed; examples include Academia.edu, ResearchGate, Twitter and any blog(s) maintained by the portfolio owner. The guidelines for filling in the portfolio explain that these numbers should only be provided if they are considered substantial.

The researcher is asked to provide details of at most three articles that were tweeted or reviewed in blogs. It was shown recently (3) that articles reviewed in science blog posts close to their publication date have a good chance of being cited within three years, and receive more citations than the median for articles published in the same journal and year that were not reviewed in science blogs. Significant associations were also found between higher numbers of tweets and blog mentions and higher numbers of citations (4).

For the portfolio the researcher is requested to list download counts for a maximum of three publications. Some publishers provide this information, and download counts are also available for example from academia.edu and ResearchGate. The ACUMEN team is aware that influence cannot be measured through publications only; therefore download counts of the top three most downloaded datasets and software are also requested.

Mendeley readership counts are currently viewed as the most promising altmetric indicator (5). Mendeley has impressive coverage: for example, 93% and 94% of the articles published in 2007 in Science and Nature, respectively, are on Mendeley (6). Similarly, extremely high coverage (97%) was found for articles published in JASIST (Journal of the American Society for Information Science and Technology) between 2001 and 2011 (7). In (5), Mendeley covered only 63% of 20,000 random publications, but this was still by far the greatest coverage of all altmetric sources studied so far. In the ACUMEN portfolio, the user is requested to report the number of readers of up to three publications. Mendeley readership counts may be particularly useful in the Social Sciences and the Humanities, where the coverage of the citation databases (Web of Science and Scopus, and to a smaller extent Google Scholar) is far from perfect. Readership counts may also reflect influence in other areas, especially for newly published items that have not yet received many citations, because it takes much longer to cite an item than to become a “reader” of it.

On the other hand, it should be taken into account that it may take some time for a research result to prove its significance; receiving attention at a very early stage does not necessarily mean that the impact is stable over longer periods. In addition, populations that do not publish in the scholarly system (e.g. students) may also be interested in and influenced by scholarly work without being authors (and citers); Mendeley readership counts capture the influence of scholarly work on these non-publishing, interested readers as well. This is supported by correlations of around 0.5 found in several studies between readership counts and citations, indicating that Mendeley readership counts reflect impact that differs from the impact reflected by citation counts (8). Indeed, it was shown (9) that PhD students, postgraduates and postdocs are the main readers of articles in Mendeley.
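As a toy illustration of such a readership/citation correlation, the sketch below computes Spearman’s rank correlation (a common choice in these studies) in plain Python. The readership and citation counts are invented for illustration, and the simple ranking assumes no tied values.

```python
# Invented per-article Mendeley readership and citation counts (no ties)
readers   = [12, 45, 3, 80, 27, 9, 60, 15]
citations = [ 5, 30, 1, 22, 40, 2, 25,  4]

def ranks(xs):
    """Rank 1..n for distinct values (smallest value gets rank 1)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(xs, ys):
    """Spearman's rho = Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mean = (len(xs) + 1) / 2                  # mean rank of 1..n
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)    # identical for rx and ry
    return cov / var

rho = spearman(readers, citations)
print(round(rho, 2))  # → 0.74 for this invented sample
```

A moderate positive rho of this kind is what the cited studies report: readership tracks citations, but far from perfectly, which is why readership is argued to capture a partly different kind of impact.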

Educational impact can also be measured by altmetrics. Many universities have YouTube channels where they upload videos of lectures (e.g. the YaleCourses YouTube channel), conferences often upload videos of talks to the Web, and presentations can be uploaded to SlideShare. Interest in the materials available on these sites can be measured by the number of downloads and/or views. Finally, if works of the portfolio owner are referenced in online syllabi, this indicates educational impact of the work. Download counts and views of the “top” items in these categories are reported in the portfolio. In addition, researchers are encouraged to provide details of three interesting web mentions of themselves, or of their work, not mentioned elsewhere. Thus the altmetric data in the portfolio supplement information on scientific impact and also reflect the societal impact of researchers and their work.

 

Discussion and conclusion

Altmetrics is an emerging subfield of informetrics, and there are currently no clear guidelines on how to interpret the altmetric data in the portfolio. This is problematic both for the person filling in the portfolio and for the evaluator receiving portfolios. The best advice ACUMEN can offer at the moment is to compare with other researchers in the same field and at the same career stage. Traditional bibliometrics rely mainly on citations, whereas altmetrics draw on a multitude of sources; this further complicates interpretation, since we do not know how to (and probably cannot and should not) compare tweets, downloads, blog mentions and readership counts with one another. We are also aware that some altmetric indicators can be manipulated quite easily.

The aim of the ACUMEN Portfolio is to provide a holistic picture of the researcher’s achievements and capabilities. To achieve this aim it is necessary to include as many facets of the achievements as possible. The ACUMEN team believes that altmetric data complement traditional bibliometric data; they indicate influence not necessarily captured by citations, and thus provide additional value.

The ACUMEN portfolio can also be used for self-assessment. The portfolio template is available here (10), and readers are most welcome to create their own portfolio. But beware: preparing the portfolio is quite time-consuming. Have fun!

 

References

(1)    ACUMEN (2014). Retrieved from http://research-acumen.eu/
(2)    European Commission (2009). Work Programme 2010. Capacities. Science in Society.
(3)    Shema, H., Bar-Ilan, J. & Thelwall, M. (2014) “Do blog citations correlate with a higher number of future citations? Research blogs as a potential source for alternative metrics”, Journal of the Association for Information Science and Technology, Vol. 65, No. 5, pp.1018-1027.
(4)    Thelwall, M., Haustein, S., Larivière, V. & Sugimoto, C. (2013) “Do altmetrics work? Twitter and ten other candidates”, PLOS ONE, Vol. 8, No. 5, e64841. doi:10.1371/journal.pone.0064841
(5)    Zahedi, Z., Costas, R. & Wouters, P. (in press) “How well developed are altmetrics? Cross disciplinary analysis of the presence of ‘alternative metrics’ in scientific publications”, Scientometrics, DOI:10.1007/s11192-014-1264-0
(6)    Li, X., Thelwall, M. & Giustini, D. (2012) “Validating online reference managers for scholarly impact measurement”, Scientometrics, Vol. 91, No. 2, pp. 461-471.
(7)    Bar-Ilan, J. (2012) “JASIST@Mendeley". Presented at altmetrics12, ACM Web Science Conference 2012 Workshop, Evanston, IL, 21 June 2012. Retrieved from http://altmetrics.org/altmetrics12/bar-ilan/
(8)    Bar-Ilan, J., Shema, H. & Thelwall, M. (2014) “Bibliographic References in Web 2.0”. In: Blaise Cronin and Cassidy R. Sugimoto (Eds.) Beyond Bibliometrics – Harnessing Multidimensional Indicators of Scholarly Impact (pp. 307-325). Cambridge, MA: MIT Press.
(9)    Mohammadi, E., Thelwall, M., Haustein, S. & Larivière, V. (in press) “Who reads research articles? An altmetrics analysis of Mendeley user categories”, Journal of the Association for Information Science and Technology.
(10)   http://research-acumen.eu/wp-content/uploads/Blank_AcumenPortfolio.v13x.pdf

 

 

 


 


Gauging openness, measuring impact

In this article, William Gunn discusses the linked concepts of openness and usability as applied to scholarly works. How is openness defined and how is research reused by others?

Read more >


Introduction

This article examines the linked concepts of openness and usability as applied to scholarly works. Openness is used to mean many different things, from transparency about influence when used in a political context, to the lack of restrictions on use when used in a software context. In the scholarly domain, openness generally refers to unrestricted, free availability of a research product over the internet. A work is considered open if there are no permission or price barriers between the work and an individual seeking to make use of the work. However, there are different levels of openness, which are defined by the types of reuse permitted.

Sir Tim Berners-Lee introduced the concept of 5-star open data back in 2006 to describe the continuum from a table rendered as a PDF through to data marked up as RDF and connected to the web of Linked Open Data (1). This system clearly explained the benefits of open data by demonstrating how more value is added at each successive step of openness.

Figure 1 – Tim Berners-Lee’s 5-star open data scale

A similar continuum applies to scholarly works: the more open a work is, the more useful it is to both the author and the audience (2). The first level is simply availability online, as opposed to only as a printed copy. The next level is free to read: the paper can be read without any subscription barriers. A work which is explicitly openly licensed is more open still, but the variety of open licenses leaves many works encumbered with provisions that make reuse impractical other than on an item-by-item level (3). Using a license without those provisions is a further level up; this is the level of the accepted standard license for open access works, CC-BY (4). With CC-BY there are no explicit barriers to reuse, up to the point where simply tracking and attributing all the providers becomes an unmanageable task. The final level is fully open, with no restrictions of any kind, as with CC0.

Each of these levels raises the ceiling on the amount of reuse possible, while making no statement about the desirability of the work or the sustainability of access. Simply put, a work that is more open has, in theory, higher usability than one that is less open. If an open work is also useful to a sufficient number of people, sustainability of access is generally easier to maintain than for closed works, through the LOCKSS principle (5): at least one copy will exist for each researcher who finds the work useful. In contrast, closed works can fall into “orphan” status, where reproduction is desired but not permitted because the rights holder can no longer be identified. Openness is particularly important for works that may require a long incubation time before finding their full potential; indeed, many great historical works would have been lost were it not for the diligent copying and recopying by centuries of scribes.

 

What kinds of reuse exist?

The ways in which research can be reused can be divided into five general categories based on application: inspiring new research, mining existing data for novel associations, application or implementation, contribution to the popular understanding, and meta-analysis. The various types of reuse and how these can be tracked for discovery and assessment, briefly discussed below, will be the subject of a forthcoming NISO whitepaper.

The first kind of reuse, inspiring new research, is well covered by the traditional databases which track citations, but is limited in that a subsequent piece of research points to a prior piece, but the prior piece does not reciprocally point back to the subsequent research it inspired. This type of reuse is inhibited through lack of access to the research. Additionally, the pointer is at the document level, which gives poor resolution of the details of the reuse. Another needed improvement for understanding citation behavior is to enrich a citation by adding distinguishing characteristics that would allow the different types of citations to be distinguished from one another. See the Citation Typing Ontology (CiTO) for the current work in this area (6).

Tracking mining of datasets, the second category of reuse, is often done via tracking the papers which describe them (7). However, more datasets are appearing on sites such as Figshare and Dryad, which assign DOIs (Digital Object Identifiers) to the data directly (8), instead of just a paper describing the data. Creating URIs (Uniform Research Identifiers) which point to the data directly promotes the data to equal standing with a research paper, because the data can now be referenced directly and can accrue reuse separately from the paper. As with citation of papers, access to data is a barrier to reuse, but technical skills and equipment to handle the data are also needed.

When you move out of the scholarly realm and into applications, there are less explicit mentions of the original works themselves. Detection of a reuse event in a commercial application can be done via looking for references in patent applications or publications arising from academic/industry collaborations, but this only shows first-order impact at best. As you move further away from the publication into the inventions or policies that it may have enabled or informed, the trail gets very difficult to follow, even as the raw number of possible reuse events grows. This is where individual efforts such as the implementation of a Becker Model analysis (9) become necessary, though this is prohibitive to do at scale.

Looking at the reuse of a scholarly work by the public is done much as with an application or implementation. The main source of reuse events in this category are mentions in popular media, although there is a significant “long tail” of lay communities online which discuss research: patient communities, space aficionados, citizen scientists, and teachers in non-professorial roles. Interestingly, PubMed Central reports that the majority of the page views to research papers hosted there come from non-institutional domains (10). Another notable feature of reuse within the public domain is that the direction of flow is reversed: external events such as natural disasters, celebrity endorsements, or other news events often drive increased public reuse events (11, 12), whereas availability of a technology facilitates the application.

Meta-analysis is its own category of reuse. There is a growing movement to conduct and publish replication studies of existing work, such as the Reproducibility Initiative and the Reproducibility Project: Cancer Biology, a partnership between the Reproducibility Initiative and the Center for Open Science. The aims of these projects are to understand and promote replication of research as a type of reuse. The replication studies contain pointers to the original research and explicitly identify which experiments were carried out and what the results were. This enables the creation of a separate discovery layer, to highlight and identify the more reproducible or the most reusable work, facilitating downstream commercial application or reduction to practice.

 

Bootstrapping discovery of reuse

Open Access and Open Data have now become funder priorities across the world. Because funding agencies such as the NIH and Wellcome are now paying for openness in order to maximize the reuse potential of their funded outputs, it has become important be able to aggregate reuse events and to understand their relative impacts. Detecting a reuse event is challenging with current technology, primarily because reuse events don’t always point back to the original item. To serve these needs, the Association for Research Libraries, with funding from Sloan and the Institute for Museum and Library Services, is building the Shared Access Research Ecosystem, an event aggregator, which will consume data sources which report on research events. Additionally, the scholarly metadata organization CrossRef is working on a service called Prospect, which aims to facilitate text and data mining of proprietary content (i.e., the data is open at the one star level, but efforts are made to make it as usable as possible). Together with technologies such as Mendeley and Impact Story, we are developing an ever clearer understanding of the importance and value of openness to the research world and society at large.

 

References

(1) 5 star Open Data. Available at: http://5stardata.info/
(2) Piwowar, H.A., Day, R.B.S. & Fridsma, D.S.B. (2007) “Sharing detailed research data is associated with increased citation rate”, PLOS One, Vol. 2, No. 3, e308.
(3) Dryad (2011) “Why does Dryad use CC0?”, Dryad news and views on WordPress.com. Available at: http://blog.datadryad.org/2011/10/05/why-does-dryad-use-cc0/
(4) Open Access Scholarly Publishers Association (2012) “Why CC-BY?”. Available at: http://oaspa.org/why-cc-by
(5) LOCKSS, “Preservation Principles”. Available at: http://www.lockss.org/about/principles
(6) S. (2013), “CiTO, the Citation Typing Ontology”. Available at: http://www.essepuntato.it/lode/http://purl.org/spar/cito
(7) Piwowar, H.A. (2010) PhD Thesis: “Foundational studies for measuring the impact, prevalence, and patterns of publicly sharing biomedical research data”, D-Scholarship@Pitt (Database). Available at: http://etd.library.pitt.edu/ETD/available/etd-04152010-115953/
(8) Piwowar, H.A. & Vision, T.J. (2013) Data from: “Data reuse and the open data citation advantage”. Available at: http://doi.org/10.5061/dryad.781pv
(9) Holmes, K.L. & Sarli, C.C. (2013) “The Becker Medical Library Model for assessment of research impact – an Interview with Cathy C. Sarli and Kristi L. Holmes”, Research Trends, Issue 34. Available at: https://www.researchtrends.com/issue-34-september-2013/the-becker-medical-library-model/
(10) Plutchak, T.S. (2005) “The impact of open access”, J. Med. Libr. Assoc., Vol. 93, No. 4, pp. 419-421.
(11) Bauer, M.W., Allum, N. & Miller, S. (2007) “What can we learn from 25 years of PUS survey research? Liberating and expanding the agenda”, Public Underst. Sci., Vol. 16, No. 1, pp. 79–95.
(12) Monzen, S. et al. (2011) “Individual radiation exposure dose due to support activities at safe shelters in Fukushima Prefecture”, PLOS One, Vol. 6, e27761.
VN:F [1.9.22_1171]
Rating: 0.0/10 (0 votes cast)

Introduction

This article examines the linked concepts of openness and usability as applied to scholarly works. Openness is used to mean many different things, from transparency about influence when used in a political context, to the lack of restrictions on use when used in a software context. In the scholarly domain, openness generally refers to unrestricted, free availability of a research product over the internet. A work is considered open if there are no permission or price barriers between the work and an individual seeking to make use of the work. However, there are different levels of openness, which are defined by the types of reuse permitted.

Sir Tim Berners-Lee introduced the concept of 5 star open data in 2010 to describe the continuum from a table rendered as a PDF through to data marked up as RDF and connected to the web of Linked Open Data (1). This scheme clearly explained the benefits of open data by demonstrating how more value is added at each successive step of openness.

Figure 1 – Tim Berners-Lee’s 5 star open data scale

A similar scenario applies to scholarly works: the more open a work is, the more useful it is to both the author and the audience (2). The first level is simple availability online, as opposed to only in print. The next level is free to read: the paper can be read without any subscription barrier. A work which is explicitly openly licensed is more open still, but the variety of open licenses leaves many works encumbered with provisions that make reuse impractical other than on an item-by-item basis (3). Using a license without those provisions is a further level up; this is the level of the accepted standard license for open access works, CC-BY (4). With CC-BY there are no explicit barriers to reuse, up to the point where simply tracking and attributing all the providers becomes an unmanageable task. The final level is fully open, with no restrictions of any kind, as with CC0.

Each of these levels raises the ceiling on the amount of reuse possible, while making no statement about the desirability of the work or the sustainability of access. Simply put, a work that is more open has, in theory, higher usability than one that is less open. If an open work is also useful to a sufficient number of people, sustainability of access is generally easier to maintain than for closed works, through the LOCKSS principle (5): at least one copy will exist for each researcher who finds it useful. In contrast, closed works can fall into “orphan” status, where reproduction is desired but not permitted because the rights holder can no longer be identified. Openness is particularly important for works that may require a long incubation time before finding their full potential. Indeed, many great historical works would have been lost were it not for diligent copying and recopying by centuries of scribes.
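The cumulative levels described above can be sketched as a tiny scoring function. This is an illustration of the article’s argument only; the numeric levels and cut-offs are our own framing, not a formal standard:

```python
def openness_level(online, free_to_read, openly_licensed,
                   liberal_license, no_restrictions):
    """Assign a 0-5 'openness level' to a scholarly work, by analogy
    with the 5 star open data scale. Each step only counts if all the
    previous steps are satisfied, mirroring the cumulative levels in
    the text (online -> free to read -> open license -> CC-BY -> CC0)."""
    level = 0
    for step in (online, free_to_read, openly_licensed,
                 liberal_license, no_restrictions):
        if not step:
            break
        level += 1
    return level

# A free-to-read paper under a restrictive open license stops at level 3;
# a CC0 work reaches the full 5.
print(openness_level(True, True, True, False, False))  # 3
print(openness_level(True, True, True, True, True))    # 5
```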

 

What kinds of reuse exist?

The ways in which research can be reused can be divided into five general categories based on application: inspiring new research, mining existing data for novel associations, application or implementation, contribution to the popular understanding, and meta-analysis. The various types of reuse and how these can be tracked for discovery and assessment, briefly discussed below, will be the subject of a forthcoming NISO whitepaper.

The first kind of reuse, inspiring new research, is well covered by the traditional databases which track citations, but with limitations: a subsequent piece of research points to a prior piece, yet the prior piece does not reciprocally point back to the subsequent research it inspired. This type of reuse is inhibited by lack of access to the research. Additionally, the pointer is at the document level, which gives poor resolution of the details of the reuse. Another needed improvement for understanding citation behavior is to enrich citations with characteristics that distinguish the different types of citation from one another; see the Citation Typing Ontology (CiTO) for current work in this area (6).
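CiTO makes those “distinguishing characteristics” concrete by replacing a generic cites relation with typed properties. A minimal stdlib-only sketch follows; the two DOIs are placeholders, while `extends`, `disputes`, and `usesDataFrom` are real CiTO property names:

```python
# The SPAR namespace under which CiTO's typed citation properties live.
CITO = "http://purl.org/spar/cito/"

def typed_citation(citing_doi, cito_property, cited_doi):
    """Build an RDF-style (subject, predicate, object) triple stating
    not just *that* one work cites another, but *how* it does so
    (e.g. 'extends', 'disputes', 'usesDataFrom')."""
    return (f"https://doi.org/{citing_doi}",
            CITO + cito_property,
            f"https://doi.org/{cited_doi}")

s, p, o = typed_citation("10.1000/new.paper", "extends", "10.1000/old.paper")
print(p)  # http://purl.org/spar/cito/extends
```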

Tracking the mining of datasets, the second category of reuse, is often done by tracking the papers which describe them (7). However, more datasets are appearing on sites such as Figshare and Dryad, which assign DOIs (Digital Object Identifiers) to the data directly (8), rather than only to a paper describing the data. Creating URIs (Uniform Resource Identifiers) which point to the data directly promotes the data to equal standing with a research paper, because the data can now be referenced directly and can accrue reuse separately from the paper. As with citation of papers, access to data is a barrier to reuse, but technical skills and equipment to handle the data are also needed.
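Promoting a dataset to equal standing is mechanically simple once it has a DOI: the doi.org proxy turns any DOI into a resolvable, citable URI. A sketch, using the Dryad dataset from reference (8):

```python
def doi_to_uri(doi):
    """Turn a bare DOI into a resolvable URI via the doi.org proxy,
    so a dataset can be cited and tracked with the same mechanics
    as a paper."""
    return "https://doi.org/" + doi.strip()

print(doi_to_uri("10.5061/dryad.781pv"))
# https://doi.org/10.5061/dryad.781pv
```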

When you move out of the scholarly realm and into applications, there are fewer explicit mentions of the original works themselves. A reuse event in a commercial application can be detected by looking for references in patent applications or in publications arising from academic/industry collaborations, but this shows first-order impact at best. As you move further away from the publication and into the inventions or policies it may have enabled or informed, the trail becomes very difficult to follow, even as the raw number of possible reuse events grows. This is where individual efforts such as the implementation of a Becker Model analysis (9) become necessary, though this is prohibitive to do at scale.

Reuse of a scholarly work by the public is detected much as with an application or implementation. The main sources of reuse events in this category are mentions in popular media, although there is a significant “long tail” of lay communities online which discuss research: patient communities, space aficionados, citizen scientists, and teachers in non-professorial roles. Interestingly, PubMed Central reports that the majority of page views to research papers hosted there come from non-institutional domains (10). Another notable feature of public reuse is that the direction of flow is reversed: external events such as natural disasters, celebrity endorsements, or other news events often drive increased public reuse (11, 12), whereas in the applications category it is the availability of a technology that facilitates its application.

Meta-analysis is a category of reuse in its own right. There is a growing movement to conduct and publish replication studies of existing work, through efforts such as the Reproducibility Initiative and the Reproducibility Project: Cancer Biology, a partnership between the Reproducibility Initiative and the Center for Open Science. The aim of these projects is to understand and promote replication of research as a type of reuse. The replication studies contain pointers to the original research and explicitly identify which experiments were carried out and what the results were. This enables the creation of a separate discovery layer to highlight the most reproducible or most reusable work, facilitating downstream commercial application or reduction to practice.

 

Bootstrapping discovery of reuse

Open Access and Open Data have now become funder priorities across the world. Because funding agencies such as the NIH and the Wellcome Trust are now paying for openness in order to maximize the reuse potential of their funded outputs, it has become important to be able to aggregate reuse events and to understand their relative impacts. Detecting a reuse event is challenging with current technology, primarily because reuse events do not always point back to the original item. To serve these needs, the Association of Research Libraries, with funding from the Sloan Foundation and the Institute of Museum and Library Services, is building the Shared Access Research Ecosystem (SHARE), an event aggregator which will consume data sources that report on research events. Additionally, the scholarly metadata organization CrossRef is working on a service called Prospect, which aims to facilitate text and data mining of proprietary content (i.e., the data is open at the one-star level, but efforts are made to make it as usable as possible). Together with technologies such as Mendeley and Impactstory, we are developing an ever clearer understanding of the importance and value of openness to the research world and society at large.
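What an event aggregator like SHARE consumes can be pictured as a stream of small records. The field names below are illustrative guesses for the sake of the sketch, not the SHARE schema:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ReuseEvent:
    """One reuse event as an aggregator might record it."""
    source: str      # reporting service, e.g. "twitter" or "patent-db"
    target: str      # DOI or URL of the work that was reused
    category: str    # one of the article's five reuse categories
    occurred: date

# A made-up public mention of the Piwowar 2007 paper from reference (2):
event = ReuseEvent("twitter",
                   "https://doi.org/10.1371/journal.pone.0000308",
                   "popular-understanding",
                   date(2014, 4, 1))
print(event.category)  # popular-understanding
```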

 

References

(1) 5 star Open Data. Available at: http://5stardata.info/
(2) Piwowar, H.A., Day, R.B.S. & Fridsma, D.S.B. (2007) “Sharing detailed research data is associated with increased citation rate”, PLOS One, Vol. 2, No. 3, e308.
(3) Dryad (2011) “Why does Dryad use CC0?”, Dryad news and views on WordPress.com. Available at: http://blog.datadryad.org/2011/10/05/why-does-dryad-use-cc0/
(4) Open Access Scholarly Publishers Association (2012) “Why CC-BY?”. Available at: http://oaspa.org/why-cc-by
(5) LOCKSS, “Preservation Principles”. Available at: http://www.lockss.org/about/principles
(6) Peroni, S. (2013) “CiTO, the Citation Typing Ontology”. Available at: http://www.essepuntato.it/lode/http://purl.org/spar/cito
(7) Piwowar, H.A. (2010) PhD Thesis: “Foundational studies for measuring the impact, prevalence, and patterns of publicly sharing biomedical research data”, D-Scholarship@Pitt (Database). Available at: http://etd.library.pitt.edu/ETD/available/etd-04152010-115953/
(8) Piwowar, H.A. & Vision, T.J. (2013) Data from: “Data reuse and the open data citation advantage”. Available at: http://doi.org/10.5061/dryad.781pv
(9) Holmes, K.L. & Sarli, C.C. (2013) “The Becker Medical Library Model for assessment of research impact – an Interview with Cathy C. Sarli and Kristi L. Holmes”, Research Trends, Issue 34. Available at: https://www.researchtrends.com/issue-34-september-2013/the-becker-medical-library-model/
(10) Plutchak, T.S. (2005) “The impact of open access”, J. Med. Libr. Assoc., Vol. 93, No. 4, pp. 419-421.
(11) Bauer, M.W., Allum, N. & Miller, S. (2007) “What can we learn from 25 years of PUS survey research? Liberating and expanding the agenda”, Public Underst. Sci., Vol. 16, No. 1, pp. 79–95.
(12) Monzen, S. et al. (2011) “Individual radiation exposure dose due to support activities at safe shelters in Fukushima Prefecture”, PLOS One, Vol. 6, e27761.

Science without borders: are technology and policy limiting internationalization?

A conversation between Juan Pablo Alperin and Mike Taylor on altmetrics, Latin American research and ways to improve international scholarly communication.

Read more >


Juan Pablo Alperin (@juancommander) is a PhD candidate in the Stanford Graduate School of Education and a researcher and systems developer with the Public Knowledge Project (PKP). Juan leads several research and development projects on improving the quality, impact, and reach of Latin American research, and is currently studying the alternative and public impact of open access.

Mike Taylor is a researcher with Elsevier Labs.

MT: Juan, I heard you speak last year at the PLOS Article Level Metrics workshop in San Francisco. You gave a very powerful presentation on some of the problems facing researchers and journals based in the developing world. In particular, I was struck by your observation that when the developing world decides to innovate the use of things that we take for granted - for example the Impact Factor or DOIs (Digital Object Identifiers) - we effectively exclude many researchers who don’t have access. In your recent blog posting (1), you state that only 4% of Latin American journals are indexed by Web of Science (WoS), and that it’s argued that the excluded journals don’t fall into the “mainstream” of science. To what extent do you feel that the category of mainstream is defined by access to technology?

JPA: I do not think that "mainstream science" is itself defined by access to technology. Scholarship is a networked process, which naturally lends itself well to a core-periphery framing. It is not my preferred characterization, but one that is arguably a reality. That is, if we were to network all the literature or form a network of all those contributing to scholarship, we may be able to identify that there is, in fact, a core which could be said to be the “mainstream”.

What has been achieved through technology is to demarcate what should be considered for inclusion in that overall network; for example, if your articles are contained in an abstract and index database such as Scopus or WoS, then your work can be entered into citation analysis and therefore be considered part of the mainstream. To make matters worse for those that lack access, technologies provide a way of essentially excluding in a way that appears to be democratic and objective, but is actually far from being either.

This is not to say that technology cannot also be used for eliminating boundaries. Google Scholar is an example that offers results from small, independent journals next to those from large commercial publishers in a way that blurs the distinction between the two. It is not uncommon to find technology optimists who think that all technologies are equally unifying. The reality, however, is that access to technology can just as easily foment a false dichotomy, creating two classes of scholars (those that have access and those that do not), with the consequence that the scholarship of those in the latter group is perceived to be inherently less valuable.

MT: At a conference in Mexico recently, I heard a speech by Abel Packer of SciELO Brazil (2) on the threat that emerging mega-journals may pose to local research journals. In short, the argument was that while these new platforms are more attractive to researchers (they provide international visibility and access to DOIs, the JIF, etc., while frequently being able to waive fees), the inevitable migration will lead to a decrease in the use of local journals; and as these become less popular and less attractive to authors (particularly those writing in English), the potential loss of local journals will mean the loss of a valuable part of the academic infrastructure - for example, editorial boards, peer review, conferences and workshops. Do you share this concern, or is the gradual death of local publishing inevitable? What do local journals have to offer that mega-journals do not?

JPA: Local, institutional, and student journals serve as an important learning ground for novice scholars to learn the ropes about communicating scholarship and, as you mention, they play a critical role in the research infrastructure. Their demise would be tragic: it would weaken research culture, yield more of the research agenda to those running mega-journals, and eliminate the necessary stepping stones for scholars to improve their research communication to the standards of their international peers. Given their critical importance, yes, I do worry about their decline.

However, I do not think it is inevitable or even imminent, at least not in Latin America, although there is definitely a risk. The funding model in Latin America has been very different than in the North. Currently, APCs (article processing charges) are virtually non-existent and most journals are funded through public funds (primarily funds channeled through public universities). So far, government agencies have been reluctant to shift financing from local journals to APCs, and I hope it remains this way. Unlike subscriptions or APCs, the current financial model in Latin America excludes neither reader nor writer. That said, if the APC model becomes the only model for Open Access elsewhere, it may begin to take hold within Latin America. If that happens, then international mega-journals will likely end up killing the local journals.

MT: I’m curious on the independence of this form of funding in Latin America - the extent to which it’s subject to governmental policy or not. Generally speaking, do the funds that support journals come directly from Government, or are there intermediate bodies - research councils, or organizations similar to the UK’s JISC (an independent body that is neither for-profit nor purely governmental, but which exists to support an independent academic infrastructure)?

JPA: We did a survey of journals some years back (3), and I know there have been other studies that corroborate the general finding that the majority of journals in Latin America receive support from their universities, most of which are themselves publicly funded. I believe a lot of it comes as in-kind support from the university (server space, technical staff, etc.). Science councils also play a big role, as they set incentive structures for researchers, write guidelines or define lists of “approved” journals, run special support programs, and sometimes provide financial or technical support directly to individual journals.

MT: When it comes to building infrastructure, or developing a higher international profile, is there a potential advantage in more regionalism? For example, I know that there are attempts to share platforms between countries that have similar cultures, for example, Scandinavian and Baltic countries. Does a shared regional infrastructure make collaboration within the region more likely? Obviously an Ecuadorean researcher is going to be more interested in child obesity in Mexico than in (for example) Lithuania or the US, but do you feel that there is a beneficial regional level of collaboration that has yet to be explored - or should we just push for complete internationalization?

JPA: A shared research interest is only one reason for regionalism. Regional collaboration and a shared regional infrastructure also take advantage of similar economic models, incentive structures, levels of technical expertise, and a shared research culture. The potential is not just increased collaboration in the form of co-authorships, but also in avoiding duplicate efforts and benefiting from economies of scale.

Some great examples of this can be seen in Latin America, including the two major initiatives, SciELO and RedALyC.org. But even there, a lot more could be done. These platforms are taking advantage of economies of scale to increase visibility and are centralizing some of the technical aspects of publishing, but as of yet they still have done little to increase collaboration between scholars, build a network of copy- and layout-editors, share personnel, or otherwise bring together those working in the publishing process.

MT: Do you think it would be sensible to work towards having a regional impact factor (Latin American Impact Factor, African IF, etc.) using journal level analysis (even if not the traditional JIF formulae), or would that risk the ghettoization of developing world publishing?

JPA: I don’t think it makes sense to create regional versions of a journal-level citation metric. The critiques of the IF, including those from the backers of DORA (the Declaration on Research Assessment), would apply equally to each of these instances. Moreover, they would create the same problems I have been describing, but in reverse: they would exclude research published outside the region and therefore penalize researchers who publish locally but are read and cited from outside the region.

SciELO provides citation counts and an IF for journals contained within SciELO (4), but I do not think the metric has been widely used, and it certainly has not supplanted the view that Thomson-Reuters’ IF is the one that “matters.”

The purpose of regional portals has to be to improve quality, gain efficiencies, and increase visibility, not to isolate the regions into systems that are completely decoupled from the rest of the world.

MT: Much of the work in altmetrics falls into two categories at the moment: finding patterns between different social networks (for example Twitter and Mendeley), and looking for the relationship between altmetrics and citation. Needless to say, this work focusses on looking for DOIs and the resolving URLs, and this will obviously exclude any article without a DOI. Furthermore, Impactstory.org has recently adopted Altmetric.com’s Twitter feed, and this has had the effect of removing the ability to look for tweets linking to a non-DOIed article’s URL. What can we - as researchers interested in altmetrics - do to extend the focus of our research to the developing world? To what extent do we need to look at regional variations in platforms (for example, we know that some cultures use Facebook in a more scholarly way than Twitter, and that some countries - most notably China - have a strong cultural or politically mandated preference for their own platforms, e.g., Weibo)? Would the development of local language versions of research tools or a movement towards a community-driven identification of local language blogging and review sites be positive in extending the focus of altmetrics to the developing world?

JPA: As you mention, the dependence on DOIs is by far the most limiting aspect for studying altmetrics in developing regions. Despite CrossRef’s efforts (and to be fair, I do believe they are making a concerted effort), DOIs are still not commonplace everywhere. For many journals, even in medium income countries, the US$1.00 per article fee remains prohibitive. As long as this is the case, and as long as altmetric tools rely on DOIs, it will be impossible to evaluate altmetrics on a large scale for journals running on low budgets.

As I mentioned in my talk at ALM 13 (5), there is a strong parallel between the use of WoS for evaluation and the use of altmetrics dependent on DOIs. If only Tweets to articles with DOIs can be studied, then scholars publishing in venues without DOIs will be once again discounted. An altmetric provider that works for arbitrary URLs is therefore absolutely necessary (funding agencies, tool builders, and altmetric providers: take note!).
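An altmetric provider that “works for arbitrary URLs” would, at minimum, need to collapse superficially different links to the same DOI-less article into one key. A deliberately simple sketch (the journal URL is hypothetical; a real provider would also follow redirects, handle mirrors, and strip tracking parameters):

```python
from urllib.parse import urlsplit

def normalize_article_url(url):
    """Reduce a link to a lowercase host+path key, ignoring scheme,
    a leading 'www.', and a trailing slash, so mentions of the same
    DOI-less article can be aggregated under one identifier."""
    parts = urlsplit(url.strip().lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

# Two links to the same (hypothetical) journal article map to one key:
a = normalize_article_url("http://www.revista.example.org/article/view/123/")
b = normalize_article_url("https://revista.example.org/article/view/123")
print(a == b)  # True
```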

Second, we need studies that look at altmetrics, even in the two ways you describe above, for a set of journals from developing regions, even if we start with those that do have DOIs. The existing studies have almost exclusively focused on well-resourced journals from the global North. It is possible, and even likely, that the patterns are different a) for journals with lower visibility; and b) where the use of social Web tools is different (as you allude to above). The focus on journals from publishers like Nature and PLOS sets expectations and guides the research agenda on altmetrics.

With such studies, we would at least know the levels of penetration in the currently studied platforms, and to what extent they differ between journals. I think you are right that consultations with scholars from other parts of the world may turn up other sources that are useful for other communities.

I should mention that these issues are important enough to me that they are the focus of my dissertation work. With the help of SciELO, RedALyC, and Altmetric.com, I am studying download, citation, and altmetrics data for Latin American journals. Euan Adie from Altmetric.com has been kind enough to provide special handling for a set of URLs, so that it is possible to have altmetrics on those, even if they do not have DOIs. I will be releasing some preliminary results soon (stay tuned to my Twitter feed, @juancommander). I hope to reveal some of the ways in which altmetrics vary between contexts, and open new lines of research into these alternative metrics.

MT: How can international organizations - whether not-for-profits, like CrossRef, Orcid, PLOS, or commercial companies such as Thomson-Reuters, Elsevier or Altmetric.com - work with the developing world so they can increase their visibility and access to global infrastructure, while permitting their regional and national characteristics to thrive?

JPA: Those aiming to improve scholarly communications, including the international organizations you mention, must remember that access to the scholarly communication infrastructure is often not a technological limitation. Much of the time it is other factors, such as an editorial decision on the part of Thomson-Reuters or Elsevier that prevents a journal from being indexed, or a lack of finances that limits the use of DOIs.

Giving access to the existing infrastructure is a first step, but it is not enough. The next step, if we take our global/international commitment seriously, is to be willing to make changes to that infrastructure: a) by being as acutely aware as possible of the ways in which scholars from developing regions are disadvantaged by the existing models and tools; and b) by consulting and actively engaging with scholar communities in developing regions.

MT: Juan, thanks for taking the time to answer my questions. Perhaps you’d be kind enough to write a piece on some of your findings for a later issue of RT!

JPA: Thank you for your interest, and thank you for posing questions that gave me the opportunity to talk about issues that are important to me.

References

(1) http://blogs.lse.ac.uk/impactofsocialsciences/2014/03/10/altmetrics-for-developing-regions
 
(2) http://blog.scielo.org/en/2013/07/15/38/#.UzXnUV4tXVI
 
(3)http://www.itidjournal.org/index.php/itid/article/download/639/274
 
(4) http://www.scielo.org/applications/scielo-org/php/citations.php
 
(5) https://speakerdeck.com/jalperin/altmetrics-propagating-global-inequality
VN:F [1.9.22_1171]
Rating: 0.0/10 (0 votes cast)

Juan Pablo Alperin (@juancommander) is a PhD candidate in the Stanford Graduate School of Education and a researcher and systems developer with the Public Knowledge Project (PKP). Juan leads several research and development projects on improving the quality, impact, and reach of Latin American research, and is currently studying the alternative and public impact of open access.

Mike Taylor is a researcher with Elsevier Labs.

MT: Juan, I heard you speak last year at the PLOS Article Level Metrics workshop in San Francisco. You gave a very powerful presentation on some of the problems facing researchers and journals based in the developing world. In particular, I was struck by your observation that when the developing world decides to innovate the use of things that we take for granted - for example the Impact Factor or DOIs (Digital Object Identifiers) - we effectively exclude many researchers who don’t have access. In your recent blog posting (1), you state that only 4% of Latin American journals are indexed by Web of Science (WoS), and that it’s argued that the excluded journals don’t fall into the “mainstream” of science. To what extent do you feel that the category of mainstream is defined by access to technology?

JPA: I do not think that "mainstream science" is itself defined by access to technology. Scholarship is a networked process, which naturally lends itself well to a core-periphery framing. It is not my preferred characterization, but one that is arguably a reality. That is, if we were to network all the literature or form a network of all those contributing to scholarship, we may be able to identify that there is, in fact, a core which could be said to be the “mainstream”.

What has been achieved through technology is to demarcate what should be considered for inclusion in that overall network; for example, if your articles are contained in an abstract and indexing database such as Scopus or WoS, then your work can be entered into citation analysis and therefore be considered part of the mainstream. To make matters worse for those that lack access, technology provides a means of exclusion that appears to be democratic and objective, but is actually far from being either.

This is not to say that technology cannot also be used for eliminating boundaries. Google Scholar is an example that offers results from small, independent journals next to those from large commercial publishers in a way that blurs the distinction between the two. It is not uncommon to find technology optimists who think that all technologies are equally unifying. The reality, however, is that access to technology can just as easily foment a false dichotomy, creating two classes of scholars (those that have access and those that do not), with the consequence that the scholarship of those in the latter group is perceived to be inherently less valuable.

MT: At a conference in Mexico recently, I heard a speech by Abel Packer of SciELO Brazil (2) on the threat that emerging mega-journals may pose to local research journals. In short, the argument was that because these new platforms are more attractive to researchers (they provide international visibility and access to DOIs, the JIF, etc., whilst frequently being able to waive fees), the inevitable migration will lead to a decrease in the use of local journals. As these become less popular and less attractive to authors (particularly those writing in English), the potential loss of local journals will mean losing a valuable part of the academic infrastructure - for example, editorial boards, peer review, conferences and workshops. Do you share this concern, or is the gradual death of local publishing inevitable? What do local journals have to offer that mega-journals do not?

JPA: Local, institutional, and student journals serve as an important learning ground for novice scholars to learn the ropes about communicating scholarship and, as you mention, they play a critical role in the research infrastructure. Their demise would be tragic: it would weaken research culture, yield more of the research agenda to those running mega-journals, and eliminate the necessary stepping stones for scholars to improve their research communication to the standards of their international peers. Given their critical importance, yes, I do worry about their decline.

However, I do not think it is inevitable or even imminent, at least not in Latin America, although there is definitely a risk. The funding model in Latin America has been very different than in the North. Currently, APCs (article processing charges) are virtually non-existent and most journals are funded through public funds (primarily funds channeled through public universities). So far, government agencies have been reluctant to shift financing from local journals to APCs, and I hope it remains this way. Unlike subscriptions or APCs, the current financial model in Latin America excludes neither reader nor writer. That said, if the APC model becomes the only model for Open Access elsewhere, it may begin to take hold within Latin America. If that happens, then international mega-journals will likely end up killing the local journals.

MT: I’m curious on the independence of this form of funding in Latin America - the extent to which it’s subject to governmental policy or not. Generally speaking, do the funds that support journals come directly from Government, or are there intermediate bodies - research councils, or organizations similar to the UK’s JISC (an independent body that is neither for-profit nor purely governmental, but which exists to support an independent academic infrastructure)?

JPA: We did a survey of journals some years back (3), and I know there have been other studies that corroborate the general finding that the majority of journals in Latin America receive support from their university, most of which are themselves publicly funded. I believe a lot of it comes as in-kind support from the university (server space, technical staff, etc.). Science councils also play a big role, as they set incentive structures for researchers, write guidelines or define lists of “approved” journals, run special support programs, and sometimes provide financial or technical support directly to individual journals.

MT: When it comes to building infrastructure, or developing a higher international profile, is there a potential advantage in more regionalism? For example, I know that there are attempts to share platforms between countries that have similar cultures, for example, Scandinavian and Baltic countries. Does a shared regional infrastructure make collaboration within the region more likely? Obviously an Ecuadorean researcher is going to be more interested in child obesity in Mexico than in (for example) Lithuania or the US, but do you feel that there is a beneficial regional level of collaboration that has yet to be explored - or should we just push for complete internationalization?

JPA: A shared research interest is only one reason for regionalism. Regional collaboration and a shared regional infrastructure also take advantage of similar economic models, incentive structures, levels of technical expertise, and a shared research culture. The potential is not just increased collaboration in the form of co-authorships, but also in avoiding duplicate efforts and benefiting from economies of scale.

Some great examples of this can be seen in Latin America, including the two major initiatives, SciELO and RedALyC.org. But even there, a lot more could be done. These platforms are taking advantage of economies of scale to increase visibility and are centralizing some of the technical aspects of publishing, but as of yet they still have done little to increase collaboration between scholars, build a network of copy- and layout-editors, share personnel, or otherwise bring together those working in the publishing process.

MT: Do you think it would be sensible to work towards having a regional impact factor (Latin American Impact Factor, African IF, etc.) using journal level analysis (even if not the traditional JIF formulae), or would that risk the ghettoization of developing world publishing?

JPA: I don’t think it makes sense to create regional versions of a journal-level citation metric. I think the critiques of the IF, including those made by the backers of DORA (the Declaration on Research Assessment), would apply equally to each of these instances. Moreover, they would create the same problems I have been describing, but in reverse: they would exclude research published outside of the region and therefore penalize researchers who are publishing locally but are being read and cited from outside the region.

SciELO provides citation counts and an IF for journals contained within SciELO (4), but I do not think the metric has been widely used, and it certainly has not supplanted the view that Thomson-Reuters’ IF is the one that “matters.”

The purpose of regional portals has to be to improve quality, gain efficiencies, and increase visibility, not to isolate the regions into systems that are completely decoupled from the rest of the world.

MT: Much of the work in altmetrics falls into two categories at the moment: finding patterns between different social networks (for example Twitter and Mendeley), and looking for the relationship between altmetrics and citation. Needless to say, this work focusses on looking for DOIs and the resolving URLs, and this will obviously exclude any article without a DOI. Furthermore, Impactstory.org has recently adopted Altmetric.com’s Twitter feed, and this has had the effect of removing the ability to look for tweets linking to a non-DOIed article’s URL. What can we - as researchers interested in altmetrics - do to extend the focus of our research to the developing world? To what extent do we need to look at regional variations in platforms (for example, we know that some cultures use Facebook in a more scholarly way than Twitter, and that some countries - most notably China - have a strong cultural or politically mandated preference for their own platforms, e.g., Weibo)? Would the development of local language versions of research tools or a movement towards a community-driven identification of local language blogging and review sites be positive in extending the focus of altmetrics to the developing world?

JPA: As you mention, the dependence on DOIs is by far the most limiting aspect for studying altmetrics in developing regions. Despite CrossRef’s efforts (and to be fair, I do believe they are making a concerted effort), DOIs are still not commonplace everywhere. For many journals, even in medium income countries, the US$1.00 per article fee remains prohibitive. As long as this is the case, and as long as altmetric tools rely on DOIs, it will be impossible to evaluate altmetrics on a large scale for journals running on low budgets.

As I mentioned in my talk at ALM 13 (5), there is a strong parallel between the use of WoS for evaluation and the use of altmetrics dependent on DOIs. If only Tweets to articles with DOIs can be studied, then scholars publishing in venues without DOIs will be once again discounted. An altmetric provider that works for arbitrary URLs is therefore absolutely necessary (funding agencies, tool builders, and altmetric providers: take note!).

Second, we need studies that look at altmetrics, even in the two ways you describe above, for a set of journals from developing regions, even if we start with those that do have DOIs. The existing studies have almost exclusively focused on well-resourced journals from the global North. It is possible, and even likely, that the patterns are different a) for journals with lower visibility; and b) where the use of social Web tools is different (as you allude to above). The focus on journals from publishers like Nature and PLOS sets expectations and guides the research agenda on altmetrics.

With such studies, we would at least know the levels of penetration in the currently studied platforms, and to what extent they differ between journals. I think you are right that consultations with scholars from other parts of the world may turn up other sources that are useful for other communities.

I should mention that these issues are important enough to me that they are the focus of my dissertation work. With the help of SciELO, RedALyC, and Altmetric.com, I am studying download, citation, and altmetrics data for Latin American journals. Euan Adie from Altmetric.com has been kind enough to provide special handling for a set of URLs, so that it is possible to have altmetrics on those, even if they do not have DOIs. I will be releasing some preliminary results soon (stay tuned to my Twitter feed, @juancommander). I hope to reveal some of the ways in which altmetrics vary between contexts, and open new lines of research into these alternative metrics.
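[Editorial aside: the URL-based tracking Alperin calls for above rests largely on careful URL normalization, so that superficially different links to the same non-DOI article can be matched. A minimal sketch in Python; the journal domain is invented for illustration:]

```python
from urllib.parse import urlparse, urlunparse

def canonicalize(url: str) -> str:
    """Reduce an article URL to a canonical form so that superficially
    different links to the same page compare equal."""
    parts = urlparse(url.strip())
    netloc = parts.netloc.lower()     # hostnames are case-insensitive
    path = parts.path.rstrip("/")     # trailing slash is cosmetic
    # Drop query string and fragment: they usually carry session or
    # campaign parameters (utm_*), not the identity of the article.
    return urlunparse(("http", netloc, path, "", "", ""))

# Two tweets linking the same (hypothetical) article now match:
a = canonicalize("https://revista.example.edu/article/view/123/")
b = canonicalize("http://revista.example.edu/article/view/123?utm_source=twitter")
assert a == b == "http://revista.example.edu/article/view/123"
```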

MT: How can international organizations - whether not-for-profits, like CrossRef, Orcid, PLOS, or commercial companies such as Thomson-Reuters, Elsevier or Altmetric.com - work with the developing world so they can increase their visibility and access to global infrastructure, while permitting their regional and national characteristics to thrive?

JPA: Those aiming to improve scholarly communications, including those international organizations you mention, must remember that access to the scholarly communication infrastructure is often not a technological limitation. Much of the time, it is other factors, such as an editorial decision on part of Thomson-Reuters and Elsevier that prevents a journal from being indexed, or a lack of finances that limits the use of DOIs.

Giving access to the existing infrastructure is a first step, but it is not enough. The next step, if we take our global/international commitment seriously, is to be willing to make changes to that infrastructure: a) by being as acutely aware as possible of the ways in which scholars from developing regions are disadvantaged by the existing models and tools; and b) by consulting and actively engaging with scholar communities in developing regions.

MT: Juan, thanks for taking the time to answer my questions. Perhaps you’d be kind enough to write a piece on some of your findings for a later issue of RT!

JPA: Thank you for your interest, and thank you for posing questions that gave me the opportunity to talk about issues that are important to me.

References

(1) http://blogs.lse.ac.uk/impactofsocialsciences/2014/03/10/altmetrics-for-developing-regions
 
(2) http://blog.scielo.org/en/2013/07/15/38/#.UzXnUV4tXVI
 
(3) http://www.itidjournal.org/index.php/itid/article/download/639/274
 
(4) http://www.scielo.org/applications/scielo-org/php/citations.php
 
(5) https://speakerdeck.com/jalperin/altmetrics-propagating-global-inequality

The grey literature from an altmetrics perspective – opportunity and challenges

Euan Adie describes the great opportunities grey literature provide for alternative metrics, and the social and technical challenges that are involved.



The field of altmetrics encompasses both alternative metrics (data beyond citation counts or impact factors) and alternative research outputs (like datasets and software).

But some material falls into both camps.

Grey literature - theses, posters, preprints, patents, policy documents and the like - is created by researchers and informed by research, but isn’t usually viewed as a first-class citizen of the scholarly literature. These documents are not all tracked in citation indexes like Web of Science or Scopus and can be difficult to cite in academic journals, with some editors discouraging any formal citation of preprints and similar types of document. For example, the Oxford Journals author guidelines (1) state that the reference section must not include manuscripts not formally accepted for publication, e.g. preprints. There can be good reasons for this, which we’ll explore later in the article.

The term ‘grey literature’ comes from these documents’ position in the fuzzy grey area between academic and popular literature (2). Importantly, they are resources that aren’t typically controlled by academic publishers, traditionally the gatekeepers of the scholarly record. Publishers generally take this role seriously, and there is an established technical infrastructure as well as standard processes to support them in doing so. It is reasonable nowadays to expect the majority of publishers to belong to an ethics program like COPE, to assign unique and persistent identifiers like DOIs, and to participate in long-term archiving projects like CLOCKSS (Controlled LOCKSS), a not-for-profit joint venture between academic publishers and research libraries with the ambition of developing a sustainable, geographically distributed dark archive to ensure the long-term survival of Web-based scholarly publications (www.clockss.org).

No such infrastructure or processes exist for grey literature. That is part of the appeal: you can upload a preprint or present a poster without going through a lengthy peer review, typesetting and publishing process, or publish a report without going through an intermediary. Unfortunately, it is also a hindrance to anybody trying to mine or analyze these documents. Analyzing them is exactly what altmetrics initiatives should be trying to do, because policy documents and patents are potentially very interesting indicators of impact beyond the purely scholarly.

 

The opportunity

It’s not hard to imagine some use cases illustrating why altmetrics groups might want to get a handle on the subject:

  • If my research is on the economic impact of river flooding then citations in other journals aren’t the only thing that’s important to me. I want to be kept aware of government policy that cites my work, too.
  • If my work is referenced by a patent in a completely different field, I’d like to know about it.
  • When looking at research outputs of my department, I don’t just care about peer-reviewed research in journals, but patents, reports and policy documents too.

Being cited as evidence in a government policy report isn't impact in and of itself - perhaps the report will be locked in a filing cabinet and never acted on. It is still a valuable indicator, though, and one that's not easily obtainable anywhere else. It is not unusual for even the authors of a paper to be unaware of everywhere their work has turned up.

 

The challenges

Discovering what research the grey literature cites is just one potential opportunity to enrich impact analysis, but the challenges are fairly formidable. We’ve spent a lot of time and effort building up systems to track, parse and analyze policy documents and patents, and some of the more interesting challenges we’ve faced are:

  • Identifying relevant documents
  • Extracting metadata & references
  • Permanence

Let’s look at each one in turn.

Identifying relevant documents

The first challenge to mining grey literature is simply to find it.

It is a publisher’s job (at least traditionally) to disseminate research, and there is a well-established ecosystem of discovery tools and indexing services to help individuals find and access scholarly literature that is relevant to them.

There is no such ecosystem for the grey literature, though valuable initiatives like greylit.org can give researchers a head start.

Without knowing even how much grey literature material is created each year, let alone by whom, it is difficult to estimate how complete any index you might build actually is.

Extracting metadata & references

Once relevant documents are found, you ideally want to associate basic bibliographic metadata with them – a title, some authors, a publication date.

Central databases like CrossRef or PubMed can help do this for traditional literature, returning bibliographic records originally supplied by the publisher when queried by a unique identifier.
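By way of illustration, CrossRef’s public REST API (api.crossref.org) returns such a record as JSON when queried with a DOI. The sketch below builds the query URL and picks the basic fields out of a response; the sample record is invented (though shaped like the real payload), and error handling is omitted:

```python
from urllib.parse import quote

def works_url(doi: str) -> str:
    """Build a CrossRef REST API query for one DOI; fetch it with any
    HTTP client, e.g. urllib.request.urlopen(works_url(doi))."""
    return "https://api.crossref.org/works/" + quote(doi, safe="")

def brief_record(message: dict) -> dict:
    """Extract basic bibliographic fields from the 'message' object
    of a CrossRef works response."""
    return {
        "title": (message.get("title") or [""])[0],
        "authors": [" ".join(filter(None, (a.get("given"), a.get("family"))))
                    for a in message.get("author", [])],
        "year": message.get("issued", {}).get("date-parts", [[None]])[0][0],
    }

# Invented response fragment, shaped like the real API's payload:
sample = {"title": ["An example article"],
          "author": [{"given": "Ada", "family": "Lovelace"}],
          "issued": {"date-parts": [[2014, 3]]}}
```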

Policy documents, to take one example, have no such canonical metadata available and they have often been published online in ways that make automatic metadata extraction impractical. A government report may be provided only as a typeset PDF, with the title and authors (if mentioned at all) in a graphic on the first page.

For the purposes of altmetrics we are interested in the research that documents cite, and common practice in scholarly articles is to keep these to a single references section.

There is often no such common practice for grey literature, where references can be in figure captions, in footnotes, tables, or separate appendices to name but a few common scenarios. Furthermore, without manual curation it is hard to figure out what’s a citation at all in the traditional sense of the word: we have come across medical guidelines that explicitly list out papers that may have seemed relevant but were not used in any way.
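When references are scattered like this, one pragmatic fallback is to scan a document’s full text for DOI strings. A sketch using a commonly recommended pattern for modern DOIs (it deliberately over-matches trailing punctuation, which is trimmed afterwards):

```python
import re

# Pattern widely recommended for matching modern DOIs in free text.
DOI_RE = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

def find_dois(text: str) -> list:
    """Return DOI-like strings found anywhere in a document's text,
    with common trailing punctuation stripped."""
    return [m.group(0).rstrip(".,;)") for m in DOI_RE.finditer(text)]

caption = ("Figure 2 adapted from Geim & Novoselov "
           "(doi:10.1038/nmat1849); see also footnote 3.")
assert find_dois(caption) == ["10.1038/nmat1849"]
```

This recovers citations even from figure captions and footnotes, though it still misses grey-literature references given without any identifier at all.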

Permanence

A core principle shared by most altmetrics groups is that the raw data that any numbers or assertions are based on should be available to the end user.

So if we are to report that a particular policy document links to a paper then we need to make sure that users can get to that policy document.

This leads to a couple of classic online publishing problems: firstly, will we always be able to find the document again in the place we found it; and secondly, will the document remain available online at all?

There is nothing to stop an NGO or government agency from redesigning its website, shifting its online publications to a different part of the site and breaking all our links. There is also no ‘dark archive’ of documents to ensure that we will always have a copy even if the group that originally created it ceases to exist.

How does the grey literature fit in with other altmetrics?

One oft-mentioned advantage of altmetrics indicators is that they are usually high volume and quick to accrue, with the first data being collected within hours of publication instead of months as is usually the case with citations.

Citations to papers from policy documents buck this trend: anecdotally, we have seen that most of the biomedical papers referenced are five or more years older than the policy document itself - even slower than you might expect from the traditional literature.

It is possible to imagine the attention paid to at least biomedical research on a continuum (see table 1).

Within hours - Altmetrics: first mention on social media
Within days - Altmetrics: first pickups on blogs & in news outlets
Within months - Bibliometrics: first citations in the rest of the literature
Within years - Altmetrics: first appearances in policy documents

Table 1: the attention potentially paid to research on a continuum

Why might citations from policy documents only appear years after a paper is published? We don’t know, though it would be interesting to find out. One possibility is that it takes a long time for some types of policy document or report to actually get published, so the citations are to papers that may have actually been relatively new when the authors were still discussing whatever issue the document is addressing.

How could we improve things?

The flexibility of grey literature is a strength but also a weakness. The grey literature lacks many of the important pieces of infrastructure and best practices used by academic publishers.

Might it be possible to pull over some of the good things from academic publishing workflows, without losing too many of the benefits of occasionally being able to opt out of scholarly publishing processes?

A few key changes to the way grey literature is produced would make life much easier for anybody interested in the altmetrics that they might provide, though these must be balanced with the needs of creators who may have little interest in metrics of any kind and so lack the motivation to support change.

Use of persistent identifiers

Use of something like the Handle System (in which resources are assigned a unique identifier that can be resolved to a URL by the creator) would help ensure that groups can track documents even if they move around the internet.
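As a sketch of how this works in practice (the handle and URLs below are invented for illustration): a handle resolves through the global hdl.handle.net proxy, and the proxy’s REST API exposes the current URL value, which the creator can repoint when a document moves without the identifier itself changing:

```python
def resolution_urls(handle: str) -> tuple:
    """Landing-page and REST API URLs for a handle, via the global
    hdl.handle.net proxy."""
    return (f"https://hdl.handle.net/{handle}",
            f"https://hdl.handle.net/api/handles/{handle}")

def current_location(api_response: dict):
    """Pick the URL value out of a handle-proxy API response; this is
    the piece the document's owner updates when the file moves."""
    for value in api_response.get("values", []):
        if value.get("type") == "URL":
            return value["data"]["value"]
    return None

# Invented handle and response fragment, shaped like the proxy's JSON:
landing, api = resolution_urls("20.500.12345/report-678")
sample = {"values": [{"index": 1, "type": "URL",
                      "data": {"format": "string",
                               "value": "https://agency.example.org/2014/flood-report.pdf"}}]}
```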

Minimum standards of metadata

How best to add basic metadata to scholarly PDFs and web pages is a problem that publishers solved long ago. PRISM (Publishing Requirements for Industry Standard Metadata) is a publisher-driven initiative to agree on a standard set of metadata for academic publications (see IDEAlliance for more detail). Dublin Core is a broad set of standard metadata terms that can be applied to documents, videos, images and other resources. Both provide standard ontologies; in PDFs these can be inserted using authoring tools or, after creation, using XMP, which is a standard way of adding metadata to images and PDFs. On web pages the publishing industry has settled on <meta> tags, not least because for many journals these are a prerequisite for indexing by Google Scholar.
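A sketch of what such minimum metadata could look like for a hypothetical policy report: the citation_* meta names that Google Scholar indexes, emitted alongside their Dublin Core equivalents:

```python
import html

def meta_tags(title: str, authors: list, date: str) -> str:
    """Emit <meta> tags in the two conventions mentioned above:
    the citation_* names Google Scholar indexes, and Dublin Core."""
    pairs = [("citation_title", title), ("DC.title", title)]
    for author in authors:
        pairs += [("citation_author", author), ("DC.creator", author)]
    pairs += [("citation_publication_date", date), ("DC.date", date)]
    return "\n".join(
        f'<meta name="{name}" content="{html.escape(value, quote=True)}">'
        for name, value in pairs)

tags = meta_tags("Economic impact of river flooding", ["A. N. Author"], "2014/04")
```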

An index of the grey literature

An open, central index of scholarly grey literature that enforced a minimum level of metadata for each item could make searching and linking documents much easier for tool makers and help the groups authoring them with discoverability (as users would have one place to look for relevant documents) and attribution.

An alternative would be to maintain a central index of grey literature repositories – the websites of each group authoring them, perhaps – and to allow harvesting from each through a standard like OAI-PMH (Open Archives Initiatives – Protocol for Metadata Harvesting), already well adopted by institutional repositories and open access publishers.

This would allow third parties to independently provide centralized tools to search or preserve content held on each group’s website, making it easier to track and discover documents.
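A sketch of what harvesting from such an index involves (the repository endpoint is invented): an OAI-PMH harvester issues ListRecords requests and pages through the repository using the resumptionToken each response returns:

```python
from urllib.parse import urlencode

def list_records_url(base: str, metadata_prefix: str = "oai_dc",
                     resumption_token: str = None) -> str:
    """Build one OAI-PMH ListRecords request. Per the protocol, a
    resumptionToken is an exclusive argument, so it replaces the
    metadataPrefix on follow-up requests."""
    params = {"verb": "ListRecords"}
    if resumption_token:
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
    return base + "?" + urlencode(params)

first = list_records_url("https://repository.example.org/oai")
# first == "https://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc"
```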

Conclusion

The grey literature presents great opportunities for alternative metrics, providing data and indicators that cannot be found anywhere else.

Those opportunities come with great challenges, both social and technical. To work with grey literature, tools need some basic infrastructure to be put in place, but is this something that authors really want or will it compromise the advantages of publishing grey literature in the first place?

 

References

(1) http://www.oxfordjournals.org/our_journals/molbev/general_author_guidelines.html#References

(2) Auger, C.P., Ed. (1989) Information Sources in Grey Literature (2nd ed.). London: Bowker-Saur. ISBN 0862918715.

 

 

VN:F [1.9.22_1171]
Rating: 0.0/10 (0 votes cast)

The field of altmetrics encompasses both alternative metrics (data beyond citation counts or impact factors) and alternative research outputs (like datasets and software).

But some material falls into both camps.

Grey literature - theses, posters, preprints, patents and policy documents and similar - are created by researchers and informed by research but aren’t usually viewed as first class citizens of the scholarly literature. They are not all tracked in citation indexes like Web of Science or Scopus and can be difficult to cite in academic journals, with some editors discouraging any formal citation of preprints and similar types of document. For example, the Oxford Journals author guidelines (1) states that the reference section must not include manuscripts not formally accepted for publication, e.g. preprints. There can be good reasons for this, which we’ll explore further later in the article.

The term ‘grey literature’ comes from their position in the fuzzy grey area between academic and popular literature (2). Importantly they are resources that aren't typically controlled by academic publishers, traditionally the gatekeepers of the scholarly record. Publishers generally take this role seriously, and there is an established technical infrastructure as well as standard processes to support them doing so. It is reasonable nowadays to expect the majority of publishers to belong to an ethics program like COPE, to assign unique and persistent identifiers like DOIs and to participate in long term archiving projects like CLOCKSS (Controlled LOCKSS).  This is a not-for-profit joint venture between the academic publishers and research libraries with the ambition of developing a sustainable, geographically distributed dark archive to ensure the long-term survival of Web-based scholarly publications (www.clockss.org).

No such infrastructure or processes exist for grey literature. That is part of their appeal: you can upload a preprint or present a poster without having to go through a lengthy peer review, typesetting and publishing process, or publish a report without having to go through an intermediary. It is unfortunately also a hindrance to anybody trying to mine or analyze them. Analyzing them is exactly what altmetrics initiatives should be trying to do, because policy documents and patents are potentially very interesting indicators of impact beyond scholarly impact.

 

The opportunity

It’s not hard to imagine some use cases illustrating why altmetrics groups might want to get a handle on the subject:

  • If my research is on the economic impact of river flooding then citations in other journals aren’t the only thing that’s important to me. I want to be kept aware of government policy that cites my work, too.
  • If my work is referenced by a patent in a completely different field, I’d like to know about it.
  • When looking at research outputs of my department, I don’t just care about peer-reviewed research in journals, but patents, reports and policy documents too.

Being cited as evidence in a government policy report isn't impact in and of itself - perhaps the report will be locked in a filing cabinet and never acted on. It is still a valuable indicator, though, that's not easily obtainable anywhere else. It’s not unusual for even the authors of a paper to not know about everywhere that their work has turned up.

 

The challenges

Discovering what research the grey literature material cites is just one potential opportunity to enrich impact analysis, but the challenges are fairly formidable. We’ve spent a lot of time and effort on building up systems to track, parse and analyze policy documents and patents and some of the more interesting challenges we’ve faced are:

  • Identifying relevant documents
  • Extracting metadata & references
  • Permanence

Let’s look at each one in turn.

Identifying relevant documents

The first challenge to mining grey literature is simply to find it.

It is a publisher’s job (at least traditionally) to disseminate research, and there is a well-established ecosystem of discovery tools and indexing services to help individuals find and access scholarly literature that is relevant to them.

There is no such ecosystem for the grey literature, though valuable initiatives like greylit.org can give researchers a head start.

Without knowing even how much grey literature material is created each year, let alone by whom, it is difficult to make assumptions about how complete any index you may build is.

Extracting metadata & references

Once relevant documents are found, you ideally want to associate basic bibliographic metadata with them – a title, some authors, a publication date.

Central databases like CrossRef or PubMed can help do this for traditional literature, returning bibliographic records originally supplied by the publisher when queried by a unique identifier.

Policy documents, to take one example, have no such canonical metadata available and they have often been published online in ways that make automatic metadata extraction impractical. A government report may be provided only as a typeset PDF, with the title and authors (if mentioned at all) in a graphic on the first page.

For the purposes of altmetrics we are interested in the research that documents cite, and common practice in scholarly articles is to keep these to a single references section.

There is often no such common practice for grey literature, where references can appear in figure captions, footnotes, tables, or separate appendices, to name but a few common scenarios. Furthermore, without manual curation it is hard to determine what counts as a citation in the traditional sense of the word: we have come across medical guidelines that explicitly list papers that may have seemed relevant but were not used in any way.
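When references are scattered through captions, footnotes and appendices, one pragmatic fallback is to scan the entire document text for DOI-shaped strings rather than trying to locate a references section. The sketch below is illustrative only – the regular expression follows the common DOI pattern, and the sample text is invented:

```python
import re

# Grey literature rarely keeps references in one section, so scan the
# *whole* document text for DOI-shaped strings, wherever they appear.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[-._;()/:a-zA-Z0-9]+')

def extract_dois(text):
    """Return unique DOI-like strings found anywhere in the text."""
    seen = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip('.,;)')  # strip trailing punctuation
        if doi not in seen:
            seen.append(doi)
    return seen

sample = (
    "See Figure 3 (data from doi:10.1371/journal.pone.0006022). "
    "Footnote 12: Smith et al., 10.1000/example.ref; accessed 2014."
)
print(extract_dois(sample))
# ['10.1371/journal.pone.0006022', '10.1000/example.ref']
```

Even this only finds references that carry a DOI; the medical-guidelines case above, where listed papers were never actually used, still needs human judgement.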

Permanence

A core principle shared by most altmetrics groups is that the raw data that any numbers or assertions are based on should be available to the end user.

So if we are to report that a particular policy document links to a paper then we need to make sure that users can get to that policy document.

This leads to a couple of classic online publishing problems: firstly, will we always be able to find the document again in the place where we found it, and secondly, will the document always be available online?

There is nothing to stop an NGO or government agency from redesigning its website, shifting its online publications to a different part of the site and breaking all our links. There is also no ‘dark archive’ of documents to ensure that we will always have a copy even if the group that originally created it ceases to exist.

How does the grey literature fit in with other altmetrics?

One oft-mentioned advantage of altmetrics indicators is that they are usually high volume and quick to accrue, with the first data being collected within hours of publication instead of months as is usually the case with citations.

Citations to papers from policy documents buck this trend: anecdotally, we have seen that most of the biomedical papers referenced are five or more years older than the policy document itself – accrual that is even slower than you might expect from the traditional literature.

It is possible to imagine the attention paid to at least biomedical research on a continuum (see table 1).

  • Within hours – Altmetrics: first mention on social media
  • Within days – Altmetrics: first pickups on blogs & in news outlets
  • Within months – Bibliometrics: first citations in the rest of the literature
  • Within years – Altmetrics: first appearances in policy documents

Table 1: the attention potentially paid to research on a continuum

Why might citations from policy documents only appear years after a paper is published? We don't know, though it would be interesting to find out. One possibility is that some types of policy document or report simply take a long time to get published, so the papers cited may actually have been relatively new when the authors were still discussing the issue the document addresses.

How could we improve things?

The flexibility of grey literature is a strength but also a weakness. The grey literature lacks many of the important pieces of infrastructure and best practices used by academic publishers.

Might it be possible to pull over some of the good things from academic publishing workflows, without losing too many of the benefits of occasionally being able to opt out of scholarly publishing processes?

A few key changes to the way grey literature is produced would make life much easier for anybody interested in the altmetrics that they might provide, though these must be balanced with the needs of creators who may have little interest in metrics of any kind and so lack the motivation to support change.

Use of persistent identifiers

Use of something like the Handle System (in which resources are assigned a unique identifier that can be resolved to a URL by the creator) would help ensure that groups can track documents even if they move around the internet.
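The value of a Handle-style identifier is the level of indirection it adds: everyone links to the identifier, and when a document moves only the creator's registry entry changes. A minimal sketch of that design, with a plain dict standing in for a real handle server and an invented handle prefix and URLs:

```python
# Minimal sketch of the indirection a Handle-style identifier provides.
# The registry dict stands in for a real handle server; the prefix
# "20.500.9999" and the URLs are invented for the example.
registry = {
    "20.500.9999/flood-report-2014": "https://gov.example/pubs/flood2014.pdf"
}

def resolve(handle):
    """Look up the current location of a document by its handle."""
    return registry[handle]

# Everyone links to the handle, never to the URL directly:
link = "20.500.9999/flood-report-2014"
assert resolve(link).endswith("flood2014.pdf")

# The agency redesigns its site: only the registry entry changes,
# and every existing link keeps working.
registry["20.500.9999/flood-report-2014"] = (
    "https://gov.example/library/2014/flood-risk.pdf"
)
assert resolve(link).endswith("flood-risk.pdf")
```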

Minimum standards of metadata

Adding basic metadata to scholarly PDFs and web pages is a problem that publishers solved long ago. PRISM (Publishing Requirements for Industry Standard Metadata) is a publisher-driven initiative to agree on a standard set of metadata for academic publications (see the IDEAlliance website for more detail). Dublin Core is a broad set of standard metadata terms that can be applied to documents, videos, images and other resources. Both provide standard ontologies; in PDFs these can be inserted using authoring tools or, after creation, using XMP, a standard way of embedding metadata in images and PDFs. On web pages the publishing industry has settled on <meta> tags, not least because for many journals this is a prerequisite for indexing by Google Scholar.
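If grey literature publishers adopted the same conventions, tool makers could read metadata mechanically rather than scraping it from a typeset first page. A sketch, using only Python's standard library: the "DC." tag names follow the Dublin Core convention, while the sample HTML itself is invented:

```python
from html.parser import HTMLParser

# Illustrative sketch: reading Dublin Core-style <meta> tags from a
# web page. "DC.title" etc. follow the Dublin Core convention; the
# sample document below is invented for the example.
class MetaTagParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'meta':
            return
        attrs = dict(attrs)
        name, content = attrs.get('name', ''), attrs.get('content')
        if name.startswith(('DC.', 'citation_')) and content:
            self.metadata.setdefault(name, []).append(content)

html = """
<html><head>
  <meta name="DC.title" content="Flood Risk Management Report 2014">
  <meta name="DC.creator" content="Environment Agency">
  <meta name="DC.date" content="2014-03-01">
</head><body>...</body></html>
"""
parser = MetaTagParser()
parser.feed(html)
print(parser.metadata['DC.title'])
# ['Flood Risk Management Report 2014']
```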

An index of the grey literature

An open, central index of scholarly grey literature that enforced a minimum level of metadata for each item could make searching and linking documents much easier for tool makers and help the groups authoring them with discoverability (as users would have one place to look for relevant documents) and attribution.

An alternative would be to maintain a central index of grey literature repositories – the websites of each group authoring them, perhaps – and to allow harvesting from each through a standard like OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting), which is already well adopted by institutional repositories and open access publishers.

This would allow third parties to independently provide centralized tools to search or preserve content held on each group’s website, making it easier to track and discover documents.
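To make the harvesting idea concrete, this sketch builds the HTTP requests an OAI-PMH harvester issues. The verb and parameters (ListRecords, metadataPrefix=oai_dc, resumptionToken) are part of the OAI-PMH 2.0 protocol; the repository URL is invented:

```python
from urllib.parse import urlencode

# Build the request URLs an OAI-PMH harvester would issue against a
# repository. "ListRecords", "metadataPrefix=oai_dc" and
# "resumptionToken" come from the OAI-PMH 2.0 protocol itself.
def list_records_url(base_url, resumption_token=None):
    if resumption_token:
        # Continuation requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)

print(list_records_url("https://repository.example/oai"))
# https://repository.example/oai?verb=ListRecords&metadataPrefix=oai_dc
```

A harvester repeats the continuation request with each resumptionToken the repository returns until none is supplied, at which point the full record set has been fetched.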

Conclusion

The grey literature presents great opportunities for alternative metrics, providing data and indicators that cannot be found anywhere else.

Those opportunities come with great challenges, both social and technical. To work with grey literature, tools need some basic infrastructure to be put in place, but is this something that authors really want or will it compromise the advantages of publishing grey literature in the first place?

 


 

 


Downloads versus citations and the role of publication language

Vicente P. Guerrero-Bote and Félix Moya-Anegón examined the relations between downloads and citations and the effect of publication language on them.

Read more >


Since scientific literature is now published and distributed mainly online, a number of initiatives have been developed to attempt to measure scientific impact from download data. Such data would allow scientific activity to be observed immediately after publication, rather than having to wait for citations to accrue. Shepherd (1) and Bollen et al. (2) propose a Download Impact Factor as a journal metric: the average download rate of the articles published in a journal, analogous to the citation-based Journal Impact Factor (JIF). COUNTER (3) defines a standard Journal Usage Factor using the median rather than the mean. Bollen et al. (2, 4) have demonstrated the feasibility of a variety of social network metrics calculated from the download networks extracted from the click data recorded in download logs.
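The mean/median distinction matters because article downloads are typically highly skewed. The toy monthly download counts below are invented purely to show the effect: a few very heavily downloaded articles pull the mean-based metric well above the typical article, while the median-based one stays put.

```python
from statistics import mean, median

# Invented toy data: monthly download counts for the articles of one
# journal. Two outliers dominate the mean but barely move the median.
downloads = [12, 15, 9, 14, 11, 13, 480, 10, 16, 950]

download_impact_factor = mean(downloads)   # mean-based, as in (1, 2)
journal_usage_factor = median(downloads)   # median-based, as in COUNTER (3)

print(f"mean: {download_impact_factor:.1f}, median: {journal_usage_factor:.1f}")
# mean: 153.0, median: 13.5
```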

Bollen et al. (5) conducted a principal component analysis of the rankings of journals produced by 39 measures of academic impact calculated from both citation and download log data. Their results indicate that the notion of scientific impact is multi-dimensional, and cannot be adequately measured by a single indicator, although some might be more suitable than others. In particular, they observed greater significance with indicators based on downloads, possibly because of the great amount of download data that can be collected.

Although Kurtz et al. (6) show how the citation obsolescence function (7) and readership follow similar trajectories over time, Schloegl & Gorraiz (8, 9) find that downloads and citations have different patterns of obsolescence. While Darmoni et al. (10) and Bollen et al. (5) report that a journal's download frequency does not correspond to any great degree with the impact factor, Schloegl & Gorraiz (9) calculate a strong correlation at the journal level between citation and download frequency when absolute values are used, and a moderate to strong correlation between the number of downloads and the journal impact factor. Along similar lines, Wan et al. (11) define a download immediacy index.

 

Download as predictor of citation

In recent papers (12, 13) we used data from Scopus (citations) and ScienceDirect (downloads) to study the relationship between downloads and citations and the influence of publication language. Specifically, we studied these parameters for the non-English-language journals in ScienceDirect – those with more than 95% of their articles in French, German, or Spanish in the period 2003-2011. We also defined a control group of English-language journals in order to establish the differences with the non-English-language journals. For each non-English journal, we selected as control at least one English-language journal that was present in both databases, belonged to the same Specific Subject Area, and had a similar number of published articles. To look deeper into the phenomenon, we compared the geographical origins of the downloads and of the citations of the two groups. It must be noted that the set of German- and Spanish-language journals is too small to draw any significant separate conclusions.

Scopus and ScienceDirect cover different numbers of papers: the latter includes all papers, while the former does not include Conference/Meeting Abstracts or Book Reviews, with the divergence mainly due to the Conference/Meeting Abstracts. The time-obsolescence curves of citations and downloads differ (see Figure 1): the former reflects the time it takes for a paper to be cited, the latter the effect of novelty on downloads. The proportional difference between the downloads received by Reviews and by other document types is larger than the corresponding difference in citations.

Figure 1- Left panel: Mean primary citations for Scopus document types by age of the document in years. Right panel: Mean downloads of the main document types corresponding to Scopus by age in years after the online publication date. Comparing the data for "excellent" papers (solid lines) with those for other papers (dashed lines).

The "excellent" papers (those belonging to the top 10% cited in the corresponding Specific Subject Area, document type, and year) (14) showed far higher mean downloads than the non-excellent papers throughout the period. The percentage difference was greater both at the end of the period and for document types with medium or low download levels.

The order of the Subject Areas in mean citation does not coincide with that in mean downloads: while Psychology was always behind Medicine in citations, it was always ahead of Medicine in downloads. This may reflect different habits in different areas, with some areas seeming to read proportionally more than they cite.

There were positive and statistically significant correlations between downloads and citations by journal and by age in years for the entire set of journals, both English and non-English (0.77 on average), but these were greatly reduced both in value and in statistical significance in the case of the non-English language journals.

In the control journals there seems to be a novelty effect at the beginning, with many downloads that do not result in citations. This may be why the correlations are weakest in the first year after publication. Interestingly, the strongest correlations are found in the seventh year after publication, which may correspond to researchers looking for a specific paper, probably directed to it by a citation.

The correlations at the level of individual papers are considerably weaker (0.42 on average) than those at journal level, but markedly more significant statistically because of the far greater sample size. Nonetheless, the relative weakness of the correlation (around 55% of the correlations of the journals) may be indicative that the number of downloads, besides being a function of the quality of the paper (reflected in its citations), largely depends on the diffusion of the journal and on the effect of novelty itself. Thus, articles published in journals of wide circulation and diffusion, with high mean impact, have many downloads, even though for some papers this does not lead to many citations. Also, works published in journals of lower mean impact have fewer downloads, regardless of whether or not some of those papers later receive many citations.
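The aggregation effect described above can be sketched with invented data. In the simulation below (which is illustrative only, not the study's data), downloads track citations only loosely paper by paper because a journal-diffusion term plus noise dominates, but averaging up to journal level washes out the noise, so the journal-level correlation comes out markedly stronger, mirroring the 0.42 vs 0.77 pattern:

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(42)
paper_cites, paper_downloads = [], []
journal_cites, journal_downloads = [], []
for journal_diffusion in range(1, 21):          # 20 synthetic journals
    cites, downs = [], []
    for _ in range(50):                         # 50 papers per journal
        c = random.gauss(journal_diffusion, 3)            # citations: quality noise
        d = 10 * journal_diffusion + random.gauss(0, 40)  # downloads: diffusion + noise
        cites.append(c)
        downs.append(d)
    paper_cites += cites
    paper_downloads += downs
    journal_cites.append(sum(cites) / 50)
    journal_downloads.append(sum(downs) / 50)

print(round(pearson(paper_cites, paper_downloads), 2))      # noticeably weaker
print(round(pearson(journal_cites, journal_downloads), 2))  # markedly stronger
```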

All this means that the potential usefulness of download data as a predictor of citations is limited, especially given that significance is lowest in the early years. This was even more marked in the case of the non-English-language journals.

 

Origin of Download/Citation and language

Figure 2 reveals that the control journals are downloaded proportionally slightly less than they are cited by the most productive countries. By contrast, the non-English journals studied are downloaded proportionally more than twice as much as they are cited. This may reflect that part of the citation impact of these non-English-language journals is invisible to Scopus, because the authors who download those papers cite them in articles published in journals that are not indexed in Scopus. For example, Belgium's percentage of downloads of the control journals is 42% lower than its percentage of citations to the same journals, while its percentage of downloads from the non-English journals is 242% higher than its percentage of citations to those journals.

Figure 2 – For the 27 countries with the greatest scientific production, the vertical axis plots the ratio of the percentage of downloads from the control journals to the percentage of citations to those journals, against, on the horizontal axis, the same ratio for the French, German, and Spanish journals. The area of each circle is proportional to that country's total number of downloads.

Among the most productive countries, higher absolute citation or download counts for the control journals are associated with a proportional increase in downloads relative to citations; that is, users who frequently download or cite the control journals download them proportionally more than they cite them. This effect is not observed for the non-English-language journals studied.

In francophone regions, there is a proportionally greater decrease of downloads from control journals than of citations to those journals. In the German and Spanish language cases, the equivalent results have little significance because of the very few journals involved, some of which have been loaded into ScienceDirect retrospectively.

In sum, part of the citation impact of non-English-language journals appears to be invisible to Scopus, which makes the number of downloads proportionately greater than the citations. This also contributes to the weak correlation between downloads and citations in these non-English journals, meaning that download data will be of little use for predicting the citation rates of these titles.

Acknowledgments

This work was supported by Elsevier as part of the Elsevier Bibliometric Research Program (EBRP), and financed by the Junta de Extremadura, Consejería de Empleo, Empresa e Innovación and by the Fondo Social Europeo as part of the research group grant GR10019.

 

References

(1) Shepherd, P.T. (2007) “The feasibility of developing and implementing journal usage factors: a research project sponsored by UKSG”, Serials: The Journal for the Serials Community, Vol. 20, No. 2, pp. 117-123.
(2) Bollen, J., Van de Sompel, H. and Rodriguez, M.A. (2008) “Towards usage-based impact metrics: First results from the MESUR project”. In Joint Conference on Digital Libraries (JCDL2006), Pittsburgh, PA, June 2008.
(3) COUNTER (2014). “Usage Factor a COUNTER standard”. Available at: http://www.projectcounter.org/documents/Draft_UF_R1.pdf.
(4) Bollen, J., Van de Sompel, H., Smith, J. and Luce, R. (2005) “Toward alternative metrics of journal impact: a comparison of download and citation data”, Information Processing and Management, Vol. 41, No. 6, pp. 1419-1440.
(5) Bollen, J., Van de Sompel, H., Hagberg, A, and Chute, R. (2009) “A principal component analysis of 39 scientific impact measures”, PLOS ONE, Vol. 4, No. 6: e6022. doi:10.1371/journal.pone.0006022.
(6) Kurtz, M.J., Eichhorn, G., Accomazzi, A., Grant, C.S., Demleitner, M. & Murray, S.S. (2005) “The bibliometric properties of article readership information”, Journal of the American Society for Information Science and Technology, Vol. 56, pp. 111-128.
(7) Egghe, L. & Rousseau, R. (2000) “Aging, obsolescence, impact, growth, and utilization: Definitions and relations”, Journal of the American Society for Information Science, Vol. 51, No. 11, pp. 1004–1017.
(8) Schloegl, C. & Gorraiz, J. (2010) “Comparison of citation and usage indicators: The case of oncology journals”, Scientometrics, Vol. 82, No. 3, pp. 567–580.
(9) Schloegl, C. & Gorraiz, J. (2011) “Global Usage Versus Global Citation Metrics: The Case of Pharmacology Journals”, Journal of the American Society for Information Science and Technology, Vol. 62, No. 1, pp. 161–170.
(10) Darmoni, S.J., Roussel, F., Benichou, J., Faure, G.C., Thirion, B. & Pinhas, N. (2000) “Reading factor as a credible alternative to impact factor: a preliminary study”, Technol. Health Care, Vol. 8, No. 3-4, pp. 174–175.
(11) Wan, J.-K., Hua, P.-H., Rousseau, R. & Sun, X.-K. (2010) “The journal download immediacy index (DII): Experiences using a Chinese full-text database”, Scientometrics, Vol. 82, No. 3, pp. 555–566.
(12) Guerrero-Bote, V.P. & Moya-Anegón, F. (2013) “Relationship between Downloads and Citation and the influence of language”. In: J. Gorraiz, E. Schiebel, C. Gumpenberger, M. Hörlesberger & H. Moed (Eds.), Proceedings of the 14th International Conference on Scientometrics and Informetrics—ISSI 2013 (pp. 1469–1484). Vienna: Austrian Institute of Technology.
(13) Guerrero-Bote, V.P. & Moya-Anegón, F. (2014) “Relationship between Downloads and Citations at Journal and Paper Levels, and the Influence of Language”, Scientometrics (in press).
(14) Bornmann, L., Moya-Anegón, F. & Leydesdorff, L. (2012) “The new excellence indicator in the world report of the SCImago institutions rankings 2011”, Journal of Informetrics, Vol. 6, No. 2, pp. 333-335

Since scientific literature is now published and distributed mainly online, a number of initiatives have been developed to attempt to measure scientific impact from download data. Such data would allow scientific activity to be observed immediately after publication, rather than having to wait for the citations. Shepherd (1) and Bollen et al. (2) propose a Download Impact Factor as a journal metric. It consists of the average download rates of articles published in a journal, similar to the citation-based Journal Impact Factor (JIF). COUNTER (3) define as standard a Journal Usage Factor using the median rather than the mean. Bollen et al. (2, 4) have demonstrated the feasibility of a variety of social network metrics calculated from the download networks extracted from the information contained in the clicks recorded in download logs.

Bollen et al. (5) conducted a principal component analysis of the rankings of journals produced by 39 measures of academic impact calculated from both citation and download log data. Their results indicate that the notion of scientific impact is multi-dimensional, and cannot be adequately measured by a single indicator, although some might be more suitable than others. In particular, they observed greater significance with indicators based on downloads, possibly because of the great amount of download data that can be collected.

Although Kurtz et al. (6) show how the citation obsolescence function (7) and readership follow similar trajectories over time, Schloegl & Gorraiz (8, 9) find that downloads and citations have different patterns of obsolescence. While Darmoni et al. (10) and Bollen et al. (5) report that a journal's download frequency does not to any great degree correspond with the impact factor, Schloegl & Gorraiz (9) calculate a strong correlation at the journal level between citation and download frequency when absolute values are used, and a moderate to strong correlation between the number of downloads and the journal impact factor. In this sense too, Wan et al. (11) define a download immediacy index.

 

Download as predictor of citation

In recent papers (12, 13) we have used data from Scopus (citations) and ScienceDirect (downloads) to study the relationship between Downloads and Citations and the influence of publication language. Therefore we studied these parameters for the journals in non-English languages in ScienceDirect, specifically, for those with more than 95% of their articles in French, German, or Spanish in the period 2003-2011. We also defined a control group of journals in English in order to establish the differences with the non-English language journals. For each non-English journal, we selected as control at least one English-language journal that was present in both databases, that belonged to the same Specific Subject Area, and had a similar number of published articles. To look deeper into the phenomenon, we compared the geographical origins of the downloads and of the citations of the two groups. It must be noted that the set of German- and Spanish-language journals is too small to draw any significant separate conclusions.

Scopus and ScienceDirect cover different numbers of papers. This is because the latter includes all papers, while the former does not include Conference/Meeting Abstracts or Book Reviews. The divergence between them is mainly due to the Conference/Meeting Abstracts. The time-obsolescence curves of citations and downloads differ (see Figure 1). One appreciates the effect in the former of the time it takes for a paper to be cited, and in the latter of novelty in the downloads. The proportional difference between the downloads received by Reviews and other document types increases relative to the citations.

Figure 1- Left panel: Mean primary citations for Scopus document types by age of the document in years. Right panel: Mean downloads of the main document types corresponding to Scopus by age in years after the online publication date. Comparing the data for "excellent" papers (solid lines) with those for other papers (dashed lines).

The "excellent" papers (those belonging to the top 10% cited in the corresponding Specific Subject Area, document type, and year) (14) showed a great difference in mean downloads with respect to the non-excellent papers throughout the period. The percentage difference was greater both at the end of the period and for the document types of medium or low download levels.

The order of the Subject Areas in mean citation does not coincide with that in mean downloads: while Psychology was always behind Medicine in citations, it was always ahead of Medicine in downloads. This may reflect different habits in different areas, with some areas seeming to read proportionally more than they cite.

There were positive and statistically significant correlations between downloads and citations by journal and by age in years for the entire set of journals, both English and non-English (0.77 on average), but these were greatly reduced both in value and in statistical significance in the case of the non-English language journals.

In the control journals, it seems that there is a novelty effect at the beginning, with there being many downloads that do not result in citations. This may be the reason that the correlations are weakest in the first year after publication. Interestingly, the strongest correlations are found in the seventh year after publication. This may correspond to when researchers are looking for a specific paper, probably redirected by some citation.

The correlations at the level of individual papers are considerably weaker (0.42 on average) than those at journal level, but markedly more significant statistically because of the far greater sample size. Nonetheless, the relative weakness of the correlation (around 55% of the correlations of the journals) may be indicative that the number of downloads, besides being a function of the quality of the paper (reflected in its citations), largely depends on the diffusion of the journal and on the effect of novelty itself. Thus, articles published in journals of wide circulation and diffusion, with high mean impact, have many downloads, even though for some papers this does not lead to many citations. Also, works published in journals of lower mean impact have fewer downloads, regardless of whether or not some of those papers later receive many citations.

All this means that the potential usefulness of download data as a predictor of citation is limited, especially so given that it is in the early years when the significance is the lowest. This circumstance was even more marked in the case of non-English language journals.

 

Origin of Download/Citation and language

Figure 2 reveals that the control journals are downloaded proportionally slightly less than they are cited by the most productive countries. Instead, the non-English journals studied are downloaded proportionally more than twice as much as they are cited. This may reflect that a part of the citation impact of these non-English language journals is invisible to Scopus, because the authors who download those papers cite them in articles published in journals that are not indexed in Scopus. For example, Belgium has a percentage of downloads of control journals that is 42% less than the percentage of citation to the same journals, while having a percentage of downloads from the non-English journals which is 242% higher than the percentage citation to these journals.

Figure 2- Plot, for the 27 greatest scientific production countries, on the vertical axis the ratio of the percentage of downloads from the control journals and the percentage of citations to these journals, against on the horizontal axis the ratio of the percentage of downloads from the French, German, and Spanish journals and the percentage of citations to these non-English journals. The area of each circle is proportional to that country's total number of downloads.

In the most productive countries, there is an association between the control journals' citations or downloads and a proportional increase in their downloads relative to their citations. This is to say that users who frequently download or cite the control journals download them proportionally more than they cite them. This effect is not observed for the non-English language journals studied.

In francophone regions, there is a proportionally greater decrease of downloads from control journals than of citations to those journals. In the German and Spanish language cases, the equivalent results have little significance because of the very few journals involved, some of which have been loaded into ScienceDirect retrospectively.

In sum, there seems to be a part of the citation impact of non-English language journals that is invisible to Scopus, which makes the number of downloads proportionately greater than the citations. This also has its effect on the lack of correlation between downloads and citations in these non-English journals, which means that if one wants to predict the citation rate for these titles, it will be difficult to use download data to do so.

Acknowledgments

This work was supported by Elsevier as part of the Elsevier Bibliometric Research Program (EBRP), and financed by the Junta de Extremadura, Consejería de Empleo, Empresa e Innovación and by the Fondo Social Europeo as part of the research group grant GR10019.

 

References

(1) Shepherd, P.T. (2007) “The feasibility of developing and implementing journal usage factors: a research project sponsored by UKSG”, Serials: The Journal for the Serials Community, Vol. 20, No. 2, pp. 117-123.
(2) Bollen, J., Van de Sompel, H. and Rodriguez, M.A. (2008) “Towards usage-based impact metrics: First results from the MESUR project”. In Joint Conference on Digital Libraries (JCDL2006), Pittsburgh, PA, June 2008.
(3) COUNTER (2014). “Usage Factor a COUNTER standard”. Available at: http://www.projectcounter.org/documents/Draft_UF_R1.pdf.
(4) Bollen, J., Van de Sompel, H., Smith, J. and Luce, R. (2005) “Toward alternative metrics of journal impact: a comparison of download and citation data”, Information Processing and Management, Vol. 41, No. 6, pp. 1419-1440.

A comparison of citations, downloads and readership data for an information systems journal

In this contribution, Christian Schlögl and his co-authors present the similarities & differences between citations, downloads & readership data, and the relations between them.

Read more >


Introduction

In the past, citations were the prime source for measuring scholarly impact. With the advent of altmetrics, it is possible to track the use and consumption of scholarly publications on a much broader basis (1). According to Plum Analytics, besides citations, metrics can be provided on the basis of usage, captures, mentions, and social media (2). In this contribution we will elaborate on the similarities and differences between one example from each of the first three metrics types mentioned above: citations from Scopus; downloads from ScienceDirect; and readership counts from Mendeley. As a use case, we chose the Information Systems journal Information and Management, including all issues from 2002 to 2011.

Information and Management is one of the leading Information Systems journals. It usually publishes eight issues per year and has a geographical focus on Anglo-American and South East Asian countries with regard to authorship and associate editors. Of the nearly 600 research articles in the period of analysis, half were published by authors from the U.S. and approximately one third by authors from Taiwan, China, South Korea and Singapore.

Citations and downloads were provided by Elsevier in the framework of the Elsevier Bibliometric Research Program (EBRP) (3). For the publications of the analyzed journal, all monthly downloads were made available from ScienceDirect (4) and all monthly citations from Scopus (5). Furthermore, we received the readership counts from Mendeley (6). Mendeley is a social reference management system which helps users with the organization of their personal research libraries. The articles, provided by users around the world, are crowd-sourced into a single collection called the Mendeley research catalogue. This makes it possible to calculate the readership frequency of an article, which indicates how many Mendeley users have added it to their personal research library. At the time of writing, this catalogue contains more than 110 million unique articles, crowd-sourced from over 2.5 million users, making it an interesting source of data for large scale network analysis.
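The readership count described above — the number of distinct Mendeley users who have added an article to their personal library — can be sketched in a few lines of Python. The (user, article) pairs below are invented for illustration, not real Mendeley data:

```python
from collections import Counter

def readership_counts(library_entries):
    """Count distinct users per article from (user_id, article_id) pairs."""
    # A user may hold duplicate copies of an article; deduplicate first,
    # so each user contributes at most one "reader" per article.
    unique_pairs = set(library_entries)
    return Counter(article for _user, article in unique_pairs)

# Illustrative crowd-sourced library entries (hypothetical IDs).
entries = [
    ("u1", "doi:10.1016/a"), ("u2", "doi:10.1016/a"),
    ("u1", "doi:10.1016/a"),  # duplicate entry by the same user
    ("u3", "doi:10.1016/b"),
]
counts = readership_counts(entries)
# "doi:10.1016/a" has 2 readers, "doi:10.1016/b" has 1
```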

 

Relation between citations, downloads and readership counts

Figure 1 shows the relationship between downloads, citations and readership frequencies for all full-length articles (7) published between 2002 and 2011. Data were provided in mid-2012 for citations and downloads, and in October 2012 for readership data. As can be seen, articles that are downloaded more often are in general cited more frequently. Furthermore, the more frequently an article appears in Mendeley user libraries (number of readers), the more often it is usually downloaded and cited.

Figure 1 - Downloads vs. cites vs. readers (publication year: 2002-2011, doc type: full-length article)

This is also reflected in the rank correlations (Spearman) among these three indicators, which are 0.76 between citations and downloads, 0.66 between downloads and readership counts, and 0.59 between citations and readership counts. Similar correlations were computed for another Information Systems journal (Journal of Strategic Information Systems) (8). The fact that there is a strong but not a perfect correlation between these three indicators gives a first indication that they measure partly different aspects of scholarly communication. Therefore, we will look deeper into each measure. In a first step, we will investigate possible differences in obsolescence characteristics. Since Mendeley launched only in 2009 and its user base has grown strongly since then, we will perform the obsolescence analysis only for citations and downloads.

 

Obsolescence characteristics of citations and downloads

Figure 2 shows the year-wise citations and the year-wise downloads (for privacy reasons, the download numbers are not specified) for an article (9) published in Information and Management in 2004. Since the article was put online in ScienceDirect on October 14th, 2003, it was already downloaded before the print publication year. Typically, the download numbers peak in the (print) publication year. In the following years, the download volume normally decreases slowly. However, a renewed increase is possible, for instance due to the citation impact of an article. To some degree, the general rise of downloads (and users) in ScienceDirect might also have an effect. In contrast, citations are low in the year of publication and reach their maximum several years later.

Figure 2 - Year-wise downloads and citations for the article by Amoako-Gyampah and Salam (2004) (9)

To give a more general picture, we show in Table 1 the year-wise downloads for all full-length articles published in Information and Management from 2002 to 2011. For privacy reasons, we give only relational numbers. In fact, the download numbers are an order of magnitude higher than the citation counts. As can be seen, the download maximum (formatted in bold) always occurs in the (print) publication year (except for 2002). However, for older volumes (publication years 2002-2005) a renewed increase in downloads can be observed in 2008 and 2009, after a decline in the previous years. Table 2 displays the year-wise citations for the corresponding document types in Scopus (article, proceedings paper, and review) and confirms what was mentioned above.
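Reporting only relational numbers, as in Table 1, amounts to scaling each publication year's row of download counts by a reference value (here: the row maximum), so that the shape of the curve is shown while the absolute counts stay private. A minimal sketch with invented counts:

```python
def relational(row, reference=None):
    """Scale yearly counts so the reference value (default: row maximum) equals 1.0."""
    ref = reference if reference is not None else max(row)
    return [round(v / ref, 2) for v in row]

# Hypothetical downloads for the articles of one publication year,
# observed over six successive years (not the journal's actual data).
downloads_one_year = [5200, 4100, 3300, 2900, 3600, 3400]
rel = relational(downloads_one_year)
# Peak year scales to 1.0; the later re-increase remains visible.
```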

Table 1 - Year-wise relation of downloads per print publication year (2002-2011), document type: full-length article - FLA (n=581)

 

Table 2 - Year-wise citations per publication year (2002-2011), document types: article, review, conference paper (n=533, only cited documents)

User analysis of Mendeley readers

Mendeley enables its users to create and maintain user profiles that include, among other information, their professional status. This makes it possible to analyze the user structure of Mendeley “readers”. As can be seen in Figure 3, more than two thirds of the readers of the analyzed journal are students (most of them PhD and master students). Professors, associate professors and assistant professors, who probably account for a considerably higher proportion of the Scopus publications, make up only 15% of Mendeley users. These results are in line with those found when investigating another Information Systems journal (10).

Figure 3 - Readership structure of the articles in Mendeley (2002-2011) (data extraction: October 2012)

 

Conclusions

In our analysis we identified a high (though not perfect) correlation between citations and downloads; the correlation was slightly lower between downloads and readership counts, and lower still between citations and readership counts. This is mainly due to the fact that the data sources used all relate to research, or at least to teaching, in higher education institutions. In the research process, papers are downloaded (for instance, from ScienceDirect) and, more or less frequently, their bibliographic data are entered into a reference management system (for instance, Mendeley). Later on, the very same papers may be cited by an article which, when accepted in a journal covered by a citation index such as Scopus, will increase their citation impact. Although used in a similar “context”, the three data sources differ in several respects, among them their contents and their user populations.

The Mendeley catalogue with its 110 million unique documents is the largest data source among the three. It includes articles not only from journals (also from journals not included in Scopus) but also grey literature, proceedings articles and monographs. Since an article must be entered by at least one user in Mendeley, not all of the journal articles from Scopus are necessarily covered by Mendeley. In particular, coverage varies between disciplines (11). ScienceDirect is a full-text service, providing a subset of Scopus articles (see Figure 4). All three are owned by Reed Elsevier.

Figure 4 - Coverage of ScienceDirect, Scopus and Mendeley (size of the ovals does not represent the real relations in size among the data sources; the rectangle represents the articles from the analyzed journal Information and Management)

Since the analyzed journal was almost fully covered by the three data sources (more than 95 per cent of ScienceDirect’s full-length articles published between 2002 and 2011 were covered by Mendeley in October 2012), one of the strongest remaining influences on the relation between citations, downloads and readership frequencies might be their user structures (see Figure 5).

 

Figure 5 - Size of user communities of ScienceDirect (downloading users), Scopus (publishing and citing authors) and Mendeley (readers) (size of the ovals does not represent the real relations in size among the user numbers)

As reported above, two thirds of the Mendeley users are students. Unlike bachelor and master students (approximately 25 per cent of all Mendeley users), PhD students are often also engaged in publication activities, in particular in the natural sciences. Nevertheless, senior researchers might have the highest publication output in Scopus. ScienceDirect might have the broadest user base, also covering users who are not actively involved in scholarly publishing (for instance, university teachers). Due to these different user structures, the motives for downloading, reading and citing articles will differ too. Therefore, a perfect relation between the three indicators cannot be expected.

 

Acknowledgement

This report is based in part on the analysis of anonymous ScienceDirect usage data and Scopus citation data provided by Elsevier within the framework of the Elsevier Bibliometric Research Program (EBRP). Readership data were provided by Mendeley. The authors would like to thank both Elsevier and Mendeley for their great support and the reviewers from Research Trends for their useful comments.

 

References

(1) Taylor, M. (2013) “The Challenges of Measuring Social Impact Using Altmetrics”, Research Trends, Issue 33, June 2013, 11-15.
(2) www.plumanalytics.com/metrics.html, accessed March 4th, 2014
(3) ebrp.elsevier.com, accessed March 4th, 2014.
(4) www.sciencedirect.com, accessed March 4th, 2014.
(5) www.scopus.com, accessed March 4th, 2014.
(6) www.mendeley.com, accessed March 4th, 2014.
(7) Scopus and ScienceDirect do not only use different record identifiers but also different document types. The Scopus document types “article”, “proceedings paper” and “review” correspond mainly to the ScienceDirect document type “full-length article”.
(8) Schlögl, C., Gorraiz, J., Gumpenberger, C., Jack, K., Kraker, P. (2013) “Download vs. citation vs. readership data: The case of an information systems journal”, “Proceedings of the 14th International Society of Scientometrics and Informetrics Conference”, AIT Austrian Institute of Technology GmbH, Vienna, pp. 626-634.
(9) Amoako-Gyampah, K., Salam, A.F. (2004) “An extension of the technology acceptance model in an ERP implementation environment”, Information and Management, Vol. 41, No. 6, pp. 731-745. The year-wise citation data were retrieved from Citation Overview Function of Scopus database on March 4th, 2014.
(10) Gorraiz, J., Gumpenberger, C., Jack, K., Kraker, P, Schlögl, C. (2013) “What do citations, downloads and readership data of an information systems journal have in common and where do they differ?”. In: Hinze, S., Lottmann, A. (Eds.) “Translational twists and turns: Science as a socio-economic endeavor”, Proceedings of the 18th International Conference of Science and Technology Indicators (STI 2013), ENID and iFQ, Berlin, pp. 140-145.
(11) Kraker, P., Körner, C., Jack, K. & Granitzer, M. (2012) “Harnessing User Library Statistics for Research Evaluation and Knowledge Domain Visualization”, Proceedings of the 21st International Conference Companion on World Wide Web, ACM, Lyon, pp. 1017–1024.
 

Scholarly blogs are a promising altmetric source

Can blog posts be used as an altmetric source? Hadas Shema, Judit Bar-Ilan and Mike Thelwall propose that they can, but also discuss the obstacles related to this idea.

Read more >


 

“…Russel Lyons who posits that Christakis' and Fowler's work is a great example of statistical illiteracy, and that the conclusion drawn from their data, that obesity is socially contagious, is severely flawed and can't be made.”

Blogger Yoni Freedhoff, MD, in his blog “Weighty Matters” (1)

 

Scholarly blogs are one of the most prominent information sources for altmetrics and are reported in the main altmetric services (e.g., ImpactStory.org, altmetric.com). National Geographic, the Nature Group, Scientific American, and the PLOS (Public Library of Science) journals all have science blogging networks. Scholarly blogs have been defined as “blogs written by academic experts that are dedicated in large part to scientific content” (p. 171) (2). This definition is rather vague, because of the difficulty of defining an “expert” and “scholarly content.” A scholarly blog can also be defined by its platform (e.g., SciLogs, Scienceblogs.com), by the media outlet hosting it (e.g., Scientific American, The Guardian), by the affiliation or education of its blogger(s), by its contents, discipline, links to other blogs (blog roll) or any combination of the above. A blog by a single graduate student posting about subjects related to her research can be as ‘scholarly’ as a blog by several experienced researchers posting across disciplines.

A number of studies have pointed to blog coverage as an indicator of scholarly impact. Data from the commercial firm altmetric.com, covering July 2011 up to January 1st, 2013, has been used to study the association between potential altmetric sources (Twitter, Facebook wall posts, research highlights, blogs, mainstream media, forums, Google+, LinkedIn, Pinterest, question and answer sites, and Reddit) and Web of Science (WoS) citations (3). The blogs in the sample came from the Nature.com blogging network and the blogging aggregators ResearchBlogging (RB) and ScienceSeeker. The study compared the number of times an article was covered in blogs (each altmetric source was calculated separately) with that of two articles that had also received a mention in an altmetric source (not necessarily a blog), one published shortly before the article in question and the second published shortly after. The authors concluded that “In summary, there is clear evidence that three altmetrics (tweets, FbWalls, blogs) tend to associate with citations at the level of individual journals” (p. 4).

Another study of altmetric.com data (4) looked at altmetric mentions of articles (with DOIs - Digital Object Identifiers) in various metric sources (reference managers excluded) from July 2011 to mid-October 2013, and correlated them with articles indexed in WoS and the citations they accumulated in 2012 (in one part of the analysis the corpus was used in full; in another, only the July-December 2011 data, to allow a full year of citations). They found a relatively strong correlation between blog and news outlet mentions and citations. A factor analysis found that blog and news outlet mentions belonged to one dimension, while other altmetrics (Twitter, Google+ and Facebook walls) belonged to another. This aligns with Taylor and Plume (5), who studied altmetric.com data from the last four months of 2013. Taylor and Plume classified the altmetric data sources into four categories: social activity (e.g., Facebook, Twitter), scholarly activity (e.g., bookmarking on Mendeley), scholarly commentary (e.g., blogs, F1000Prime) and mass media coverage. They found that, among the top 0.5% of articles in each category, the highest chance of overlap was between mass media coverage and scholarly commentary.

A small-scale study (6) looked at the effect of blog post coverage on 16 clinical pain PLOS ONE articles. The blog posts were published in the blog BodyinMind.org, which had at that time over 2,500 unique views per week, and were disseminated by social media (RB, Twitter, Facebook, LinkedIn). In the week after the blog post coverage of each article, there were on average about 3 additional downloads of the article per day and 12 additional HTML views. The authors did not find a correlation between Scopus citations a year after the blog post publications and social media metrics or HTML views, but did find a moderate correlation between PDF downloads and citations.

 

The structured blog citation

Scholarly bloggers often comment on material from peer-reviewed journals, but unlike authors of peer-reviewed articles, they are not obligated to reference their sources in a formal way. Despite this, scientific bloggers have mentioned in interviews that they would like to reference their sources in a way similar to how they cite in scholarly articles (7).

The aggregator ResearchBlogging.org (RB) was built to answer this need. Launched in late 2008, it aggregates blog posts referring specifically to peer-reviewed research. It is a self-selecting aggregator that allows bloggers to refer to peer-reviewed research in an academic citation format. Bloggers discussing peer-reviewed research can register with the aggregator and after they mark relevant posts in their blog, these posts appear on the aggregator site, giving one-stop access to a variety of research reviews from different authors. The site's human editors ensure that blogs submitted to the aggregator follow its guidelines and are of appropriate quality. RB already has an altmetric role; it currently serves as one of the article level metrics (ALM) displayed for each article in the journal PLOS ONE (8). By the end of 2011, RB had more than 1,230 active blogs and about 27,000 posts (9). These posts seem to be a transitional phase between traditional scholarly discourse and rapid, informal blog writing - a scientometric Archaeopteryx.

The first study of RB, which looked at its Chemistry category, found that most blog posts were about current research and came from high-impact niche journals as well as prestigious multidisciplinary journals (10). Similar results were also found in subsequent studies (9), (11), (12) for other RB categories. Bloggers prefer to cover articles from top multidisciplinary journals, the most popular being (in alphabetical order) Nature, PLOS ONE, Proceedings of the National Academy of Sciences of the United States of America (PNAS) and Science. Most of the posts aggregated in RB are written in English. The bloggers classify their posts into pre-defined categories, the most popular categories being Biology, Health Sciences, Neuroscience and Psychology (9), (11).

RB (see Figure 1) was the data source for our study of the association between blog coverage and traditional citations (13). We took a different approach from (3) and (4): rather than counting the number of times an article was covered in blogs, we considered only whether it was covered at all. We compared journal articles from 2009 and 2010 which were covered in blog posts from the same year (i.e., a 2009 article covered in a 2009 post, a 2010 article covered in a 2010 post) with the general population of articles from the same journal in the same year, to see if these articles received a higher number of citations in the years after their publication than articles from the same journal and year not covered in blogs. In 2009, in 7 out of 12 journals (58%), and in 2010, in 13 out of 19 journals (68%), articles covered by blogs attracted more citations than articles from the same journal and year that were not covered by blogs. The most striking difference in medians was between the 2009 New England Journal of Medicine (NEJM) articles covered in blogs (median of 172 citations) and the 2009 NEJM articles not covered in blogs (76). We also found an association between coverage of the NEJM articles in blogs and their coverage on the Reuters and New York Times websites. Twenty-one of the 26 NEJM articles in our 2009 sample (81%) and 20 of the 38 (53%) in our 2010 sample were covered by Reuters, the New York Times, or both. This aligns with the findings of (5) as well as those presented in (4). News coverage has been known to correlate with a higher level of citations (14), and it is possible that the higher citation levels enjoyed by many articles covered in blogs reflect the bloggers’ tendency to choose articles covered by mass media. We cannot tell if this tendency comes from the direct influence of mass media coverage on its scholarly blogger consumers, or if the bloggers’ tastes simply align with those of the mass media.

Hadas_figure1 RT

Figure 1 - a typical RB post snippet.

 

In conclusion

There is evidence that blog coverage of scholarly articles associates with increased visibility and impact. Unfortunately, there are a number of obstacles that might limit the use of blog posts as an altmetric source. First, only a small percentage of articles is covered in blogs (e.g., 1.9% of the articles studied in (4)).

Second, the definition of “scholarly blogs” and the decision about which blog data to use is problematic. When relying on certain aggregators or networks for blog data we miss the impact of articles covered by blogs outside the data collection range. The coverage problem is not specific to blogs, or even to altmetrics, but extends to bibliometric databases, which also have to choose which sources to index.

Third, there is a lack of sustainability. While most peer-reviewed journals enjoy professional archiving and printed copies, blogs can close down or move without leaving a trace (except perhaps in archive.org and similar sites). For blog-derived data to be reliable, they have to be better indexed and archived.

If peer-reviewed journals citations are “frozen footprints,” (15, abstract) then citations in blogs, and altmetrics in general, are footprints in quicksand. In spite of these limitations, we consider blogs to be an especially promising altmetric source.

The effort required to write a blog post (assuming it isn’t spam or computer-generated) is much greater than the effort needed to tweet, “like” or bookmark an article. Scholarly blogging at its best can be a type of post-publication peer-review, scholarly commentary or citizen journalism and its presence can be used as an impact indicator.

 

References

(1) Freedhoff, Y. (2011, June 20) “Obesity is contagious, or is it? A sober second look at obesity and social networks.” [BlogPost]. Retrieved from http://www.weightymatters.ca/2011/06/obesitys-contagious-or-is-it-sober.html
 
(2) Puschmann, C., Mahrt, M. (2012) “Scholarly blogging: A new form of publishing or science journalism 2.0?” In: A. Tokar, M. Beurskens, S. Keuneke, M. Mahrt, I. Peters, C. Puschmann, K. Weller, & T. van Treeck (Eds.), Science and the Internet (pp. 171-182). Düsseldorf: Düsseldorf University Press.
 
(3) Thelwall, M., Haustein, S., Larivière, V. & Sugimoto, C. (2013) “Do altmetrics work? Twitter and ten other social Web services”, PLOS ONE, Vol.8, No. 5, e64841.
 
(4) Costas, R., Zahedi, Z. & Wouters, P. (2014) “Do altmetrics correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary perspective”. ArXiv preprint arXiv:1401.4321.
 
(5) Taylor, M. & Plume, A. (2014) “Party papers or policy discussions: an examination of highly shared papers using altmetric data”, Research Trends, Issue 36, 17-20.
 
(6) Allen, H.G., Stanton, T.R., Di Pietro, F. & Moseley, G.L. (2013) “Social media release increases dissemination of original articles in the clinical pain sciences”, PLOS ONE, Vol. 8, No. 7, e68914.
 
(7) Kjellberg, S. (2010) “I am a blogging researcher: Motivations for blogging in a scholarly context” First Monday, Vol. 15, No. 8.
 
(8) PLOS ONE. (n.d.) “Article-level metrics information”, Retrieved from http://www.plosone.org/static/almInfo
 
(9) Fausto, S., Machado, F.A., Bento, L.F., Iamarino, A., Nahas, T.R. & Munger, D.S. (2012) “Research blogging: Indexing and registering the change in science 2.0”, PLOS ONE, Vol.7, No. 12, e50109.
 
(10) Groth, P. & Gurney, T. (2010) “Studying scientific discourse on the Web using bibliometrics: A chemistry blogging case study”. Presented at the WebSci10: Extending the Frontiers of Society On-Line, Raleigh, NC. Retrieved from http://wiki.few.vu.nl/sms/images/9/9c/Websci10-FINAL-29-4-2010f.pdf
 
(11) Shema, H., Bar-Ilan, J. & Thelwall, M. (2012) “Research blogs and the discussion of scholarly information”, PLOS ONE, 7(5), e35869.
 
(12) Shema, Bar-Ilan & Thelwall (in press) “How is research blogged? A content analysis approach”, Journal of the Association for Information Science and Technology.
 
(13) Shema, H., Bar-Ilan, J. and Thelwall, M. (2014) “Do blog citations correlate with a higher number of future citations? Research blogs as a potential source for alternative metrics.” Journal of the Association for Information Science and Technology. Advanced online publication doi: 10.1002/asi.23037
 
(14) Phillips, D.P., Kanter, E.J., Bednarczyk, B. & Tastad, P.L. (1991) “Importance of the lay press in the transmission of medical knowledge to the scientific community”, New England Journal of Medicine, Vol. 325, No. 16, pp. 1180–1183.
 
(15) Cronin, B. (1981) “The need for a theory of citing”, Journal of Documentation, Vol. 37, No.1, pp. 16–24.
 
VN:F [1.9.22_1171]
Rating: 0.0/10 (0 votes cast)

 

“…Russel Lyons who posits that Christakis' and Fowler's work is a great example of statistical illiteracy, and that the conclusion drawn from their data, that obesity is socially contagious, is severely flawed and can't be made.”

Blogger Yoni Freedhoff, MD, in his blog “Weighty Matters” (1)

 

Scholarly blogs are one of the most prominent information sources for altmetrics and are reported in the main altmetric services (e.g., ImpactStory.org, altmetric.com). National Geographic, the Nature Group, Scientific American, and the PLOS (Public Library of Science) journals all have science blogging networks. Scholarly blogs have been defined as “blogs written by academic experts that are dedicated in large part to scientific content” (p. 171) (2). This definition is rather vague, because of the difficulty of defining an “expert” and “scholarly content.” A scholarly blog can also be defined by its platform (e.g., SciLogs, Scienceblogs.com), by the media outlet hosting it (e.g., Scientific American, The Guardian), by the affiliation or education of its blogger(s), by its contents, discipline, or links to other blogs (blogroll), or by any combination of the above. A blog by a single graduate student posting about subjects related to her research can be as ‘scholarly’ as a blog by several experienced researchers posting across disciplines.

A number of studies have pointed to blog coverage as an indicator of scholarly impact. Data from the commercial firm altmetric.com, covering July 2011 to January 1st, 2013, has been used to study the association between potential altmetric sources (Twitter, Facebook wall posts, research highlights, blogs, mainstream media, forums, Google+, LinkedIn, Pinterest, question and answer sites, and Reddit) and Web of Science (WoS) citations (3). The blogs in the sample came from the Nature.com blogging network and the blogging aggregators ResearchBlogging (RB) and ScienceSeeker. The study compared the number of times an article was covered in blogs (each altmetric source was calculated separately) with that of two articles that had also received a mention in an altmetric source (not necessarily a blog), one published shortly before the article in question and the other published shortly after. The authors concluded that “In summary, there is clear evidence that three altmetrics (tweets, FbWalls, blogs) tend to associate with citations at the level of individual journals” (p. 4).
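The matched comparison used in (3) can be illustrated with a short sketch. This is a loose simplification of the study's design (the real analysis applied a sign test over many such triples), and the article names and mention counts below are invented for illustration:

```python
# Hypothetical matched comparison: each focal article's blog-mention count is
# compared with those of its nearest earlier and later neighbours that also
# received some altmetric mention. Articles are ordered by publication date.
articles = [
    ("A", 0), ("B", 3), ("C", 1), ("D", 5), ("E", 2),
]

def matched_score(articles, i):
    """+1 for each neighbour the focal article out-scores, -1 when it trails.

    `i` must be an interior index so both neighbours exist.
    """
    _, focal = articles[i]
    score = 0
    for j in (i - 1, i + 1):
        _, other = articles[j]
        score += (focal > other) - (focal < other)
    return score

print(matched_score(articles, 3))  # "D" (5 mentions) vs. "C" (1) and "E" (2)
```

Aggregating such scores across a journal's articles gives the sign-test-style evidence the authors report per journal.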

Another study of altmetric.com data (4) looked at altmetric mentions of articles (with DOIs - Digital Object Identifiers) in various metric sources (reference managers excluded) from July 2011 to mid-October 2013, and correlated them with articles indexed in WoS and the citations they accumulated in 2012 (part of the analysis used the corpus in full; another part used only the July-December 2011 data, to allow a full year of citations). They found a relatively strong correlation between blog and news outlet mentions and citations. A factor analysis found that blog and news outlet mentions belonged to one dimension, while other altmetrics (Twitter, Google+ and Facebook walls) belonged to another. This aligns with Taylor and Plume (5), who studied altmetric.com data from the last four months of 2013. Taylor and Plume classified the altmetric data sources into four categories: social activity (e.g., Facebook, Twitter), scholarly activity (e.g., bookmarking on Mendeley), scholarly commentary (e.g., blogs, F1000Prime) and mass media coverage. They found that among the top 0.5% of articles in each category, the highest overlap was between mass media coverage and scholarly commentary.
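The rank correlations such studies report can be reproduced with a few lines of Python. The sketch below implements Spearman's coefficient as Pearson's r applied to average ranks; the mention and citation counts are fabricated purely for illustration:

```python
def ranks(values):
    """Assign 1-based average ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average rank for the tied run
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson's r on the ranked data."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Fabricated data: blog mentions vs. later citation counts per article.
blog_mentions = [0, 1, 3, 0, 2, 5, 1]
citations = [4, 9, 30, 2, 15, 40, 12]
print(round(spearman(blog_mentions, citations), 3))
```

Spearman's coefficient is the natural choice here because citation and mention counts are heavily skewed; it measures monotonic association without assuming linearity.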

A small-scale study (6) looked at the effect of blog post coverage on 16 clinical pain PLOS ONE articles. The blog posts were published in the blog BodyinMind.org, which had at that time over 2,500 unique views per week, and were disseminated by social media (RB, Twitter, Facebook, LinkedIn). In the week after the blog post coverage of each article, there were on average about 3 additional downloads of the article per day and 12 additional HTML views. The authors did not find a correlation between Scopus citations a year after the blog post publications and social media metrics or HTML views, but did find a moderate correlation between PDF downloads and citations.

 

The structured blog citation

Scholarly bloggers often comment on material from peer-reviewed journals but, unlike authors of peer-reviewed articles, are not obligated to reference their sources formally. Despite this, scientific bloggers have said in interviews that they would like to use references much as they do when citing in scholarly articles (7).

The aggregator ResearchBlogging.org (RB) was built to answer this need. Launched in late 2008, it aggregates blog posts referring specifically to peer-reviewed research. It is a self-selecting aggregator that allows bloggers to refer to peer-reviewed research in an academic citation format. Bloggers discussing peer-reviewed research can register with the aggregator and after they mark relevant posts in their blog, these posts appear on the aggregator site, giving one-stop access to a variety of research reviews from different authors. The site's human editors ensure that blogs submitted to the aggregator follow its guidelines and are of appropriate quality. RB already has an altmetric role; it currently serves as one of the article level metrics (ALM) displayed for each article in the journal PLOS ONE (8). By the end of 2011, RB had more than 1,230 active blogs and about 27,000 posts (9). These posts seem to be a transitional phase between traditional scholarly discourse and rapid, informal blog writing - a scientometric Archaeopteryx.

The first study of RB, which looked at its Chemistry category, found that most blog posts were about current research and came from high-impact niche journals as well as prestigious multidisciplinary journals (10). Similar results were also found in subsequent studies (9), (11), (12) for other RB categories. Bloggers prefer to cover articles from top multidisciplinary journals, the most popular being (in alphabetical order) Nature, PLOS ONE, Proceedings of the National Academy of Sciences of the United States of America (PNAS) and Science. Most of the posts aggregated in RB are written in English. The bloggers classify their posts into pre-defined categories, the most popular categories being Biology, Health Sciences, Neuroscience and Psychology (9), (11).

RB (see Figure 1) was the data source for our study of the association between blog coverage and traditional citations (13). We took a different approach than (3) and (4): rather than counting the number of times an article was covered in blogs, we recorded only whether it was covered or not. We compared journal articles from 2009 and 2010 that were covered in blog posts from the same year (i.e., a 2009 article covered in a 2009 post, a 2010 article covered in a 2010 post) with the general population of articles from the same journal in the same year, to see whether the covered articles received more citations in the years after their publication. In 2009, 58% of the journals (7 out of 12), and in 2010, 68% (13 out of 19), published blog-covered articles that attracted more citations than articles from the same journal and year that were not covered by blogs. The most striking difference in medians was between New England Journal of Medicine (NEJM) articles from 2009 that were covered in blogs (a median of 172 citations) and those that were not (76). We also found an association between blog coverage of NEJM articles and their coverage on the Reuters and New York Times websites: 21 of the 26 NEJM articles in our 2009 sample (81%) and 20 of the 38 (53%) in 2010 were covered by Reuters, the New York Times, or both. This aligns with the findings of (5) as well as those presented in (4). News coverage is known to correlate with higher citation counts (14), and it is possible that the higher citation counts enjoyed by many blog-covered articles reflect the bloggers’ tendency to choose articles covered by the mass media.
We cannot tell whether this tendency comes from the direct influence of mass media coverage on its scholarly blogger consumers, or whether bloggers’ tastes simply align with those of the mass media.
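The journal-by-journal comparison described above boils down to comparing the median citation count of two groups of articles. A minimal sketch, with invented citation counts standing in for one journal's articles in one year:

```python
from statistics import median

# Hypothetical citation counts for articles from one journal and year.
covered = [172, 210, 98, 150, 187]      # articles covered in blog posts
not_covered = [76, 60, 110, 81, 64]     # same journal/year, no blog coverage

def compare_medians(covered, not_covered):
    """Return each group's median citation count and whether the
    blog-covered group's median is higher."""
    m_cov, m_not = median(covered), median(not_covered)
    return m_cov, m_not, m_cov > m_not

m_cov, m_not, covered_wins = compare_medians(covered, not_covered)
print(f"covered median={m_cov}, not covered median={m_not}, "
      f"covered higher: {covered_wins}")
```

Medians are preferred over means here because citation distributions are highly skewed: a single blockbuster paper would dominate a mean but barely moves a median.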


Figure 1: A typical RB post snippet.

 

In conclusion

There is evidence that blog coverage of scholarly articles associates with increased visibility and impact. Unfortunately, there are a number of obstacles that might limit the use of blog posts as an altmetric source. First, only a small percentage of articles is covered in blogs (e.g., 1.9% of the articles studied in (4)).

Second, the definition of “scholarly blogs” and the decision about which blog data to use is problematic. When relying on certain aggregators or networks for blog data we miss the impact of articles covered by blogs outside the data collection range. The coverage problem is not specific to blogs, or even to altmetrics, but extends to bibliometric databases, which also have to choose which sources to index.

Third, there is a lack of sustainability. While most peer-reviewed journals enjoy professional archiving and printed copies, blogs can close down or move without leaving a trace (except perhaps in archive.org and similar sites). For blog-derived data to be reliable, they have to be better indexed and archived.

If peer-reviewed journal citations are “frozen footprints” (15, abstract), then citations in blogs, and altmetrics in general, are footprints in quicksand. In spite of these limitations, we consider blogs an especially promising altmetric source.

The effort required to write a blog post (assuming it isn’t spam or computer-generated) is much greater than that needed to tweet, “like” or bookmark an article. Scholarly blogging at its best can be a form of post-publication peer review, scholarly commentary or citizen journalism, and its presence can be used as an impact indicator.

 

References

(1) Freedhoff, Y. (2011, June 20) “Obesity is contagious, or is it? A sober second look at obesity and social networks.” [BlogPost]. Retrieved from http://www.weightymatters.ca/2011/06/obesitys-contagious-or-is-it-sober.html
 
(2) Puschmann, C., Mahrt, M. (2012) “Scholarly blogging: A new form of publishing or science journalism 2.0?” In: A. Tokar, M. Beurskens, S. Keuneke, M. Mahrt, I. Peters, C. Puschmann, K. Weller, & T. van Treeck (Eds.), Science and the Internet (pp. 171-182). Düsseldorf: Düsseldorf University Press.
 
(3) Thelwall, M., Haustein, S., Larivière, V. & Sugimoto, C. (2013) “Do altmetrics work? Twitter and ten other social Web services”, PLOS ONE, Vol.8, No. 5, e64841.
 
(4) Costas, R., Zahedi, Z. & Wouters, P. (2014) “Do altmetrics correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary perspective”. ArXiv preprint arXiv:1401.4321.
 
(5) Taylor, M. & Plume, A. (2014) “Party papers or policy discussions: an examination of highly shared papers using altmetric data”, Research Trends, Issue 36, 17-20.
 
(6) Allen, H.G., Stanton, T.R., Di Pietro, F. & Moseley, G.L. (2013) “Social media release increases dissemination of original articles in the clinical pain sciences”, PLOS ONE, Vol. 8, No. 7, e68914.
 
(7) Kjellberg, S. (2010) “I am a blogging researcher: Motivations for blogging in a scholarly context” First Monday, Vol. 15, No. 8.
 
(8) PLOS ONE. (n.d.) “Article-level metrics information”, Retrieved from http://www.plosone.org/static/almInfo
 
(9) Fausto, S., Machado, F.A., Bento, L.F., Iamarino, A., Nahas, T.R. & Munger, D.S. (2012) “Research blogging: Indexing and registering the change in science 2.0”, PLOS ONE, Vol.7, No. 12, e50109.
 
(10) Groth, P. & Gurney, T. (2010) “Studying scientific discourse on the Web using bibliometrics: A chemistry blogging case study”. Presented at the WebSci10: Extending the Frontiers of Society On-Line, Raleigh, NC. Retrieved from http://wiki.few.vu.nl/sms/images/9/9c/Websci10-FINAL-29-4-2010f.pdf
 
(11) Shema, H., Bar-Ilan, J. & Thelwall, M. (2012) “Research blogs and the discussion of scholarly information”, PLOS ONE, 7(5), e35869.
 
(12) Shema, H., Bar-Ilan, J. & Thelwall, M. (in press) “How is research blogged? A content analysis approach”, Journal of the Association for Information Science and Technology.
 
(13) Shema, H., Bar-Ilan, J. & Thelwall, M. (2014) “Do blog citations correlate with a higher number of future citations? Research blogs as a potential source for alternative metrics”, Journal of the Association for Information Science and Technology. Advance online publication. doi: 10.1002/asi.23037
 
(14) Phillips, D.P., Kanter, E.J., Bednarczyk, B. & Tastad, P.L. (1991) “Importance of the lay press in the transmission of medical knowledge to the scientific community”, New England Journal of Medicine, Vol. 325, No. 16, pp. 1180–1183.
 
(15) Cronin, B. (1981) “The need for a theory of citing”, Journal of Documentation, Vol. 37, No.1, pp. 16–24.
 