Issue 26 – January 2012


Citography: the visualization of nineteen thousand journals through their recent citations

A citation from a paper published in one journal to a paper published in another establishes a clear link between the journals: it shows that their respective contents are relevant to each other, and suggests a level of similarity between the two. In any given time period a journal tends to contain citations to many other journals, and those it cites the most should be those with which it is most closely related. Across a broad network, citation relationships contain information about which journals are related to others, and so can be used to examine the structure of literature and the links between different fields [1–5]. In this article I use 5 years of Scopus-indexed citation data to position 19,562 journals in a map of scholarly research, and use the resulting map to look at the position of, and interactions between, subject fields.

Mapping Scopus

The full set of 2006–2010 citations was extracted from a bibliometric version of the Scopus database as a list of relationships between two publications. Although this includes proceedings and other serials besides journals, I refer solely to journals for clarity. (For more information on bibliometric databases, see the last issue of Research Trends [6].) The 2006–2010 restriction applies in two ways: not only must the citing papers have been published in this time period, but so too must the articles they cite. After excluding journal self-citations, this produced a list of 4,589,565 journal–journal citation relationships, covering 20,213 journals and 27,196,324 citations. (A citation relationship is a count of the citations made from one journal to another within the time period, so a single citation relationship often represents more than one citation.)

This full network of data was reduced so that it could be mapped successfully. Journal–journal citation relationships representing less than 1 percent of the citations made by the citing journal in this data set were removed. This resulted in a smaller network containing 19,562 journals (96.8 percent of the original), linked by 377,729 citation relationships (8.2 percent) containing 11,857,165 citations (43.6 percent). These citation relationships were then used to create a network graph, using the Gephi program, in which nodes represent journals, and connecting lines (or edges) represent the relationships between them.
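The reduction step can be illustrated with a minimal Python sketch. It is not the code used for the article: the function name, the data layout and the toy figures are invented for illustration, and it assumes that each relationship's share is measured against the citing journal's total in the unreduced data.

```python
from collections import defaultdict


def filter_relationships(relationships, threshold=0.01):
    """Keep relationships carrying at least `threshold` of the citing
    journal's citations.

    `relationships` maps (citing, cited) -> citation count, with journal
    self-citations already excluded. Returns (citing, cited) -> share.
    Assumption: shares are computed against the unreduced totals.
    """
    totals = defaultdict(int)
    for (citing, _cited), count in relationships.items():
        totals[citing] += count

    kept = {}
    for (citing, cited), count in relationships.items():
        share = count / totals[citing]
        if share >= threshold:
            kept[(citing, cited)] = share
    return kept


# Invented toy data: journal "A" makes 1,000 citations in total.
toy = {
    ("A", "B"): 600,
    ("A", "C"): 395,
    ("A", "D"): 5,   # 0.5 percent of A's citations, so it is dropped
    ("B", "A"): 10,  # all of B's citations, so it is kept with share 1.0
}
edges = filter_relationships(toy)
```

In this toy network the relationship from A to D falls below the 1 percent threshold and is removed, while the other three relationships survive with their shares attached.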

Gephi is a free, open-source graphing program [7] that comes with a range of layout algorithms; the recently developed ForceAtlas2 [8] was selected as it can quickly position thousands of nodes and offers many settings for refining the graph layout. As with many layout algorithms, ForceAtlas2 is force-directed, which means that unrelated nodes in the network repel one another, while connected nodes attract one another. In this case, the magnitude of these forces was determined by the proportion of citations given by the citing journal to the cited journal out of citations given to all other journals in the network, in the time period 2006–2010. Given the method of reducing the data, edge weights take a value between 0.01 and 1.00, such that the higher the value, the stronger the force of attraction between the two journals. The two forces at work result in a graph that stabilizes over time, until it reaches equilibrium. Figure 1 shows the results of using ForceAtlas2 to lay out our network of 19,562 journals.
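The idea of a force-directed layout can be sketched in a few lines of Python. This is a generic illustration and not the ForceAtlas2 algorithm: the force formulas, step size, starting positions and edge weights are all assumptions chosen for simplicity, and edges are treated as undirected.

```python
import math


def layout(positions, edges, steps=500, step_size=0.05, repulsion=1.0):
    """Generic force-directed layout (NOT ForceAtlas2).

    positions: node -> (x, y) starting coordinates, all distinct
    edges: (node_a, node_b) -> weight, treated here as undirected
    """
    pos = {node: list(p) for node, p in positions.items()}
    nodes = list(pos)
    for _ in range(steps):
        force = {node: [0.0, 0.0] for node in nodes}
        # Every pair of nodes repels, more strongly at short range.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist_sq = dx * dx + dy * dy
                fx = repulsion * dx / dist_sq
                fy = repulsion * dy / dist_sq
                force[a][0] += fx
                force[a][1] += fy
                force[b][0] -= fx
                force[b][1] -= fy
        # Connected nodes attract in proportion to weight and distance.
        for (a, b), weight in edges.items():
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += weight * dx
            force[a][1] += weight * dy
            force[b][0] -= weight * dx
            force[b][1] -= weight * dy
        # Move each node a small step along its net force.
        for node in nodes:
            pos[node][0] += step_size * force[node][0]
            pos[node][1] += step_size * force[node][1]
    return {node: tuple(p) for node, p in pos.items()}


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Invented example: A-B is a strong link, B-C a weak one.
start = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 1.0)}
weights = {("A", "B"): 0.60, ("B", "C"): 0.05}
final = layout(start, weights)
```

With these invented weights, the strongly linked pair A and B should settle closer together than the weakly linked pair B and C, which is the behavior the map relies on.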


Figure 1 – A network of 19,562 journals, linked by 377,729 citation relationships containing 11,857,165 citations, mapped using Gephi and the ForceAtlas2 layout algorithm. Each journal is a node (circle) in the map, and edges (lines) between these nodes represent citations from one to the other. Node size is proportional to the total number of citations received by that journal in the time period 2006–2010. Data source: Scopus.

In theory, this map has positioned journals so that related journals are close to one another, and unrelated journals are further apart. But how can this be tested? One option is to use an existing subject classification system, and Figure 2 shows the same map colored according to the subject classifications used by Scopus. There are 27 subject areas, and each is given a different color; journal nodes take the color of the subject area to which they are assigned, but only if they are uniquely assigned to a subject area (journals belonging to multiple subject areas remain gray).

Figure 2 – Each subject area is assigned a color, used to show journals belonging solely to that subject area. Data source: Scopus.

As this labeled map shows, related fields are positioned close to one another; the map can be used to view the position of each subject area in relation to the others: from the health sciences at the left, round the social sciences at the bottom to mathematics at the right, up to physics and chemistry, and round the biological sciences at the top. The most multidisciplinary fields are positioned towards the center of the map, as is clear from the patches of gray journals belonging to multiple fields.

Stuck in the middle with you

Once we have confidence in the layout of the map, we can use it to look at specific subject areas. Figures 3 and 4 show the journals assigned to Environmental Science, and to Physics and Astronomy, respectively. The two subject areas cover a similar area at the right side of the map, stretching almost from the top to the bottom; however, the Physics and Astronomy map shows a much tighter core of journals located at the right edge of the map, while Environmental Science journals do not cluster strongly in any given area. Not only is the subject area multidisciplinary, reaching across the boundaries of other subjects, but the journals within the field are not as closely related to one another as the journals within Physics and Astronomy.

Figure 3 – All journals assigned to Environmental Science. Data source: Scopus.

Figure 4 – All journals assigned to Physics and Astronomy. Data source: Scopus.

Our global map can also be used to look at the crossover between multiple subject areas. Figure 5 again shows Environmental Science journals, and those in Physics and Astronomy, but this time combined with Earth and Planetary Science journals. Each subject area is given a primary color, and secondary and tertiary colors can be used to show the journals assigned to two or all three subject areas.

Figure 5 – Environmental Science journals are colored red; Earth and Planetary Science, yellow; and Physics and Astronomy, blue. Where journals are assigned to two of these subject areas, the secondary colors orange, green and purple are used; where all three, the tertiary color brown. Data source: Scopus.

Using this map to look at the position of the multi-subject journals, we can see that the Earth and Planetary/Environmental journals (in orange) are spread across a much wider area than the Earth and Planetary/Physics and Astronomy journals (in green), which cluster together very tightly. In addition, Earth and Planetary Science journals form a bridge between the other two subject areas.

A map of science formed using the citations indexed by Scopus allows for detailed analysis of not only where a single journal lies in the global map of literature, and the journals to which it is connected, but also the broader subjects that comprise the map, and journal sets that bridge disciplines.


1. Boyack, K.W. et al. (2005). Mapping the backbone of science. Scientometrics Vol. 64, pp. 351–374.
2. Boyack, K.W. et al. (2009). Mapping the structure and evolution of chemistry research. Scientometrics Vol. 79, pp. 45–60.
3. Leydesdorff, L. & Rafols, I. (2009). A global map of science based on the ISI subject categories. Journal of the American Society for Information Science and Technology Vol. 60, pp. 348–362.
4. Leydesdorff, L. et al. (2010). Journal maps on the basis of Scopus Data: a comparison with the Journal Citation Reports of the ISI. Journal of the American Society for Information Science and Technology Vol. 61, pp. 352–369.
5. Leydesdorff, L. & Rafols, I. (in press). Interactive overlays: a new method for generating global journal maps from Web-of-Science data. Journal of Informetrics.
6. Moed, H. F. et al. (2011). Is science in your country declining? Or is your country becoming a scientific super power, and how quickly? Research Trends, Issue 25.
7. Bastian, M. et al. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.
8. Jacomy, M. et al. (2011). ForceAtlas2, a graph layout algorithm for handy network visualization.

F1000 Journal Rankings: an alternative way to evaluate the scientific impact of scholarly communications

In recent years the bibliometrics world has been booming with new metrics such as the h-index, EigenFactor, SJR, and SNIP. This expansion of the bibliometrics toolkit has been driven by the continued growth of scholarly content, combined with computational advances, growing global requirements for science to be measured and evaluated, and the problems of information overload and filter failure.

Before bibliometrics became widespread, the evaluation of science was mainly performed through peer review. This more traditional approach was given a new lease of life when the Faculty of 1000 (F1000) was launched in 2002 to evaluate the quality of biomedical scientific articles based on the opinion of scientific experts. Initially, the papers were evaluated by 1,000 international Faculty members; now F1000 boasts more than 10,000 evaluators spread across 44 subject-specific Faculties. It is worth noting, however, that not all Faculty members are active, or active to the same extent: Research Trends randomly checked the F1000 records of 20 members of the Reproductive Endocrinology Faculty, and although 4 have contributed more than 10 reviews, half have contributed only 1 or 2 evaluations so far, and a quarter have not yet made any recommendation. Jane Hunter, Managing Director at F1000, commented:

“Unsurprisingly, some Faculty Members (FMs) are more active than others and activity levels vary also depending on what other obligations FMs have on a month-by-month basis. Evaluation submission rates drop during congresses and rise immediately after (also unsurprising). Our most productive FMs select and evaluate 10–20 papers a year; our least productive may pick 1 or 2. F1000 has selected and evaluated 91,000 articles to date and these articles have attracted nearly 116,000 evaluations.”

On average, 1,500 new articles are reviewed every month, which according to the F1000 website corresponds to about 2% of all published articles in the biological and medical sciences. This has led to some criticism of the journal rankings derived from the reviews, as overall they are based on very limited coverage of journal content [1,2]. For instance, according to Phil Davis, independent researcher and frequent blogger at the Scholarly Kitchen: “because of its limited scope of coverage, the real value of F1000 is not what the aggregate data can tell us about individual journals, but in what experts can tell us about individual articles.” Looking at the 2010 provisional journal rankings, only 5 titles had more than 50% of their papers evaluated, fewer than 8% of journals had more than 10% of their articles reviewed, and more than 85% had less than 5% of their papers evaluated. According to Jane Hunter, however, this is not a problem:

“F1000’s purpose is to select and evaluate only the top papers in biology and medicine, so it follows that a relatively small percentage of papers from most journals will be included in our database. F1000’s whole system is based on selectivity. This doesn’t invalidate our Journal Rankings. Journals that publish relatively few papers judged as ‘top’ by our Faculty will have a lower FFj (F1000 Journal Factor) in our system and journals that publish a lot of top papers will have a higher FFj.”

One of the unique aspects of the system at the time of the launch was that the ratings were not based on bibliometric data at the journal level, but on expert evaluation at the level of individual articles. However, in 2011, in what has been labeled “a 180-degree turn” [1], F1000 started a new journal ranking system, including global journal rankings as well as rankings by subject area.

How does it compare to citations?

Citations are usually accepted as a measure of intellectual debt, and although there are negative citations, the vast majority of citations are neutral or positive. This can be seen as roughly similar to the F1000 system, in which Faculty members can assign papers to one of three positive quality levels: Exceptional, Must Read, and Recommended. (Interestingly, there is no option to submit negative recommendations.)

However, the similarity ends here: while citations are relatively easy to make (scientific papers routinely include dozens of references), reviews are more time-consuming to produce, and are therefore less numerous. Consequently, it can be argued that F1000 reviews have more weight (there are fewer of them) but also more bias (they can only be positive). However, Jane Hunter disagrees that the absence of negative evaluations introduces bias to the system:

“Negative reviews are simply not what we do. F1000 is a guide to what’s best in science, not a thumbs up/thumbs down review service. There are plenty of comprehensive subject-area reviews published by other companies and we don’t think the world needs another one from us. The fact that we only publish positive reviews doesn’t introduce bias into our system — it is our system. Our subscribers rely on us to tell them what they need to read and not what they need to avoid, so we will never publish negative evaluations. That said, we do publish dissents; if one of our FMs disagrees with another’s article selection or with some aspect of an evaluation, he or she can submit a dissenting opinion, which is then published alongside the article’s evaluation/s on our site. And we also allow registered subscribers to comment on evaluations or dissents, so if they have something to add we invite and encourage them to do so.”

How does it work?

The F1000 Article Factor (FFa) can be calculated from one or several reviews, depending on how many are available. If there are several recommendations for one article, the FFa is calculated from the highest rating, which bears a value of 10 for Exceptional, 8 for Must Read, and 6 for Recommended. An incremental value is then added for each of the other ratings (3 for Exceptional, 2 for Must Read, 1 for Recommended). Research Trends was unable to find publicly available explanations for this methodology, and found it difficult to understand why these particular weights were chosen for initial and incremental values, but Jane Hunter was happy to explain:

“The values we assigned to our Recommended, Must Read and Exceptional ratings (6, 8 and 10) are arbitrary, but in essence reflect above-average scores on a 1–10 scale. The rationale for our calculation of total FFa for articles evaluated more than once is also arbitrary — and utilitarian — it made sense to us and seems to work.”

This methodology, however, raises some concerns about the consistency of the FFa metric (see the example in the text box). Furthermore, the FFa calculation gives more weight to the highest rating and less weight to subsequent ratings, which has implications for the F1000 Journal Factor (FFj) derived from the FFas: relatively more influence is given to articles with one recommendation than to articles with several evaluations. As a consequence, the FFj appears to be sensitive to enthusiastic reviewers rating numerous papers in small journals [1].

Jane Hunter acknowledged this fact, but countered:

“This is not related to our weighting in favor of the highest score a paper receives from us or because we bias our system in favor of number of articles selected over number of evaluations (though we do, intentionally). It’s because at the very specialist end of the scale where there are few journals and we have selected relatively few papers, a small number of additional reviews from a single journal can have a disproportionate impact on a journal’s rank […] For future reference, we will be highlighting articles that have a declared competing interest on our main rankings journal pages in an upgrade planned for later this year. One important feature that sets us apart is complete transparency; our subscribers can easily see how each paper in F1000 was judged, by named experts, and review their reasoning. If there is a competing interest, it is clearly stated.”

Consistency issue: let’s look at some examples

Article A with two Exceptional scores would get an FFa of 13 (10 for the first Exceptional score + 3 for the second Exceptional score). Article B with three Must Read scores and one Recommended score would also get an FFa of 13 (8 for the first Must Read score, 2 for each of the other two Must Read scores, and 1 for the Recommended score), and so would article C with 8 Recommended scores (6 for the first Recommended score + 7 (7 × 1) for the other Recommended scores).

Key: Exc = Exceptional, MR = Must Read, Rec = Recommended.

Article A
Rating  Exc  Exc  FFa
Score   10   3    13

Article B
Rating  MR   MR   MR   Rec  FFa
Score   8    2    2    1    13

Article C
Rating  Rec  Rec  Rec  Rec  Rec  Rec  Rec  Rec  FFa
Score   6    1    1    1    1    1    1    1    13

So all three articles would get the same FFa of 13. Let’s imagine now that each article receives one supplementary review with an Exceptional score (included in the tables below). This would result in article A getting an FFa of 16 (10 for the first Exceptional score and 6 (2 × 3) for the other two Exceptional scores), article B getting an FFa of 17 (10 for the Exceptional score + 6 (3 × 2) for the three Must Read scores + 1 for the Recommended score), and article C getting an FFa of 18 (10 for the Exceptional score + 8 (8 × 1) for the Recommended scores).

Article A
Rating  Exc  Exc  Exc  FFa
Score   10   3    3    16

Article B
Rating  Exc  MR   MR   MR   Rec  FFa
Score   10   2    2    2    1    17

Article C
Rating  Exc  Rec  Rec  Rec  Rec  Rec  Rec  Rec  Rec  FFa
Score   10   1    1    1    1    1    1    1    1    18

So while all three articles initially had the same FFa, adding the same rating to each one produces different FFa values, and therefore changes their ranking.
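The text-box examples can be reproduced with a short Python sketch of the FFa rule as it is described in this article. The function and variable names are illustrative and are not taken from F1000.

```python
# Weights as described in the article: the highest rating contributes its
# base value, and every other rating contributes a smaller increment.
BASE = {"Exceptional": 10, "Must Read": 8, "Recommended": 6}
INCREMENT = {"Exceptional": 3, "Must Read": 2, "Recommended": 1}


def article_factor(ratings):
    """Return the FFa for a non-empty list of rating names."""
    if not ratings:
        raise ValueError("an article needs at least one rating")
    # Sort from highest to lowest rating so the top one supplies the base.
    ordered = sorted(ratings, key=BASE.get, reverse=True)
    return BASE[ordered[0]] + sum(INCREMENT[r] for r in ordered[1:])


article_a = ["Exceptional"] * 2
article_b = ["Must Read"] * 3 + ["Recommended"]
article_c = ["Recommended"] * 8
articles = (article_a, article_b, article_c)

before = [article_factor(r) for r in articles]
after = [article_factor(r + ["Exceptional"]) for r in articles]
print(before)  # [13, 13, 13]
print(after)   # [16, 17, 18]
```

The extra Exceptional review adds only 3 points to article A, but it replaces the base rating of articles B and C, which is why the three scores diverge.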

The FFj is calculated from the individual article ratings for a given journal, normalized according to the proportion of eligible scientific articles reviewed by the Faculty. The formula is as follows:

FFj = log10{(Sum of Article Factors) × (Normalization Factor) + 1} × 10

For each journal, the FFa scores are added to obtain the Sum of Article Factors. This sum is then normalized by the Normalization Factor, which is the percentage of articles evaluated by Faculty members compared to all scholarly articles published in the journal according to PubMed. Most bibliometrics indicators normalize for journal size using the number of articles published, but FFj’s normalization is different: going back to our previous bibliometrics analogy, it is similar to multiplying the Impact Factor numerator by the percentage of cited papers rather than dividing it by the number of scholarly papers. This means that FFj’s normalization does not actually account for journal size, but for journal coverage by F1000. For Jane Hunter, this is not a drawback but a benefit:

“Our normalization factor (number of articles selected by F1000/total number of eligible articles) introduces a variable representing journal coverage — or a journal’s F1000 success rate — into our metric. The multiplier accounts for journal size, but it also rewards journals that have had relatively more articles selected by F1000. This is intentional. We want lots of evaluated papers to have a larger positive per-journal effect than a few very highly regarded ones. We believe publishing a lot of good articles is a more reliable indicator of a journal’s value than its ability to publish the occasional megastar.”

The values produced span several orders of magnitude, so a log scale is applied, and the result is then multiplied by 10 to make the final FFj easier to read.
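As a minimal sketch, the formula can be written in Python as follows. It assumes that the Normalization Factor is applied as a fraction (articles evaluated divided by eligible articles), which is how Jane Hunter describes it; the function name and the example figures are invented.

```python
import math


def journal_factor(article_factors, eligible_articles):
    """FFj = log10(sum of FFa x normalization factor + 1) x 10.

    article_factors: one FFa per evaluated article of the journal
    eligible_articles: all eligible articles the journal published
    Assumption: the normalization factor is a fraction between 0 and 1.
    """
    if eligible_articles <= 0:
        raise ValueError("eligible_articles must be positive")
    normalization = len(article_factors) / eligible_articles
    return math.log10(sum(article_factors) * normalization + 1) * 10


# Invented journal: 4 eligible articles, 2 of them evaluated.
example = journal_factor([10, 8], eligible_articles=4)

# A journal with no evaluated articles scores log10(1) x 10 = 0.
unrated = journal_factor([], eligible_articles=4)
```

In the invented example the sum of article factors is 18 and half of the articles were evaluated, so the value inside the logarithm is 18 × 0.5 + 1 = 10 and the FFj is 10.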

Expert Opinion: Ludo Waltman comments

Research Trends spoke to Dr Ludo Waltman, bibliometrics researcher at the Centre for Science and Technology Studies at Leiden University, about the FFj’s calculation:

“It seems that the developers of the F1000 system wanted to reduce the effect a single publication can have on the overall score of a journal. I guess this is why incremental recommendations have less weight than the initial recommendation. I understand this objective of avoiding 'outliers', but I think there are better ways to achieve this. For instance, the distinction between the initial recommendation and incremental recommendations could be abandoned, giving equal weight to all recommendations of the same type (e.g., all exceptional recommendations have a value of 10, including the incremental ones). To avoid outliers, the final score obtained by adding together the scores obtained from all recommendations a publication has received could be transformed, for instance by using a square root or logarithmic function. This would also reduce the effect of a single publication with a lot of recommendations, but it has the advantage that consistency of the measurements is maintained.

I also have some doubts about the normalization factor used in the calculation of the journal indicators. For instance, suppose we have two journals that each have 100 publications, and in each 50 publications have a single exceptional recommendation and 50 publications do not have any recommendation. This yields a journal score of (50 × 10) × (50%) = 250 for each of the two journals. (For simplicity, I skip the logarithmic transformation performed at the end of the calculations.) Suppose that the two journals are now merged. We then have a single journal with 200 publications, half of them with a single exceptional recommendation and half of them without recommendations. So the score of the merged journal becomes (100 × 10) × (50%) = 500. In other words, journals can increase their score by merging. This means that what is measured by the F1000 journal indicator is first of all the size of a journal (in terms of its number of publications).

To obtain a high score, a journal must not only publish high quality articles (i.e., articles that receive recommendations), but it must also publish a large volume of articles. This is different from almost all citation-based journal indicators, such as Impact Factor, SNIP, and SJR (but not Eigenfactor), and most people probably will not be aware of this size-dependence of the F1000 journal indicator.”
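Dr Waltman’s merger example can be checked with a few lines of Python. The sketch skips the logarithmic transformation, as he does, and the per-article average at the end is not part of the F1000 method: it is included only as a hypothetical size-independent contrast.

```python
def pre_log_score(n_recommended, n_total, ffa=10):
    """Score before the log step: sum of FFa x share of articles evaluated."""
    return (n_recommended * ffa) * (n_recommended / n_total)


def per_article_average(n_recommended, n_total, ffa=10):
    """Hypothetical size-independent contrast, not used by F1000."""
    return (n_recommended * ffa) / n_total


single = pre_log_score(50, 100)    # each journal on its own
merged = pre_log_score(100, 200)   # the two journals combined
print(single, merged)              # 250.0 500.0

print(per_article_average(50, 100), per_article_average(100, 200))  # 5.0 5.0
```

Merging doubles the pre-log score even though the proportion and quality of recommended articles are unchanged, whereas a per-article average would stay the same.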

What type of rankings does F1000 compute?

Currently, there are three different journal rankings available:

  1. Current Journal Rankings: computed on the first day of each month, these are the most up-to-date as they include all evaluations over the previous 12 months, regardless of the publication date of the articles. For instance, February 2012 Current Journal Rankings take into account all recommendations made between 1 February 2011 and 31 January 2012.
  2. Provisional Annual Journal Rankings: calculated at the beginning of July, these are based on ratings of articles published in the preceding full calendar year. For instance, 2010 Provisional Annual Journal Rankings take into account evaluations made in 2010 and the first half of 2011 to articles published in 2010; 15 percent of evaluations are received 3 months after an article is published or later: as this adds an extra 3 months for ratings to accumulate, the disadvantage to articles published later in a year is decreased.
  3. Final Annual Journal Rankings: also computed at the beginning of July, these take into account evaluations of articles that were published in the last but one full calendar year, enabling the inclusion of 99 percent of potential evaluations for an article regardless of its publication date within a year. For instance, 2010 Final Annual Journal Rankings take into account evaluations made in 2010, 2011, and the first half of 2012 to articles published in 2010.

How does it compare to traditional bibliometrics indicators?

To see how FFj compares with traditional bibliometrics indicators, Research Trends ran a correlation analysis of 2010 Impact Factors versus 2010 provisional FFj for 768 journals mostly of biomedical scope (see Figure 1), in which the proportion of evaluated papers is denoted by the size of the bubble.

  Figure 1 – comparison of 2010 Impact Factor versus 2010 provisional F1000 Journal Factor. Sources: 2011 Journal Citation Reports (© Thomson Reuters); F1000 2010 journal rankings.

The correlation between the two metrics is rather weak overall (correlation coefficient of 0.54), and unsurprisingly at its weakest where only a small proportion of journal content has been evaluated. Yet this correlation does not systematically increase for journals where a high proportion of content has been reviewed. Some of the most noticeable outliers are also some of the journals with the highest Impact Factors (labeled in Figure 1). The analysis was replicated for EigenFactor (correlation coefficient of 0.55), SJR (correlation coefficient of 0.57), and SNIP (correlation coefficient of 0.51). The results presented similar patterns, indicating that bibliometrics indicators and F1000 journal rankings show a different picture of the research landscape: expert ratings seem to measure an alternative dimension to citations. This may be linked to the skewness of the citation distribution in any given journal.

Jane Hunter was not surprised by the results of the analysis: “We wouldn’t expect F1000’s FFjs to directly correlate with bibliometrics indicators — in fact if they did our rankings would be a lot less interesting […] Our metric is based entirely on positive evaluations of science, paper by paper, by panels of experts who read and select articles based solely on their intrinsic — and subjectively judged — importance. Another basic difference between F1000’s metrics and the Impact Factor is that we exclude reviews […] Because of this, journals like Nature Reviews Drug Discovery […] will rank relatively low on F1000, as will any other journal whose Impact Factor is significantly affected by review articles.”

At article level, though, there are more similarities: indeed, Allen et al. found a “strong positive association between expert assessment and impact as measured by number of citations and F1000 rating”. They, however, acknowledged that “despite the significant positive correlations between assessments of importance and citations overall, at the individual paper level the analysis showed that there are exceptions; papers that were highly rated by expert reviewers were not always the most highly cited, and vice versa. Additionally, what was highly rated by one set of expert reviewers may not be so by another set; only three of the six ‘landmark’ papers identified by our expert reviewers are currently recommended on the F1000 databases.” [3]

Where do we go from here?

Jane Hunter offered some concluding remarks:

“We hope that the F1000 Journal Rankings will offer an alternate way of looking at and evaluating scientific success. The strengths and weaknesses of the various ranking systems may balance each other out and ultimately enable scientists to construct a truer picture of where to publish and what to read […] We know there are many ways in which the data generated by F1000 could be used and viewed. Our Article and Journal Factors represent just one way of crunching the individual article ratings allocated by Faculty Members and interpreting the results. The basic data are completely transparent and available on our site, and we’re happy to consider other approaches. The numbers are the numbers, we think they’re interesting, and we know they have other stories to tell.”

Further analyses are needed to help us understand the reasons behind our findings: in particular, it would be very interesting to see how FFjs relate to the distribution of article ratings for each journal. Doing some preliminary research for the article, Research Trends was actually surprised by the apparent lack of studies on the subject, and would therefore like to open a call for papers to the bibliometrics community: we’d love to see more research on F1000 FFa and/or FFj, in particular about their methodologies, or looking at comparison with other metrics. If you’re up for it and would like to publish in Research Trends, just get in touch!


  1. Davis, P. F1000 Journal Rankings — The Map Is Not the Territory. Scholarly Kitchen blog post
  2. Butler, D. (2011). Experts question rankings of journals. Nature Vol. 478, p. 20. doi:10.1038/478020a
  3. Allen, L., Jones, C., Dolby, K., Lynn, D. & Walport, M. (2009). Looking for landmarks: the role of expert review and bibliometric analysis in evaluating scientific publication outputs. PLoS ONE 4, e5910. doi:10.1371/journal.pone.0005910.


Letter to the Editor

Dear Editor,

The strongest predictor of a journal’s F1000 score is simply the number of article evaluations submitted by F1000 faculty reviewers. Irrespective of their reviewer scores, the number of article evaluations can explain more than 91% of the variation in FFjs (R² = 0.91; R = 0.96). In contrast, the Impact Factor of the journal can only explain 32% of FFj variation (R² = 0.32; R = 0.57).

The ranking of journals based on F1000 scores also reveals a strong bias against larger journals, as well as a bias against journals that have marginal disciplinary overlap with the biosciences.

Larger journals, represented by bigger circles in Figure 1, consistently rank lower than smaller journals receiving the same number of article evaluations. This is most apparent in the “inverted ice-cream cone” shapes in the lower left quadrant of the graph. As I argued previously [1], the method of calculating the F1000 Journal Factor makes it sensitive to enthusiastic reviewers of small journals. This method placed the Journal of Sex and Marital Therapy, which received 12 reviews for its 24 articles in 2010, far above Physical Review Letters, which received just 3 reviews for 3,099 articles.

Phil Davis.
Ithaca NY.


1. PM Davis (2011). F1000 Journal Rankings — The Map Is Not the Territory. The Scholarly Kitchen (Oct 5).


Research Evaluation Metrics: International and Local Perspectives

On October 27th Bar-Ilan University, Israel, hosted a meeting that brought academic and government representatives together to discuss research evaluation metrics and their importance to national-level scientific funding and planning (see picture). The event — organized by The Department of Information Science at Bar-Ilan University (headed by Professor Judit Bar-Ilan), Professor Bluma Pertiz from the Hebrew University of Jerusalem and Elsevier — featured high-profile speakers from different geographical regions, as well as local representatives from both academia and government, and was moderated by university officials led by Professors Pertiz and Bar-Ilan.

The purpose of this event was to foster open discussion and mutual learning between government officials responsible for shaping and funding the local scientific activities, and the researchers in academia whom they evaluate. To meet these aims, the day was designed to provide international and local perspectives on research evaluation measurements and metrics, and to learn from specific case studies how these methodologies have informed scientific policy and funding in different countries. The meeting focused on three major themes: Theoretical Frameworks and Perspectives; Research Policy on a National Level; and Research Evaluation in Practice.

Top row, left to right: Mr. Alessandro Cascino, Dr. Gali Halevi, Mr. David Mino, Mr. Alberto Zigoni, Mr. Neal Katz, Dr. Henk Moed
Bottom row, left to right: Prof. Bluma Pertiz, Dr. Giovanni Abramo, Dr. Daphne Getz, Prof. Judit Bar-Ilan, Prof. Shlomo Herskovic, Dr. Henry Small, Dr. Marc Luwel, Dr. Meir Zadok

The first session explored theoretical frameworks and featured Drs Henry Small, Henk Moed and Professor Bar-Ilan, each of whom looked at different ways of using bibliometric data to evaluate scientific output and scientists, study emerging scientific trends and map the evolution of scientific communities. The general conclusion of this session, which was moderated by Professor Pertiz, was that one must first clearly define the objectives and motivations for evaluation and trending studies — only then can one select the appropriate methodology to carry them out. Once the methodology is agreed upon, bibliometric data must be carefully analyzed and scrutinized before any conclusions regarding productivity and output can be made.

The second session, moderated by Professor Moshe Yitzhaki, focused on research evaluation at a national level. Dr. Marc Luwel described how OECD states developed indicators for performance-based funding for basic research in Belgium. Dr. Meir Zadok addressed the history of the development of strict indicators for productivity and impact that are necessary in Israel’s highly competitive scientific research environment. Finally, Dr. Giovanni Abramo reported on the Observatory on Public Research (ORP) system in which national-scale research assessment is based on individual evaluations. This session provided a snapshot of state-level views on the value of scientific output measurements and of how appropriate methodologies have been developed to address local questions and conditions. Evaluating research at a state level is not an easy task, and certainly not one that a single metric can capture. Decision makers understandably look for a single, straightforward numeric score that offers a simple way to evaluate research and make funding decisions. The lessons from the different paths taken by different government bodies, however, suggest that a successful process must include high-level decisions on what is being measured, and why; a careful choice of datasets; and rigorous analytics that capture the multifaceted aspects of scientific data.

The third session of the day focused on specific case studies that demonstrated how advanced tools have been used in research evaluation in both academic and government institutions, and was moderated by Professor Benjamin Ehrenberg. In the first part of a joint presentation Mr. Shlomo Herskovic described the national database of R&D statistics and indicators that has been established by Israel’s National Council for Research and Development, primarily in conjunction with the Central Bureau of Statistics and the Neaman Institute for National Policy Research, and discussed its advantages and inherent limitations. In the second part of the presentation Dr. Daphne Getz described work done at the Samuel Neaman Institute in developing an infrastructure of data and knowledge to enable an ongoing analysis of Israeli R&D output, expressed by scientific publications and patents. Mr. Neal Katz demonstrated how research evaluation tools, such as the ones included in Elsevier’s SciVal Suite, are being used to support a variety of strategic initiatives in government and academia. The case studies included work carried out for the UK’s Department for Business Innovation & Skills, the Higher Education Funding Council, Tohoku University (Japan) and the National University of Mexico.

The meeting’s mixture of academic and government perspectives opened up opportunities to evaluate and expand on current research evaluation metrics. While the advanced computation tools that now exist, combined with the availability of diverse data types, make measuring research output and reaching funding decisions more complex, they nonetheless offer a more rounded and extensive set of practices to be adopted by funding bodies and policy makers. This event captured the fact that research evaluation methodologies are not only dependent on computational power or on datasets, but also must fit with the country’s overall scientific strengths and approach. Future events such as this will take place around the world in 2012 to encourage discussion between government and academia, and open debate on current and future evaluation metrics that will be appropriate for governmental scientific policies and academic capabilities.


The evolution of brain drain and its measurement: Part I


The origin of the ‘brain drain’

In the years immediately following the end of hostilities in the Second World War, large numbers of highly skilled scientists emigrated from Western Europe to the United States. In the UK, concerns over the ‘loss’ of British researchers began to be raised in the early 1950s, as the weight of anecdotal (and limited direct) evidence began to mount. By the early 1960s the issue had become politicized and the Royal Society was tasked with reporting on the nature and extent of the problem. Their report, ‘Emigration of scientists from the United Kingdom’, was published in 1963 and received much media attention, but it was the Evening Standard newspaper that subsequently coined the term that was to encapsulate the concept: ‘brain drain’1.

 Over time, the concept of brain drain has shifted in meaning and complexity, and is now generally understood to describe the shift of researchers from any country (typically less scientifically developed) to any other (typically more scientifically developed). Brain drain, as fits the negative connotations of the term, was usually considered as a win-lose scenario.

New models, new approaches

In recent years, the theoretical framework surrounding scientific mobility and migration has become sufficiently developed to require the coinage of a new term: brain circulation2. According to this concept, nations are not considered as winners or losers but as loci in a dynamic system of human capital flows. Within this system, countries may accrue benefits to their domestic scientific capacity through diaspora effects (where the knowledge, skills and professional networks established by emigrant researchers while abroad are shared with colleagues at home) and return rates (where emigrant researchers return to their home countries after a period of working abroad, bringing with them the experiences they have gained)3. Such benefits are intangible and as such are difficult to quantify.

 Methodologically, studies of brain circulation have traditionally drawn on census or migration data2, surveys of researchers4,5, CV analysis6,7, or a combination of methods8. However, empirical data showed that brain circulation cannot be modeled as a purely random process, since there are barriers of language, politics, culture and so on that may act to encourage or prevent a given researcher from moving to a given country. Another more recent study offered the interesting approach of using job advertisements posted on the website of a well-known science weekly to measure brain circulation, but showed that selection bias in the advert placements ruled out the broad applicability of this method9.

 With the advent of comprehensive and sophisticated online publication databases that are populated with peer-reviewed articles with complete author affiliation (address) data, new possibilities have opened up for wide-ranging studies of brain circulation. The development of a methodological framework using these databases was pioneered by Dr. Grit Laudel, currently at the University of Twente in the Netherlands. In her 2003 article ‘Studying the brain drain: can bibliometric methods help?’10, she presented the first systematic attempt to use authors’ listed addresses in published articles as a proxy for their location, so allowing tracking of their migration patterns over time. This study presented preliminary results demonstrating a net movement of ‘elite’ researchers to the US from the rest of the world (in a single specialty, angiotensin research).

 Using the same approach, Laudel subsequently expanded her study to demonstrate that while elite migration to the US can be found at the level of individual specialties (such as angiotensin research), the proportion of elite researchers in the US remained almost constant in the period 1980–200211. This finding across all subject fields appears to mask great lower-level variability, as Laudel demonstrates by contrasting the net gain over time of elite researchers by the US in angiotensin research with the relatively steady-state, US-centric elite researcher population of the vibrational spectroscopy community. Migration rates are therefore also likely to vary considerably at lower levels of aggregation than an entire country, such as at region, state, city or institution level.

Designing a novel approach to brain circulation mapping

As part of the report ‘International Comparative Performance of the UK Research Base: 2011’, commissioned by the Department for Business, Innovation and Skills (BIS), a fresh way of looking at researcher mobility was sought. In the report, published in October 2011, the Scopus database was used to produce a conceptual map of the stocks and flows of human capital in the UK over the 15-year period 1996–2010 (results detailed in Part II of this article in the next issue).

In an important departure from previous studies using author affiliation data as a proxy for measuring brain circulation, this work was not confined to authors belonging to an elite or to a single subject or specialty (cf. Refs 12–13). Instead, the approach presented in the report uses Scopus author profile data to derive a history of an author’s affiliations recorded in their publications and to assign them to mobility classes defined by the type and duration of observed moves. There were several conceptual and methodological issues to be resolved before the map could be built:

1. How can we unambiguously assign articles to their authors?

A longstanding problem in researcher mobility studies has been the unambiguous identification of the individual14, as there are common family names in every language and country, and multiple variants of a given person’s name in the published literature. To overcome these problems, Scopus has improved its author-profiling algorithm so that it can identify individual researchers precisely. The Scopus Author Identifier gives each author a separate ID and groups together all the documents written by that author, matching alternate spellings and variations of the author’s last name and distinguishing between authors using a sophisticated algorithm based on data elements associated with the article (such as affiliation, subject area, co-authors and so on).

2. What is a ‘UK researcher’?

Author nationality is not captured in article or author profiling data, and there are serious methodological difficulties in using cultural indicators (such as family names) as a proxy for nationality of birth15. So for this study, authors were assumed to be from the first country from which they have published, or from the country where they published the majority of their articles, when looking at migratory or transitory mobility respectively (see point 4 below). These criteria may, in individual cases, result in authors being assigned to migratory patterns that may not accurately reflect the real situation, but such errors may be assumed to be evenly distributed across the groups and so the overall pattern remains valid. To define the initial population for study, UK authors were identified as those that had listed a UK affiliation on at least one publication (articles, reviews and conference papers) published across the 18,000 journals included in Scopus during the period 1996–2010. This list included about 1.5 million unique authors.
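The two origin rules described above can be sketched as follows. This is a minimal illustration, not the procedure used in the report: the record layout, the function names and the tie-breaking behaviour are all assumptions, and the second function returns the most frequent country (a plurality), which coincides with a strict majority only when one country accounts for more than half of the papers.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical record layout: (publication_year, affiliation_country).
Record = Tuple[int, str]


def first_country(records: List[Record]) -> str:
    """Country of the earliest publication (origin rule for migratory mobility).

    If several papers share the earliest year, the first one listed wins.
    """
    return min(records, key=lambda r: r[0])[1]


def most_frequent_country(records: List[Record]) -> str:
    """Country listed on the most papers (origin rule for transitory mobility)."""
    counts = Counter(country for _, country in records)
    return counts.most_common(1)[0][0]


history = [(1998, "UK"), (1999, "UK"), (2003, "US"), (2004, "US"), (2005, "US")]
print(first_country(history))          # origin under the migratory rule
print(most_frequent_country(history))  # origin under the transitory rule
```

In this made-up history the two rules disagree, which is why the choice of rule depends on the kind of mobility being examined.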

3. What is an ‘active researcher’?

The 1.5 million UK researchers identified include a large proportion of authors with relatively few publications (with UK or non-UK affiliations) over the entire 15-year period of analysis. It was assumed that these authors are unlikely to represent career researchers, and are more likely to be individuals who have left the research system. A productivity filter was therefore put in place to restrict the set to those authors with at least 1 article in the latest 5-year period (2006–2010) and at least 10 articles in the entire 15-year period (1996–2010), or those with fewer than 10 articles in 1996–2010 but more than at least 4 articles in 2006–2010. After applying the productivity filter, a set of 210,923 active UK researchers was defined and formed the basis of the study.
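The productivity filter amounts to a two-branch rule, sketched below. The function and parameter names are hypothetical. The wording of the second branch leaves open whether the recent-output cut-off is inclusive, so the sketch reads it as "at least 4" purely as an assumption and exposes the value as a parameter.

```python
def is_active(total_1996_2010: int, recent_2006_2010: int,
              recent_min: int = 4) -> bool:
    """Return True if an author passes the productivity filter.

    Branch 1: at least 1 article in 2006-2010 and at least 10 in 1996-2010.
    Branch 2: fewer than 10 articles in 1996-2010 but at least `recent_min`
              in 2006-2010 (inclusive cut-off is an assumption).
    """
    established = recent_2006_2010 >= 1 and total_1996_2010 >= 10
    recent_entrant = total_1996_2010 < 10 and recent_2006_2010 >= recent_min
    return established or recent_entrant


# Made-up authors: (articles in 1996-2010, articles in 2006-2010).
for total, recent in [(12, 1), (12, 0), (5, 4), (5, 3)]:
    print(total, recent, is_active(total, recent))
```

The second branch keeps early-career authors who have not yet had time to accumulate 10 articles but are clearly publishing in the most recent period.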

4. How should long- and short-term mobility be defined?

The study of brain circulation is complicated by the difficulties in teasing apart the related phenomena of long-term migration and short-term mobility (such as doctoral research visits, sabbaticals, secondments and so on), which might be deemed a form of collaboration. Defining a time period for a stay abroad above which it should be considered a permanent migration (migratory mobility), and below which it should be deemed a short-term research visit (transitory mobility), is difficult. Drawing on the definition by Crawford et al.16, stays abroad of 2 years or more were considered migratory and were further subdivided into those where the researcher remained abroad and those where they subsequently returned to their original country. Stays abroad of less than 2 years were deemed transitory, and were also further subdivided into those who mostly published under a UK affiliation and those who mostly published under a non-UK affiliation. Researchers without any apparent mobility based on their published affiliations were treated as a separate group.
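The classification above can be sketched in a few lines. This is an illustration under stated assumptions, not the report's implementation: the record layout, the class labels and the way a stay is measured (the inclusive span of years covered by consecutive publications outside the home country) are all hypothetical choices.

```python
from typing import List, Tuple

# Hypothetical record layout: (publication_year, affiliation_country).
Record = Tuple[int, str]


def classify_mobility(records: List[Record], home: str = "UK",
                      threshold_years: int = 2) -> str:
    """Assign an author to a mobility class from their affiliation history."""
    ordered = sorted(records)
    countries = [country for _, country in ordered]
    if all(country == home for country in countries):
        return "no observed mobility"

    # Longest unbroken run of publications abroad, measured in years (assumption).
    longest = 0
    run_start = None
    for year, country in ordered:
        if country != home:
            if run_start is None:
                run_start = year
            longest = max(longest, year - run_start + 1)
        else:
            run_start = None

    if longest >= threshold_years:
        returned = countries[-1] == home
        return "migratory: returned" if returned else "migratory: remained abroad"

    home_share = countries.count(home) / len(countries)
    if home_share >= 0.5:
        return f"transitory: mostly {home}"
    return f"transitory: mostly non-{home}"


print(classify_mobility([(2000, "UK"), (2002, "US"), (2004, "US"), (2006, "UK")]))
print(classify_mobility([(2000, "UK"), (2001, "UK"), (2002, "FR"), (2003, "UK")]))
```

Because affiliations are only observed when an author publishes, any such rule can only approximate the true length of a stay.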

5. What indicators were applied to understand the groups better?

To better understand the composition of each group defined on the map, two aggregate indicators were calculated for each to represent, in a relative sense, the publication productivity and seniority of the researchers they contain. Relative Productivity represents a measure of the articles per year since the first appearance of each researcher as an author during the period 1996–2010, relative to all UK researchers in the same period, while Relative Seniority represents years since the first appearance of each researcher as an author during the period 1996–2010, relative to all UK researchers in the same period. Both Relative Productivity and Relative Seniority are calculated for each author’s entire output in the period (i.e., not just those articles listing a UK address).
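A minimal sketch of the two indicators follows. The text does not state how the group values are normalised, so dividing the group mean by the mean for all UK researchers is an assumption, as are the class and function names and the inclusive counting of years up to 2010.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple

END_YEAR = 2010  # last year of the study window (1996-2010)


@dataclass
class Author:
    first_year: int  # first appearance as an author within 1996-2010
    articles: int    # all articles in the window, regardless of affiliation


def seniority(author: Author) -> int:
    """Years since first appearance, counted inclusively (assumption)."""
    return END_YEAR - author.first_year + 1


def productivity(author: Author) -> float:
    """Articles per year since first appearance."""
    return author.articles / seniority(author)


def relative_indicators(group: List[Author],
                        population: List[Author]) -> Tuple[float, float]:
    """Return (Relative Productivity, Relative Seniority) for a group."""
    rel_prod = (mean(productivity(a) for a in group)
                / mean(productivity(a) for a in population))
    rel_sen = (mean(seniority(a) for a in group)
               / mean(seniority(a) for a in population))
    return rel_prod, rel_sen


# Made-up population of four authors; the group is a subset of it.
population = [Author(2001, 20), Author(2006, 5), Author(1996, 45), Author(2009, 2)]
group = [population[0], population[2]]
rel_prod, rel_sen = relative_indicators(group, population)
print(f"Relative Productivity: {rel_prod:.2f}")
print(f"Relative Seniority:    {rel_sen:.2f}")
```

Under this normalisation a value above 1 means the group is more productive, or more senior, than UK researchers as a whole.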

Part II of this article (to be published in the next issue of Research Trends) will present the brain circulation map of the UK in which these methodological issues have been addressed, and its interpretation.


  1. Balmer, B., Godwin, M. & Gregory, J. (2009). The Royal Society and the ‘brain drain’: natural scientists meet social science. Notes Rec. R. Soc., Vol. 63, pp. 339–353.
  2. Johnson, J.M. & Regets, M.C. (1998). International mobility of scientists and engineers to the United States—brain drain or brain circulation? Issue Brief (National Science Foundation), NSF 98-316.
  3. Ciumasu, I.M. (2010). Turning brain drain into brain networking. Science and Public Policy, Vol. 37, pp. 135–146.
  4. Marceau, J. et al. (2008). Innovation agents: the inter-country mobility of scientists and the growth of knowledge hubs in Asia. Paper presented to the 25th DRUID conference, Copenhagen, June.
  5. Auriol, L. (2010). Careers of doctorate holders: employment and mobility patterns. OECD Science, Technology and Industry Working Papers doi: 10.1787/5kmh8phxvvf5-en
  6. Dietz, J.S. et al. (2000). Using the curriculum vitae to study the career paths of scientists and engineers: an exploratory assessment. Scientometrics, Vol. 49, pp. 419–442.
  7. Cañibano, C. et al. (2008). Measuring and assessing researcher mobility from CV analysis: the case of the Ramón y Cajal programme in Spain. Research Evaluation, Vol. 17, pp. 17–31.
  8. Fontes, M. (2007). Scientific mobility policies: how Portuguese scientists envisage the return home. Science and Public Policy, Vol. 34, pp. 284–298.
  9. Luwel, M. (2005). Job advertisements as an indicator for mobility of researchers: Naturejobs as a case study. Research Evaluation, Vol. 14, pp. 80–92.
  10. Laudel, G. (2003). Studying the brain drain: can bibliometric methods help? Scientometrics, Vol. 57, pp. 215–237.
  11. Laudel, G. (2005). Migration currents among the scientific elite. Minerva, Vol. 43, pp. 377–395.
  12. Ioannidis, J.P.A. (2004). Global estimates of high-level brain drain and deficit. The FASEB Journal, Vol. 18, pp. 936–939.
  13. Hunter, R.S. et al. (2009). The elite brain drain. The Economic Journal, Vol. 119, pp. F231–F251.
  14. Qiu, J. (2008). Scientific publishing: identity crisis. Nature, Vol. 451, pp. 766–767.
  15. Jonkers, K. (2009). Emerging ties: factors underlying China’s co-publication patterns with Western European and North American research systems in three molecular life science subfields. Scientometrics, Vol. 80, pp. 775–795.
  16. Crawford, E., Shinn, T. & Sörlin, S. (1993). The nationalization and denationalization of the sciences: an introductory essay. In: Crawford, E., Shinn, T. & Sörlin, S. (eds.), Denationalizing Science. Dordrecht: Kluwer.

The Power of Scientific Mapping and Visualization: an interview with Prof. Katy Börner


Katy Börner has more professional titles than most, with many appointments across Indiana University in Bloomington. She is the Victor H. Yngve Professor of Information Science at the School of Library and Information Science; Adjunct Professor at the School of Informatics and Computing; Adjunct Professor at the Department of Statistics in the College of Arts and Sciences; Core Faculty of Cognitive Science; Research Affiliate of the Center for Complex Networks and Systems Research and Biocomplexity Institute; Member of the Advanced Visualization Laboratory; Leader of the Information Visualization Lab; and Founding Director of the Cyberinfrastructure for Network Science Center.

Professor Börner is also a curator of the Places & Spaces: Mapping Science exhibit currently on display at Northeastern University in Boston, MA. The exhibit is a collaborative work between Börner and researchers in diverse disciplines including scientometrics, network science, geography, education, and information visualization. Together, they design maps of science which introduce unique visualization tools that capture knowledge and enable deeper understanding of global scientific, environmental and economic trends, among others.

We spoke with Katy Börner in order to better understand the manner by which she develops the different iterations of the exhibit and to gain some insight into the future of visualization in the era of ‘Big Data’ analysis.

How do you establish the collaborations with the different researchers featured in the different maps?

Each year, a call for maps is issued. This year’s Call for Maps, for the 8th Iteration of the Places & Spaces: Mapping Science Exhibit on ‘Science Maps for Kids’ (2012), has been published online. (Note: This call has now closed.) Map makers from many different countries and different areas of science submit individually or in teams. Submissions are carefully reviewed by the advisory board and external reviewers with expertise on the topic of the iteration. In 2012, we will invite children to serve as reviewers: if they cannot understand and make use of a map, then that map will not be on display in the exhibit.

How important is the international aspect of the maps?

Today’s science is global and it has to be studied, mapped, understood, and managed globally. We welcome maps in all languages and from all countries and cultures.

Could you talk us through the various iterations of these maps?

The exhibit is a 10-year effort. Each year, 10 new maps are added resulting in 100 maps total in 2014. Iteration themes are as follows:

What are some of the guidelines that were developed for the iterations?

Each iteration compares and contrasts four existing maps (e.g., early cartographic maps in the first iteration on ‘The Power of Maps’) to six maps of science. Each map has to fit the theme of the respective iteration. Submissions are evaluated in terms of two kinds of values: i) Scientific value, which centers on quality of data collection, analysis and communication of results in support of clearly stated objectives, and whether the map represents appropriate and innovative application of existing algorithms and/or development of new approaches; ii) Value for user groups (e.g., kids), which considers the following questions: what major insight does the map provide and why does it matter? Is the map easy to understand? Does it inspire kids to learn more about science and technology?

Could you say a few words about the new iteration?

The 8th iteration of the Mapping Science exhibit is devoted to science maps that kids aged 5–14 can use to gain a more holistic understanding and appreciation of science and technology. Each map should be engaging and fun to peruse yet should have at least one concrete learning objective. Among others, the maps might depict:

  • A concept map telling a science story;
  • Famous adventures, encounters, or discoveries in science history;
  • Zooms in-out of the world of science;
  • Surprising, scary, wonderful, and exciting scientific activities;
  • Timelines of science and technology development and inventions;
  • Exhibit holdings at different science museums (location, subject matter, or both);
  • A map of school science curricula, projects, or science textbook contents;
  • Career trajectories in science;
  • Science maps drawn by kids analogous to Children Map the World

Maps are intended to give children the exciting opportunity to immerse in, explore, or navigate the landscape of science and to find their own place.

Places & Spaces: Mapping Science is a travelling exhibition. Can any interested party purchase their own maps and display them?

Maps are on display at public libraries, science museums, national academies of science, universities, and companies, and are seen, discussed, admired, and purchased by scholars, practitioners, educators, and others. Maps are available for sale, and proceeds help finance the creation of new iterations and their display at public venues.

How do you think ‘Big Data’ computation will influence the production and usefulness of scientific maps?

The bigger and more complex the datasets, the higher the need for effective data mining and visualization to guide data management, navigation, and utilization. Originally, publication, funding, and patent data were predominantly used in science of science studies. Today, researchers also study science news, job market, and even S&T Twitter data, all in real time. The monitoring, mining, modeling, and visualization of all relevant data streams in many different languages and the widespread distribution of results will require major computational infrastructures.

Is the visualization technology there to deal with big-data issues such as size and complexity?

The total number of scholars, papers, books, patents, and grants in existence today is rather small when compared with datasets generated and mined in medicine, physics, or meteorology. Tools like the Network Workbench (NWB) or Science of Science (Sci2) tool can extract, analyze, and lay out networks with a million nodes. The NIH Map Viewer lets you interactively browse multiple years of funding by the National Institutes of Health. The mapping sustainability project maps seven different paper, patent, and funding datasets in geospatial and topic space.

Do you feel that the explosion in software solutions for network data management and visualization (including Network Workbench) has made life easier or harder for network scientists and in what way?

The Network Workbench was designed for researchers, educators, and practitioners interested in the study of biomedical, social and behavioral science, physics, and other networks and has been downloaded by 110,000 users around the globe. The Science of Science (Sci2) Tool was developed for science policy makers and researchers who study science by scientific means. It supports temporal, geospatial, topical, and network analysis and visualization of scholarly datasets at the micro (individual), meso (local), and macro (global) levels, and is used by many scholars, practitioners, and major agencies such as the National Science Foundation, the National Institutes of Health, the US Department of Agriculture, and the National Oceanic and Atmospheric Administration. While the agencies cannot share their data holdings, they can apply the very same analysis workflow to their internal data and compare results across agency boundaries. Plus, program officers and analysts can use Sci2 to download and analyze relevant data from, for example, Elsevier’s Scopus or Thomson Reuters’ Web of Science databases. Instead of getting a report from a contractor they now have an easy-to-use tool to answer their very own questions.

For more information about the exhibit please contact:

Prof. Katy Börner

School of Library & Information Science
Indiana University, Bloomington

In order to advance the understanding and support education on the power of scientific mapping and visualization, Elsevier has purchased two sets of maps from the Mapping Science Exhibit.

Each set contains nine maps that were selected from different iterations. The sets contain three scientific trends maps, three technology related maps, and three social/environmental maps.

We at Elsevier would like to offer free shipment of the maps to Elsevier’s Research4Life eligible universities and research institutions around the world that are interested in displaying them on a temporary basis. Training and educational seminars will be available to interested parties.

To check your institution eligibility please visit:

For further information about the traveling exhibit please contact:

Gali Halevi, MLS, PhD
Director of Government Segment Marketing
Phone: 646-248-9464


Did you know

Dr Judith Kamalski

…that since November 2011 Jamie Hyneman and Adam Savage, presenters of the science entertainment show Mythbusters, have been the proud owners of honorary doctorates from the University of Twente in the Netherlands1? Jamie and Adam have unique backgrounds in robotics, model-making and special effects, and in each episode of Mythbusters they confirm or ‘bust’ various theories by simply putting them to the test, usually with the aid of spectacular experimental set-ups. Their approach has made science, technology, and investigative principles both fun and cool to millions of viewers worldwide.

The Mythbusters’ experiments are always dramatic, but some episodes really stand out. One of Jamie’s favorites is when they investigated whether a lead balloon could fly, so they built one from a thin sheet of metal and filled it with helium (it did indeed fly)2. Adam mentions the time they wanted to know whether a penny dropped from the Empire State Building could kill a person. The Mythbusters failed to produce any damage with a dropping penny, even when it was fired from a rifle. Visiting the Empire State Building, they realized that updrafts and roofs of lower floors would prevent the penny from ever reaching street level. “I came up with the idea of creating a wind tunnel with a different velocity at the bottom than at the top. A tumbling penny should move up and down in the wind tunnel, and it did. That is the kind of thing I really enjoy.”2

Can bibliometric information help us decide in a more objective way what the best topics on Mythbusters were? A Scopus search for Mythbusters in title, abstract or keywords returns 17 documents. Commonly used words in these article titles are: education, classroom and science. In the list of topics on the Mythbusters’ site, the commonly used words are, unsurprisingly, words such as ‘bust’, ‘test’, ‘know’, but also words such as ‘turkey’, ‘shark’, and ‘government’. If this has sparked your interest, find out more on the Mythbusters’ site!

2. Aldous, P. (2009) Interview: the Mythbusters. The New Scientist. Vol. 203, 9 September, pp. 28–29.
