Issue 29 – July 2012


Merit, expertise and measurement: a new research program at CWTS


The Centre for Science and Technology Studies at Leiden University has developed a new research program focusing on monitoring and analyzing knowledge flows and on research evaluation. The program, which will be published this Fall, introduces new approaches to these well-established goals of scientometric research. With the development of this new program, first, we move from data-centric methods justified by ad-hoc reasoning towards a systematic theory-based framework for developing bibliometric and scientometric indicators. Second, in interpreting and applying performance indicators we increasingly rely on the systematic analysis of current scientific and scholarly practices rather than only on general statistical arguments. Specific attention is paid to the humanities and social sciences because of the variety of their research and publication practices. We also analyze the impact of research assessment exercises, and the performance criteria applied, on the primary process of knowledge production. Third, we explore the possibilities and problems in assessing the societal impact of research (“social quality”). Increasingly, this dimension is becoming the second pillar of research evaluation next to scientific impact and is creating a new challenge for science evaluation and assessment.

To sum up, we maintain the tried and trusted CWTS focus on bibliometrics for research evaluation, but we deepen our theoretical work and increase our empirical scope. Our new research agenda is a response to the widespread use of bibliometrics in performance based research management. We hope it will help prevent abuse of performance measures and thereby contribute to the development of good evaluation practices. We aim to bring scientometrics to a new level of quality in close collaboration with our colleagues in the field. This should also lead to new international standards of quality for assessments and science & technology indicators.

Figure 1 - Paul Wouters at a workshop of the Russian Academy of Sciences in St Petersburg, titled “Career Development in Academia”, 5–6 June 2012.

Research question

How can we improve our understanding of the dynamics of science, technology, and innovation by the measurement and assessment of the scientific and scholarly system, in particular of scientific products, communication processes and scholarly performance? This is the overarching theme of the new research program. In response, two specific research questions are in focus:

  1. How do scientific and scholarly practices interact with the “social technology” of research evaluation and monitoring knowledge systems?
  2. What are the characteristics, possibilities and limitations of advanced metrics and indicators of science, technology and innovation?

Key research themes

The first research theme in the program is the methodology of bibliometrics. Both at CWTS and elsewhere, the development of bibliometric indicators for research assessment has long been done in a pragmatic way. Indicators were developed without explicitly incorporating them in a broader mathematical or statistical framework. Indicators were justified mainly using empirical arguments. This resulted in a data-centric approach where the interpretation of the chosen indicators was developed in an ad-hoc fashion. In the new program we move towards a theory-oriented approach; indicator development will become more and more based on explicit theoretical models of the scientific publication and citation process. In this framework, the indicators will be judged on their mathematical and statistical properties. These models will for instance allow us to distinguish between observable and non-observable features of the publication and citation process (e.g., between the observable concept of citation impact and non-observable concepts such as scientific influence or quality). Model-based indicator development has the advantage of making an explicit distinction between what one intends to measure and what one is in fact measuring. This helps us to study the properties of bibliometric indicators (e.g., validity and reliability or bias and variance) in a more formalized way. The limitations of the indicators should be made explicit as well. For example, a complex concept such as scientific impact cannot be measured by one indicator. This is the reason we have moved from emphasizing one indicator (e.g. “the crown indicator”) towards a portfolio approach to performance indicators.

The new program also pays increasing attention to bibliometric network analysis and science maps. Bibliometric networks are networks of, for instance, publications, journals, researchers, or keywords. Instead of focusing on the properties of individual entities in a network, bibliometric network analysis concentrates on the way in which relations between entities give rise to larger structures, such as clusters of related publications or keywords. In this sense, bibliometric network analysis is closely related to the analysis of complex systems. The main objective of our research into bibliometric network analysis will be to provide content and context for research assessment purposes. Science maps enable us to analyze both the citation impact of a research group and its relationships with other groups. They also enable the analysis of interdisciplinary research without having to rely on predefined subject classifications. An interesting application is the visualization of the actual field profiles of research groups and scientific journals. We can also map the citation networks of journals at all levels of aggregation (see Figure 2).
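
To make this concrete, the sketch below (a minimal Python illustration with invented publication and reference identifiers, not CWTS code) builds a small bibliographic coupling network, in which two papers are linked when they share cited references, and extracts its connected components as clusters:

```python
from itertools import combinations

# Hypothetical publications with their cited references (all identifiers
# invented for illustration).
papers = {
    "p1": {"r1", "r2", "r3"},
    "p2": {"r2", "r3", "r4"},
    "p3": {"r9"},
    "p4": {"r9", "r10"},
}

# Build an undirected coupling graph: edge weight = number of shared references.
edges = {}
for a, b in combinations(papers, 2):
    shared = len(papers[a] & papers[b])
    if shared:
        edges[(a, b)] = shared

# Clusters here are simply the connected components of the coupling graph.
def components(nodes, edges):
    neighbours = {n: set() for n in nodes}
    for (a, b) in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, clusters = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, comp = [n], set()
        while stack:
            m = stack.pop()
            if m not in comp:
                comp.add(m)
                stack.extend(neighbours[m] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters

clusters = components(papers, edges)
```

In practice such clustering is done on much larger networks with dedicated algorithms, but the principle of deriving structure from relations rather than from individual entities is the same.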

Figure 2 - A map of journals based on citation relations. More maps can be found at

The second research theme in the program relates to the way evaluation processes configure the primary process of knowledge creation. The key question is that of the relationship between peer-review-based and indicator-based evaluation. In the past, CWTS has dealt with this tension in a pragmatic way, using indicators to provide useful information to supplement peer review. As explained earlier, we will move towards a more systematic, theory-based approach in which we will probe in much more detail how expertise develops in particular scientific fields in relation to the bibliometric insights of those fields. We will not assume that the two ways of evaluating the quality of scientific and scholarly work are diametrically opposed: this would amount to setting up a straw man. In practice, peer review and bibliometrics are combined in a variety of ways. But how these combinations are developed by both evaluating institutions and the researchers that are being evaluated is not self-evident. Because it is exactly in this interplay that the criteria for scientific quality and impact are developed, we zoom in on this aspect of research evaluation.

Research evaluation may take different forms: annual appraisal interviews, institutional research assessment exercises, and global assessments of national science systems. Evaluation is a more complex interaction than simply the measurement of the performance of the researcher. We see it as a communication process in which both evaluators and the researcher under evaluation define what the proper evaluation criteria and materials should be. Therefore, we are especially interested in the intermediate effects of the process of evaluation on the researcher, evaluator, and on the development of assessment protocols.

Within this theme specific attention is paid to the “constructive” effects of research evaluation (including perverse effects). Evaluation systems inevitably produce (construct) quality and relevance as much as they measure it. This holds both for indicator-based evaluation and for qualitative peer review evaluation systems. Evaluation systems have these effects because they shape the career paths of researchers and because they form the quality and relevance criteria that researchers entertain. These feedback processes also produce strategic behavior amongst researchers which potentially undermines the validity of the evaluation criteria. We therefore focus on how current and new forms of peer review and indicator systems, as main elements of the evaluation process, will define different quality and relevance criteria in research assessment, in the short term as well as the longer term. The recent anxiety about perverse effects of indicators such as the Hirsch-index will also be an important topic in this research theme. This theme will also encompass a research program about the development of scientific and scholarly careers and academic leadership.

Questions regarding the socio-economic and cultural relevance of scientific research form our third research theme. From the perspective of the knowledge-based society, policy makers stress the importance of “knowledge valorisation”. This term is used for the transfer of knowledge from one party to another with the aim of creating (economic and societal) benefits. However, the use of the word is often limited: only describing the transfer of knowledge to the commercial sector. The value in other domains, for example in professional or public domains, is often not taken into account. Also, the term valorisation is often used to describe a one-way-interaction, the dissemination of scientific knowledge to society, while in practice we often observe more mutual, interactive processes.

Within this research theme, we will therefore use the concept of “societal quality” in analyzing the societal impact of research. “Societal quality” is described as the value that is created by connecting research to societal practice and it is based on the notion that knowledge exchange between research and its related professional, public and economic domain strengthens the research involved. This definition encompasses explicitly more than economic value creation only. It also entails research that connects to societal issues and interactions with users in not-for profit sectors such as health and education as well as to the lay public. In the program we focus on the development of robust data sets, as well as the analysis of these datasets, in the context of specific pioneering projects in which the interaction between research and society can be well defined. This will create the possibility to construct, measure, and test potential indicators of societal impact.


The Integrated Impact Indicator (I3), the top-10% Excellence Indicator, and the use of non-parametric statistics


Competitions generate skewed distributions. For example, a few papers are highly cited, but the majority are cited rarely or not at all. The skewness in bibliometric distributions is reinforced by mechanisms which have variously been called “the Matthew effect” (1), “cumulative advantages” (2) and “preferential attachment” (3). These mechanisms describe the “rich get richer” phenomenon in science. Skewed distributions should not be studied in terms of central tendency statistics such as arithmetic means (4). Instead, one can use non-parametric statistics, such as the top-1%, top-10%, etc.

Figure 1 - Citation distributions for Nature Nanotechnology (N = 199 publications) and Nano Letters (N = 1,506). Source: (5).

In Figure 1, for example, the 2009 citation distributions of citable items in 2007 and 2008 in two journals from the field of nanotechnology (Nano Letters and Nature Nanotechnology) are compared using a logarithmic scale. The Impact Factor (IF) 2009 of the latter journal is almost three times as high as that of the former because the IF is a two-year average. Using the number of publications in the previous two years (N) in the respective denominators erroneously suggests that Nano Letters had less impact than Nature Nanotechnology. If one instead considers the citation distributions in terms of six classes (top-1%, top-5%, etc.; see Figure 2), Nano Letters outperforms Nature Nanotechnology in all classes.

Figure 2: Frequency distribution of six percentile rank classes of publications in Nano Letters and Nature Nanotechnology, with reference to the 58 journals of the WoS Subject Category “nanoscience & nanotechnology.” Source: (5).

These six classes have been used by the US National Science Board (e.g., 6) for the Science and Engineering Indicators for a decade. By attributing a weight of six to each paper in the first class (top-1%) and five to each paper in the second class, etc., the stepwise function of six so-called “percentile-rank classes” (PR6) in Figure 2 can be integrated using the following formula: I3 = Σi xi · f(xi). In this formula, x represents the percentile rank value and f(x) the frequency of this rank. For example, i = 6 in the case above, or i = 100 when using 100 equal classes such as top-1%, top-2%, etc.
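
The weighted sum over the six classes can be sketched as follows (the class counts are invented for illustration):

```python
# Weights for the six NSF percentile rank classes, from the top-1% class
# (weight 6) down to the bottom-50% class (weight 1).
PR6_WEIGHTS = [6, 5, 4, 3, 2, 1]

def pr6(class_counts):
    """Sum of x * f(x): each paper contributes the weight of its class.

    class_counts[i] is the number of papers in class i (i = 0 is top-1%).
    """
    return sum(w * f for w, f in zip(PR6_WEIGHTS, class_counts))

# Hypothetical counts for a journal: 2 papers in the top-1%, 5 in the
# next class, and so on down to 100 papers in the bottom half.
example = [2, 5, 10, 30, 50, 100]
score = pr6(example)  # 2*6 + 5*5 + 10*4 + 30*3 + 50*2 + 100*1 = 367
```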

Measuring “integrated impact” with I3 and/or PR6

Under the influence of impact factors, scientometricians have confused impact with average impact: a research team as a group has more impact than one leading researcher, but the leading researcher him/herself can be expected to have more average impact, that is, more citations per publication (c/p). Existing bibliometric indicators such as the IF and SNIP are based on central tendency statistics, with the exception of the excellence indicator of the top-10% most highly cited papers, which is increasingly used in university rankings (7,8; cf. 9,10). An excellence indicator can be considered as the specification of two classes: excellent papers are counted as ones and the others as zeros.

Leydesdorff & Bornmann called this scheme of percentile-based indicators I3 as an abbreviation of “integrated impact indicator” (11). I3 is extremely flexible because one can sum across journals and/or across nations by changing the systems of reference. Unlike using the arithmetic mean as a parameter, the percentile-normalized citation ranks can be tested using non-parametric statistics such as chi-square or Kruskal-Wallis because an expectation can also be specified. In the case of a hundred percentile rank classes, 50 is the expectation, but because of the non-linearity involved this expectation is 1.91 for the six classes used above (12). Various tests allow for comparing the resulting proportions with the expectation in terms of their statistical significance (e.g., 7,13).
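
The expectation of 1.91 follows directly from the widths of the six classes; a short sketch verifying it:

```python
# Under the null model, a paper falls into each of the six classes with a
# probability given by the class width: top-1%, the next 4% (to top-5%),
# the next 5% (to top-10%), the next 15% (to top-25%), the next 25%
# (to top-50%), and the bottom 50%.
widths = [0.01, 0.04, 0.05, 0.15, 0.25, 0.50]
weights = [6, 5, 4, 3, 2, 1]

# Expected class weight of a randomly drawn paper:
# 6*0.01 + 5*0.04 + 4*0.05 + 3*0.15 + 2*0.25 + 1*0.50 = 1.91
expected_weight = sum(w * p for w, p in zip(weights, widths))
```

An observed distribution of papers over the six classes can then be compared against these expected proportions with a chi-square test.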

Figure 3 - Citation distributions and percentile ranks for 23 publications of PI 1 and 65 publications of PI 2, respectively. Source: (14).

The outcome of evaluations using non-parametric statistics can be very different from using averages. Figure 3, for example, shows citation profiles of two Principal Investigators (PIs) of the Academic Medical Center of the University of Amsterdam (using the journals in which these authors published as the reference sets). In this academic hospital the averaged c/p ratios are used in a model to allocate funding, raising the stakes for methods of assessing impact and inciting the researchers to question the exactness of the evaluation (15). The average impact (c/p ratio) of PI1, for example, is 70.96, but it is only 24.28 for PI2; the PR6 values as a measure of integrated impact, however, show a reverse ranking: 65 and 122, respectively (14). This difference is statistically significant.
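
A toy example (the citation counts are invented, not the data of Figure 3) illustrates how average impact and integrated impact can rank two researchers in opposite order:

```python
# Hypothetical oeuvres: a PI with a few highly cited papers versus a PI
# with a larger body of moderately cited papers.
pi_a = [120, 95, 80]                                   # citations per paper
pi_b = [40, 38, 36, 36, 35, 34, 34, 33, 32, 32]

mean_a = sum(pi_a) / len(pi_a)   # highest average impact (c/p ratio)
mean_b = sum(pi_b) / len(pi_b)

# Integrated impact as a simple sum over the whole oeuvre: the larger
# oeuvre wins despite its lower mean.
total_a = sum(pi_a)              # 295
total_b = sum(pi_b)              # 350
```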

I3 quantifies the skewed citation curves by normalizing the documents first in terms of percentiles (or the continuous equivalent: quantiles). The scheme used for the evaluation can be considered as the specification of an aggregation rule for the binning and weighting of these citation impacts; for example as above, in terms of six percentile rank classes. However, policy makers may also wish to consider quartiles or the top-10% as in the case of an excellence indicator. Bornmann & Leydesdorff, for example, used top-10% rates for showing cities with research excellence as overlays to Google Maps using green circles for cities ranked statistically significantly above and red circles for ones below expectation (9).

Conclusions and implications

The use of quantiles and percentile rank classes improves impact measurement when compared with using averages. First, one appreciates the skewness of the distribution. Second, the confusion between impact and average impact can be resolved: averages over skewed distributions are not informative and the error can be large. Using I3 with 100 percentiles, a paper in the 39th percentile can be counted as half the value of one in the 78th percentile. Using PR6, alternatively, the latter falls into a higher rank class, and therefore receives a higher weight, than the former. Third, the use of I3 allows for the choice of normative evaluation schemes such as the six percentile ranks used by the NSF or the excellence indicator of the top-10%. Fourth, institutional and document-based evaluations (such as journal evaluations) can be brought into an encompassing framework (5). Finally, these indicators are well suited for significance testing, so that one can also assess whether “excellent” can be distinguished from “good” research, and indicate error bars. Different publication and citation profiles (such as between PI1 and PI2 in Figure 3) can thus be compared and their uncertainty specified.

Loet Leydesdorff* & Lutz Bornmann**

*Amsterdam School of Communication Research, University of Amsterdam, Kloveniersburgwal 48, NL-1012 CX, Amsterdam, The Netherlands;
**Division for Science and Innovation Studies, Administrative Headquarters of the Max Planck Society, Hofgartenstr. 8, D-80539 Munich, Germany;


1. Merton, R. K. (1968) “The Matthew Effect in Science”, Science, 159, 56-63.
2. Price, D. S. (1976) “A general theory of bibliometric and other cumulative advantage processes”, Journal of the American Society for Information Science, 27(5), 292-306.
3. Barabási, A.-L. (2002) Linked: The New Science of Networks. Cambridge, MA: Perseus Publishing.
4. Seglen, P. O. (1992). The Skewness of Science. Journal of the American Society for Information Science, 43(9), 628-638.
5. Leydesdorff, L. (in press) “An Evaluation of Impacts in ‘Nanoscience & Nanotechnology’: Steps towards standards for citation analysis”, Scientometrics.
6. National Science Board (2012) Science and Engineering Indicators. Washington DC: National Science Foundation.
7. Bornmann, L., de Moya-Anegón, F., & Leydesdorff, L. (2012) “The new excellence indicator in the World Report of the SCImago Institutions Rankings 2011”, Journal of Informetrics, 6(3), 333-335.
8. Leydesdorff, L., & Bornmann, L. (in press) “Testing Differences Statistically with the Leiden Ranking”,  Scientometrics.
9. Bornmann, L., & Leydesdorff, L. (2011) “Which cities produce excellent papers worldwide more than can be expected? A new mapping approach—using Google Maps—based on statistical significance testing”, Journal of the American Society for Information Science and Technology, 62(10), 1954-1962.
10. Waltman, L., Calero-Medina, C., Kosten, J., Noyons, E. C. M., Tijssen, R. J. W., van Eck, N. J., . . . Wouters, P. (2012). The Leiden Ranking 2011/2012: data collection, indicators, and interpretation.
11. Leydesdorff, L., & Bornmann, L. (2011) “Integrated Impact Indicators (I3) compared with Impact Factors (IFs): An alternative design with policy implications”, Journal of the American Society for Information Science and Technology, 62(11), 2133-2146. doi: 10.1002/asi.21609
12. Bornmann, L., & Mutz, R. (2011) “Further steps towards an ideal method of measuring citation performance: The avoidance of citation (ratio) averages in field-normalization”, Journal of Informetrics, 5(1), 228-230.
13. Leydesdorff, L., Bornmann, L., Mutz, R., & Opthof, T. (2011) “Turning the tables in citation analysis one more time: Principles for comparing sets of documents”, Journal of the American Society for Information Science and Technology, 62(7), 1370-1381.
14. Wagner, C. S., & Leydesdorff, L. (2012, in press). An Integrated Impact Indicator (I3): A New Definition of “Impact” with Policy Relevance. Research Evaluation.
15. Opthof, T. and L. Leydesdorff (2010) "Caveats for the journal and field normalizations in the CWTS (“Leiden”) evaluations of research performance", Journal of Informetrics 4(3), 423-430.

Bibliometrics and Urban Research, part II: Mapping author affiliations


The previous issue of Research Trends presented a preliminary keyword analysis of urban research, in which three branches of the overall discipline are defined and contrasted. The analysis shows that not only do researchers in these three areas discuss different elements of urban studies, they also tend to be based in different countries. Together these findings suggest a “limited integration of research efforts undertaken by those who work explicitly in urban studies, social scientists who work in cities, and scientists who are concerned with the environmental impacts of urban development.” (1,2)

As well as looking at the countries that authors come from, it is also possible to look at author distributions in finer detail: rather than assigning all authors with a UK affiliation to the nation as a whole, we can view the specific locations of each affiliation on a map (and only group together those that are actually in the same place). The methods used to map author affiliations from the Scopus database are set out by Bornmann et al. (3), and here we follow their process to show author distributions in the three branches of urban research: Sciences, Social Sciences and Urban Studies.

The affiliation plot

There are certain differences when working with full author affiliations rather than country data alone. First, papers can be assigned to multiple locations within a country: for example, a paper co-authored by researchers from institutes in Lille and Paris is shown at both locations, rather than as a single paper for France. Second, distributions within a country can be seen: for example, the capital city might be host to all of the active researchers in a country, or they could be spread across the country. Third, you can make direct comparisons between cities or institutes to see which published the most.
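
The location-level counting can be sketched as follows (the city names and paper records are invented for illustration):

```python
from collections import Counter

# Hypothetical affiliation records: each paper lists the cities of its
# authors' institutes.
papers = [
    {"id": 1, "cities": ["Lille", "Paris"]},   # counted at both locations
    {"id": 2, "cities": ["Paris"]},
    {"id": 3, "cities": ["Amsterdam"]},
]

# Country-level counting would credit France with two papers; location-level
# counting shows the distribution within the country instead.
per_city = Counter()
for p in papers:
    for city in set(p["cities"]):   # one credit per location, not per author
        per_city[city] += 1
```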

The first grouping of urban research consists of relevant papers within a set of 38 journals assigned to the Thomson-Reuters urban studies cluster. We have seen that papers come mainly from the US, the UK, Australia, Canada and the Netherlands; but there is a long list beyond the top five, and it quickly becomes difficult to retain a sense of all the countries. Plotting the locations on a map immediately shows the distribution of authors and the quantities from different regions of the world (see Figure 1a).

Figure 1a - Distribution of urban studies authors in 2010. Following the method described by Bornmann et al. (3), circles are sized and colored according to the number of papers originating from each location. Data source: Scopus

Large countries such as the US, Australia and China benefit particularly from such a map, as institutes across the country can be located and compared. In China’s case, there are multiple papers from Beijing, Shanghai, Wuhan, Nanjing, Guangzhou, as well as Hong Kong.

The map also allows you to see the overall distribution at a single glance, including both the strong contributions in Europe and the US and the single papers from Argentina, Ghana, Nigeria, Ethiopia, Saudi Arabia, Pakistan, and Indonesia, among others.

We can also examine the same search over a number of years to see whether the distribution of authors changes over time. Figure 1b shows the publication years 2006 to 2010: while the smaller contributors appear and disappear each year, the larger locations remain fairly steady, and the concentration of authors in the US and Europe appears no weaker in 2010 than previous years.

Figure 1b - Distribution of urban studies authors in the years 2006 to 2010. Following the method described by Bornmann et al. (3), circles are sized and colored according to the number of papers originating from each location. Data source: Scopus

In the map of 2010 author affiliations 389 locations are marked, accounting for the 643 articles and reviews published. Each location therefore accounts for 1.65 papers on average; this represents a slight increase from previous years, when locations have on average accounted for 1.46 to 1.60 papers (see Table 1).

Publication year Locations Papers Papers per location
2006 344 529 1.538
2007 347 553 1.594
2008 335 490 1.463
2009 371 553 1.491
2010 389 643 1.653

Table 1 - The number of locations (in author affiliations) for each year, and the number of papers published in each year in the urban studies grouping. Source: Scopus
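
The last column of Table 1 can be reproduced directly from the location and paper counts:

```python
# Figures from Table 1 (urban studies grouping, Scopus): year -> (locations, papers).
table1 = {
    2006: (344, 529),
    2007: (347, 553),
    2008: (335, 490),
    2009: (371, 553),
    2010: (389, 643),
}

# Papers per location, rounded to three decimals as in the table.
papers_per_location = {
    year: round(papers / locations, 3)
    for year, (locations, papers) in table1.items()
}
```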

From one discipline to another

The other two branches of urban research are those published in Social Science and in Science journals, respectively. These can be compared using the same approach as above, but here we alter it to look only at the authors of the top-cited papers in each discipline. Because the analysis includes both articles and reviews, and these document types have different expected numbers of citations, we rank the articles and reviews separately and take the top 10% of each according to citations. This allows us to map the distribution of the authors of the highest-impact articles and reviews together. Figure 2 shows the resulting distributions in the Social Sciences and Science clusters, plotted in different colors. Differences are apparent through a comparison of red (Social Science) and cyan (Science) authors. Some regions, such as South Africa and Australia, have more prominence in the Social Sciences; others, such as continental Europe, show a greater presence in the Sciences.
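
The separate ranking of articles and reviews can be sketched as follows (the records and citation counts are invented):

```python
# Hypothetical records: 100 articles and 20 reviews with citation counts.
records = [
    {"type": "article", "citations": c} for c in range(100)
] + [
    {"type": "review", "citations": c * 3} for c in range(20)
]

def top_decile(records, doc_type):
    """Rank one document type by citations and keep its top 10%."""
    subset = sorted(
        (r for r in records if r["type"] == doc_type),
        key=lambda r: r["citations"],
        reverse=True,
    )
    k = max(1, len(subset) // 10)   # top 10%, at least one paper
    return subset[:k]

# Highly cited set: top 10% of articles plus top 10% of reviews,
# each ranked against its own document type.
highly_cited = top_decile(records, "article") + top_decile(records, "review")
```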

Figure 2 - Distribution of highly-cited Social Science (red) and Science (cyan) urban research authors in 2010. Where authors in the different disciplines are from the same location, this is shown by a darker red or darker cyan than where there is no overlap. Data source: Scopus

The maps of author affiliations show a finer level of detail than any aggregated country data can provide; and they allow for much more immediate interpretation of the affiliation data. We looked at the distributions of authors — whether including all authors, or only highly-cited authors — in the three identified branches of urban research.

There are two elements that may improve this approach further. The first is to include impact data more directly in the mapping process. The second would be to look at collaboration: here papers are duplicated for each affiliation, and there is no sense of the partnerships that go into their creation; a comparison of the collaborative trends in the various urban research clusters would add even deeper insight into their nature.


1. Kirby, A., & Kamalski, J. (2012) “Bibliometrics and Urban Research”, Research Trends, No. 28.
2. Kamalski, J., & Kirby, A. (2012, in press) “Bibliometrics and urban knowledge transfer”, Cities.
3. Bornmann, L. et al. (2011) “Mapping excellence in the geography of science: An approach based on Scopus data”, Journal of Informetrics, Vol. 5, No. 4, pp. 537–546.

Identifying emerging research topics in Wind Energy research using author given keywords


The value of well-constructed thesauri as a means of effective searching and structuring of information is something a seasoned searcher is very familiar with. Thesauri are useful for numerous information management objectives such as grouping, defining and linking terms, and identifying synonyms and near-synonyms as well as broader and narrower terms. Searches based on thesaurus terms are considered better in terms of both recall and precision (1,2,3).

Yet the construction of a comprehensive thesaurus is a laborious task which often requires the intervention of an indexer who is expert in the subject. Terms incorporated in a thesaurus are selected carefully and examined for their capability to describe content accurately while keeping the integrity of the thesaurus as a whole. Terms incorporated in a thesaurus are referred to as controlled vocabulary or controlled terms. Uncontrolled vocabulary, on the other hand, consists of freely assigned keywords which authors use to describe their work. These terms can usually be found as part of an abstract, and appear in most databases as “author keywords” or “uncontrolled vocabularies”. In today’s fast-moving world of science, where new discoveries and technologies develop rapidly, the pace at which thesauri capture new areas of research may be questioned, and so the value of using author keywords to retrieve new, domain-specific research should be examined.

This study sought to examine the ways in which thesaurus keywords and author keywords manage to capture new and emerging research in the field of “Wind Energy”. The research questions were as follows:

  1. Do author keywords include new terms that are not found in a thesaurus?
  2. Can new areas of research be identified through author keywords?
  3. Is there a time lapse between the appearance of a keyword assigned by an author and its appearance in a thesaurus?


In order to answer these questions we analyzed the controlled and uncontrolled terms of 4,000 articles grouped under the main heading “Wind Power” in Compendex, published between 2005 and 2012. Compendex is a comprehensive bibliographic database of scientific and technical engineering research, covering all engineering disciplines. It includes millions of bibliographic citations and abstracts from thousands of engineering journals and conference proceedings. When combined with the Engineering Index Backfile (1884–1969), Compendex covers well over 120 years of core engineering literature.

In each Compendex record, a list of controlled and uncontrolled terms is given and can be searched on. Over 17,000 terms were extracted from the Compendex records and sorted by frequency. Two separate files were created: one depicting all the controlled terms and the second depicting the author-given keywords (i.e., uncontrolled terms). For each term, the number of times it appeared in each year from 2005 to 2012 and the total number of articles in which it appeared were recorded. In addition, a simple trend analysis compared the average number of times each term appeared in papers published during the years 2009–2012 with the same measure calculated for 2005–2008. This trend analysis allowed for a view of terms whose usage has increased in recent years, compared to the overall time period.
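
The trend comparison can be sketched as follows (the yearly counts for the term are invented):

```python
# Hypothetical yearly occurrence counts for one author keyword.
counts = {2005: 4, 2006: 6, 2007: 5, 2008: 9,
          2009: 15, 2010: 22, 2011: 31, 2012: 18}

def trend(counts):
    """Ratio of the 2009-2012 yearly average to the 2005-2008 yearly average."""
    early = [counts[y] for y in range(2005, 2009)]
    late = [counts[y] for y in range(2009, 2013)]
    return (sum(late) / len(late)) / (sum(early) / len(early))

growth = trend(counts)   # > 1 means the term is gaining ground
```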

To answer the research questions, the following steps were taken:

  1. All author keywords that appear 100 times or more were collected.
  2. The author keywords were searched in the Compendex Thesaurus: if an author keyword appeared, the year in which it was introduced was recorded.
  3. The author keyword was then searched for in Compendex across all years and the year in which it first appeared was recorded.
  4. The author keywords that appeared more than 100 times were grouped into themes. In addition, these author keywords were searched for in Compendex in order to identify their corresponding articles and the topics they cover.
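The selection and lag computation in steps 1–3 can be sketched as follows. This is a hypothetical illustration of the procedure, not the study's actual code; the third record is invented to show a keyword that did eventually enter the thesaurus.

```python
def frequent_keywords(counts, threshold=100):
    """Step 1: keep only author keywords appearing `threshold` times or more."""
    return {term: n for term, n in counts.items() if n >= threshold}

def keyword_lag(first_author_year, thesaurus_year):
    """Step 3: years between a keyword's first use by authors and its
    introduction into the thesaurus; None if it never entered."""
    if thesaurus_year is None:
        return None
    return thesaurus_year - first_author_year

# Records as (keyword, first author use, thesaurus introduction year).
# The first two reflect the article's findings; the third is hypothetical.
records = [
    ("wind farms", 1985, None),          # never entered the thesaurus
    ("offshore wind farms", 1993, None),
    ("hypothetical term", 1990, 2004),   # a 14-year lag
]
for keyword, first_use, thesaurus_year in records:
    print(keyword, keyword_lag(first_use, thesaurus_year))
```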


Table 1 shows the most frequently recurring uncontrolled terms, categorized into four groups as follows:

Topic Group       Uncontrolled terms
Environment       Renewable energies; Renewable energy source; Wind energy; Wind speed; Wind resources
Mechanics         Doubly-fed induction generator; Offshore wind farms; Permanent magnet; Synchronous generator; Wind farm(s); Wind turbine generators; Wind generators; Wind generation; Wind energy conversion system
Integration       Control strategies; Power grids; Power output
Computerization   Simulation result

Table 1 - Most recurring uncontrolled terms in the retrieved articles. Source: Engineering Village

Looking at the corresponding literature within Compendex, three main topics emerged from the author keywords, indicating specialized areas of research within the overall ‘wind power’ main heading. These terms did not appear in the Compendex thesaurus.

Wind Farms: This term first appeared as an uncontrolled term (i.e. author keyword) in 1985, in an article by NASA researchers (4). The term refers to large areas of land on which wind turbines are grouped. Examples of such wind farms are the Alta Wind Energy Center (AWEC), located in the Tehachapi-Mojave Wind Resource Area in the USA, and the Dabancheng Wind Farm in China. Research under this keyword covers a wide variety of topics, ranging from agriculture and turbine mechanics to effects on the atmosphere and power grid integration. The term has shown substantial growth in use as an author keyword between 2006 and 2012, with a peak of 757 articles in 2011 (see Figure 1).

In the thesaurus, however, this term is included under “Farm buildings”, which also covers livestock buildings and other structures found on farms.

Figure 1 - Use of keyword Wind Farm by authors. Source: Engineering Village

Offshore wind farms: This term first appeared as an uncontrolled term in 1993 (5) and refers to the construction of wind farms in deep waters. Examples of such wind farms include Lillgrund Wind Farm in Sweden and Walney in the UK. In the thesaurus, articles with this keyword are assigned to the term “Ocean structures”, which of course also covers other structures such as ocean drilling, gas pipelines and oil wells. The use of this term has been growing steadily (see Figure 2), with a substantial increase between 2008 and 2011.

Figure 2 - Use of keyword Offshore Wind Farms by authors. Source: Engineering Village

Most surprising, however, is the fact that the term Wind energy itself does not appear in the thesaurus at all. The topic as a whole appears under “Wind Power”, which also applies to damage caused by wind, wind turbulence, wind speed and so forth. The term has been used by authors since 1976, first appearing in an article by the UK Government’s Department of the Environment, Building Research Establishment (6), and has seen constant growth between 2006 and 2012 (see Figure 3).

Figure 3 - Use of keyword Wind Energy by authors. Source: Engineering Village

Other emerging topics include: the integration of wind energy into power grids, the effects of wind farms on the atmosphere, and computer simulations and control software for wind farms and turbines. In addition, comparing the most common uncontrolled and controlled terms reveals apparent differences in focus as they emerge from the vocabulary. While the uncontrolled vocabulary highlights wind speed and wind farms, the controlled vocabulary features Wind power, Electric utilities, and Turbomachine blades. This could be because the Compendex thesaurus is engineering-focused, giving the mechanics of wind power conversion the most prominent descriptors. In this case the author-given keywords are valuable, as they provide a supplementary view on these topics by depicting the environmental aspects of these research articles. Table 2 illustrates the different foci of the keywords.

Uncontrolled terms Controlled terms
Wind speed (43 papers, 10%) Wind power (172 papers, 41%)
Wind farm (37, 9%) Wind turbines (171, 41%)
Wind farms (22, 5%) Computer simulation (74, 18%)
Wind turbine blades (17, 4%) Mathematical models (73, 18%)
Fatigue loads (12, 3%) Aerodynamics (72, 17%)
Wind energy (12, 3%) Electric utilities (63, 15%)
Wind turbine wakes (12, 3%) Turbomachine blades (58, 14%)
Control strategies (11, 3%) Wind effects (49, 12%)
Offshore wind farms (11, 3%) Rotors (48, 12%)
Power systems (11, 3%) Wakes (45, 11%)

Table 2 - Most common controlled and uncontrolled terms on search. Source: Engineering Village


Wind energy is by no means a new area of exploration, yet in the past four to five years this area has seen considerable growth in research output, especially in wind turbine technology and wind harvesting. Although the data sample analyzed is small and covers only one subject field, our findings illustrate that author keywords may indeed include new terms that are not found in a thesaurus. The use of thesaurus terms is usually recommended as part of a precision strategy in searching; yet in our case the controlled terms have a more general scope. Table 3 below summarizes some of our major conclusions as they pertain to the properties of author-given keywords and controlled terms in the search process. Our findings show that using author-given keywords as a search strategy is beneficial when one searches for more specific technologies and applications, or for new research areas within the overall topic (see Table 3).

Criterion         Favors         Notes
Recall            Controlled     Controlled terms retrieve a larger number of articles, since articles are lumped under broader descriptors.
Precision         Uncontrolled   Uncontrolled terms are very specific and enable the retrieval of detailed topics.
Discoverability   Both           Uncontrolled terms enable the discovery of new topics and can serve as indicators of the latest discoveries in the field; controlled terms enable the clustering of such topics, connecting larger numbers of articles and topics.
Serendipity       Controlled     Controlled terms are broader, retrieving a larger number of articles and enabling serendipity through browsing.
State of the art  Uncontrolled   Uncontrolled terms depict the latest descriptors of methods, applications and processes in a given topic.

Table 3 - Evaluation of the impact of controlled and uncontrolled terms on search.
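The recall/precision trade-off summarized in Table 3 follows the standard set-based definitions, which can be illustrated briefly. The document sets and counts below are invented for illustration and do not come from the study's data.

```python
def precision_recall(retrieved, relevant):
    """Standard measures for one query: `retrieved` and `relevant`
    are sets of document identifiers."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Invented example: 20 relevant papers; a broad controlled-term search
# retrieves 100 documents, a narrow author-keyword search retrieves 10.
relevant = set(range(20))
broad = set(range(100))    # controlled term: full recall, low precision
narrow = set(range(10))    # author keyword: full precision, partial recall
print(precision_recall(broad, relevant))   # → (0.2, 1.0)
print(precision_recall(narrow, relevant))  # → (1.0, 0.5)
```

The numbers mirror the table's qualitative claim: broader controlled descriptors favor recall and serendipity, while specific author keywords favor precision.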

Our analysis showed, for example, that strongly emerging areas identified in our sample are wind farms and offshore wind farms. These terms, although appearing in the author-given keywords for over 20 years, have not entered the Compendex thesaurus. This could be because the Compendex database is engineering-focused and built to serve engineers, and therefore groups these articles under terms that are mechanical in nature. However, this might hinder a broader understanding of the topics in context.

In this case, using the thesaurus as a basis for searching for Wind Energy articles would create broader result sets. Depending on the purpose of the search, this could be viewed as a positive or a negative outcome. Our analysis shows that the two types of terms have different properties and serve different purposes in the search process. In the analysis of emerging topics, author-given keywords are useful tools, as they enable one to specify a topic in a way that seems difficult when using only terms from a controlled thesaurus.


1. Sihvonen, A. & Vakkari, P. (2004) “Subject knowledge, thesaurus-assisted query expansion and search success”, Proceedings of the RIAO 2004 Conference, pp. 393-404.
2. Sihvonen, A. & Vakkari, P. (2004) “Subject knowledge improves interactive query expansion assisted by a thesaurus”, Journal of Documentation, 60(6), 673-690.
3. Shiri, A.A., Revie, C. & Chowdhury, G. (2002) “Thesaurus-enhanced search interfaces”, Journal of Information Science, 28(2), 111-122.
4. Neustadter, H.E. & Spera, D.A. (1985) “Method for Evaluating Wind Turbine Wake Effects on Wind Farm Performance”, Journal of Solar Energy Engineering, Transactions of the ASME, 107(3), 240-243.
5. Olsen, F. & Dyre, K. (1993) “Vindeby off-shore wind farm - construction and operation”, Wind Engineering, 17(3), 120-128.
6. Rayment, R. (1976) “Wind Energy in the UK”, Building Services Engineer, (44), 63-69.

Research Evaluation in Practice: Interview with Linda Butler


Linda Butler

During your career, you have taken part in government-driven research projects using bibliometrics methodologies. Could you give an example or two of the outcomes of these research projects and the way they informed scientific funding?

The most influential body of research I have undertaken relates to analyses of the way Australian academics responded to the introduction of a sector-wide funding scheme that distributes research funding to universities on the basis of a very blunt formula.  The formula is based on data on research students, success in obtaining competitive grant income, and the number of research outputs produced.  For research outputs, a simple count is used.  It does not matter where a publication appeared – the rewards are the same.  By looking in detail at the higher education sector, and after eliminating other possible causal factors, I was able to demonstrate that the introduction of the formula led to Australian academics significantly increasing their productivity above long-term trend lines.  While the increase was welcome, of major concern to policy makers were the findings that the increase in output was particularly high in lower impact journals, and that Australia’s relative citation impact had fallen below that of a number of its traditional OECD comparators.

These findings were part, though not all, of the driver for Australia to introduce a new funding system for research.  The same blunt formula is still being used, but it is anticipated that much of the funding it distributes will before long be based on the results of the Excellence in Research for Australia (ERA) initiative, the second exercise of which will be conducted in 2014 (the first was held in 2012). The same research has also been influential in Norway and other Scandinavian countries where governments sought to avoid the pitfalls of simple publication counts by introducing a tiered system of outputs, with those in more prestigious journals or from more prestigious publishers receiving a higher weighting and therefore resulting in greater funding.

See also: Powerful Numbers: Interview with Dr. Diana Hicks

Examining the literature, there appear to be far more research evaluation studies focusing on life and medical sciences. Why, in your opinion, are these not as prevalent in the social sciences?

I believe this is primarily because quantitative indicators are seen as fairly robust in the biomedical disciplines and are therefore, on the whole, reasonably well accepted by researchers in those fields.  This is not the case for the social sciences.  There is nothing surprising in this. The biomedical literature is well covered by major bibliometric databases.  In addition, sociological studies have given us much evidence on the meaning of citations in the life sciences and this, together with evaluative studies that have been shown to correlate well with peer review, means researchers have some confidence that measures based on the data are reasonably robust – though always with the proviso they are not used as a blunt instrument in isolation from peer or expert interpretation of the results.

The same can’t be said for the social sciences (or the humanities and arts).  There is some evidence that a citation in these disciplines has a different meaning – their scholarship does not build on past research in the same way that it does in the life sciences.  It is also well known that coverage of the social sciences is very poor in many disciplines, and only moderate in the best cases.  Evaluative studies that use only the indexed journal literature have sometimes demonstrated poor correlation to peer review assessments, and there is understandably little confidence in the application of the standard measures used in the life sciences.

What can be done to measure arts & humanities as well as social sciences better?

I think the most promising initiatives are those coming out of the European Science Foundation, which has for a number of years been investigating the potential for a citation index specifically constructed to cover these disciplines.  The problem is that, as it would need to cover books and many journals not indexed by the major citation databases, it is a huge undertaking.  Given the current European financial climate I don’t have much confidence that this initiative will progress very far in the short-term.  It is also an initiative fraught with problems, as seen in the ESF’s first foray into this domain with its journal classification scheme. Discipline and national interest groups have been very vocal in their criticisms of the initial lists, and a citation index is likely to be just as controversial.

Many scholars in these disciplines pin their hopes on Google Scholar (GS) to provide measures that take account of all their forms of scholarship.  The problem with GS is that it is not a static database, but rather a search engine.  As GS itself clearly points out, if a website disappears, then all the citations from publications found solely in that website will also disappear, so over time there can be considerable variability in results, particularly for individual papers or researchers.  In addition, it has to date been impossible to obtain data from GS that would enable world benchmarks to be calculated – essential information for any evaluative studies.

Do you think that open access publishing will have an effect on journals’ content quality, citations tracking and general impact?

The answers to these questions depend on what “open access publishing” means.  If it refers to making articles in the journal literature that are currently only accessible through paid subscription services publicly available, I would expect the journal “gatekeepers” – the editors and reviewers – to continue with the same quality control measures that currently exist.  If all (or most) literature becomes open access, then the short-term citation advantage that is said to exist for those currently in open access form will disappear, but general impact could increase as all publications will have the potential to reach a much wider audience than was previously possible.

But if “open access publishing” is interpreted in its broadest sense – the publishing of all research output irrespective of whether or not it undergoes any form of peer review – then there is potential for negative impact on quality.  There is so much literature in existence that researchers need some form of assessment to allow them to identify the most appropriate literature and avoid the all too real danger of being swamped by the sheer volume of what is available.  Some form of peer validation is absolutely essential.  That is not to say that peer validation must take the same form as that used by journals – it may be in the form of online commentary, blogs, or the like – but it is essential in some format.

Any new mode of publication presents its own challenges for citation tracking.  On the one hand, open access publishing presents huge possibilities in a much more comprehensive coverage of the literature, and potential efficiencies in harvesting the data.  But on the other hand it presents problems for constructing benchmarks against which to judge performance – how is the “world” to be defined?  Will we be able to continue using existing techniques for delineating fields?  Will author or institutional disambiguation become so difficult that few analysts will possess the knowledge and computer power required to do this?

What forms of measurements, other than citations, should be applied when evaluating research quality and output impact in your opinion? (i.e. usage, patents)

It is important to use a suite of indicators that is as multi-dimensional as possible.  In addition to citation-based measures, other measures of quality that may be relevant include those based on journal rankings, publisher rankings, journal impact measures (i.e. SNIP, SJR etc.) and success in competitive funding schemes.  Any indicator chosen must be valid, must actually relate to the quality of research, must be transparent, and must enable the construction of appropriate field-specific benchmarks.  Even then, no single indicator, nor even a diverse suite of indicators, will give a definitive answer on quality – the data still need to be interpreted by experts in the relevant disciplines who understand the nuances of what the data is showing.

Choosing indicators of wider impact is a much more fraught task.  Those that are readily available are either limited in their application (e.g. patents are not relevant for all disciplines), or refer merely to engagement rather than demonstrated achievement (e.g. data on giving non-academic presentations, or meetings with end-users attended).  And perhaps the biggest hurdle is attribution – which piece (or body) of work led to a particular outcome?  For this reason, the current attempts to assess the wider impact of academic research are focussing on a case study approach rather than being limited to quantitative indicators.  The assessment of impact in the UK’s Research Excellence Framework is the major example of such an approach currently being undertaken, and much information on this assessment approach can be found on the website of the agency overseeing this process – the Higher Education Funding Council of England.

See also: Research Impact in the broadest sense: REF 14

During your years as a university academic, did you notice a change among university leaders and research managers in the perception and application of bibliometrics?

From a global perspective, the biggest change has occurred since the appearance of university rankings such as the Jiao Tong and THE rankings.  Prior to this, few senior administrators had much knowledge of the use of bibliometrics in performance assessments, other than the ubiquitous journal impact factor. The weightings given to citation data in the university rankings now ensure that bibliometrics are at the forefront of universities’ strategic thinking and many universities have signed up to obtain the data that relates to their own university and use it internally for performance assessment.

In Australia, most university research managers had at least a passing knowledge of the use of bibliometrics in evaluation exercises by the 1990s, through the analyses undertaken by the unit I headed at The Australian National University, the Research Evaluation and Policy Project.  However their interest increased with the announcement that bibliometrics were to form an integral part of a new performance assessment system for Australian universities – the Research Quality Framework which was ultimately superseded by the ERA framework.  This interest was further heightened by the appearance of the institutional rankings mentioned above. While ERA is not currently linked to any substantial funding outcomes, it is expected to have financial implications by the time the results have been published from the second exercise to be held in 2014.  Australian universities are now acutely aware of the citation performance of their academics’ publications, and many monitor that performance internally through their research offices.

The downside of all this increased interest in, and exposure to, bibliometrics is the proliferation of what some commentators have labelled “amateur bibliometrics” – studies undertaken by those with little knowledge of existing sophisticated techniques, or any understanding of the strengths and weaknesses of the underlying data.  Sometimes the data is seriously misused, particularly in its application to assessing the work of individuals.

What are your thoughts about using social media as a form of indication about scientific trends and researchers’ impact?

I have deep reservations about the use of data from social media to construct performance indicators. They relate more to popularity than to the inherent quality of the underpinning research, and at this point in time are incredibly easy to manipulate. They may be able to be used to develop some idea of the outreach of a particular idea, or a set of research outcomes, but are unlikely to provide much indication of any real impact on the broader community. As with many of the new Web 2.0 developments, the biggest challenge is determining the meaning of any data that can be harvested, and judging whether any of it relates to real impact on either the research community, on policy, on practice, or on other end-users of that research.


Did you know

… that scientometrics is explained in a recent novel?

Matthew Richardson

In Michael Frayn’s latest novel Skios, a cast of characters descend on a Greek island for two days of crossed identities and mislaid messages, in the best farcical tradition.
The climax of the affair is the Fred Toppler Lecture, which is set to be delivered this year on the topic of “Innovation and Governance: the Promise of Scientometrics”.
As Dr Norman Wilfred explains: “The results of scientific research are scientifically measurable. We have developed a discipline for this. It’s called scientometrics. And on the basis of scientometrics science can be scientifically managed.”
The ironic response: “This is your lecture, is it? … I see why you don’t want people to miss it.”
Is this the first novel to mention scientometrics?

Frayn, M. (2012) Skios. London: Faber & Faber. p. 132.

  • Elsevier has recently launched the International Center for the Study of Research - ICSR - to help create a more transparent approach to research assessment. Its mission is to encourage the examination of research using an array of metrics and a variety of qualitative and quantitative methods.