Is science in your country declining? Or is your country becoming a scientific super power, and how quickly?
Analysing longitudinal trends in the publication output of nations has a long tradition in the field of bibliometrics. Derek de Solla Price (1978)1 and Francis Narin (1976)2, two founding fathers of the field, began exploring the utility of this type of bibliometric analysis in the 1970s, and they continue to have a considerable impact on both scientific and research policy debates.
History of analyzing
This national-level focus on scientific output is captured in many articles published over the past 25 years: ‘The continuing decline of British science’ (Martin et al., 1987)3; ‘La recherche française est-elle en bonne santé?’ (Callon & Leydesdorff, 1987)4; ‘The emergence of China as a leading nation in science’ (Zhou & Leydesdorff, 2006)5; ‘The race for world leadership of science and technology: status and forecasts’ (Shelton & Foland, 2009)6; ‘Tipping the balance: the rise of China as a science superpower’ (Plume, 2011)7; ‘Knowledge, Networks and Nations’ (The Royal Society, 2011)8; and ‘Is Italian science declining?’ (Daraio & Moed, 2011)9.
Publication counting seems so simple. But one has to make a series of methodological decisions that specify precisely how the counting is carried out. These decisions determine the numbers that are generated, and should be taken into account when interpreting the outcomes and drawing conclusions from these figures. Table 1 lists ten crucial methodological factors in this process.
A case study: China versus US
Several recent studies have assessed trends in US publication output compared with China. They differ with respect to many factors listed in Table 1. All of these studies found both a decline in US output and an increase for China during specific sub-intervals within the 2000–2010 time period. They even ‘predict’, by means of extrapolation, the year in which China will surpass the US in total publication output. The extrapolated cross-over years differ among the various studies but range between 2013 and a date in the following decade.
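The extrapolation behind such a ‘predicted’ cross-over year can be sketched as follows. This is a minimal illustration that assumes a straight-line fit to annual world shares; the share values are synthetic numbers invented for the example, not figures from any of the studies, and the function names `fit_line` and `crossover_year` are hypothetical.

```python
def fit_line(years, values):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def crossover_year(line_a, line_b):
    """Year at which two fitted lines intersect, or None if they are parallel."""
    (slope_a, icpt_a), (slope_b, icpt_b) = line_a, line_b
    if slope_a == slope_b:
        return None
    return (icpt_b - icpt_a) / (slope_a - slope_b)


# Synthetic world shares (%), invented for illustration only.
years = [2004, 2005, 2006, 2007, 2008, 2009, 2010]
us_share = [28.0, 27.5, 27.0, 26.5, 26.0, 25.5, 25.0]
cn_share = [8.0, 9.5, 11.0, 12.5, 14.0, 15.5, 17.0]

us_line = fit_line(years, us_share)
cn_line = fit_line(years, cn_share)
year = crossover_year(us_line, cn_line)
print(f"Extrapolated cross-over year: {year:.1f}")
```

Because the result depends entirely on the fitted slopes, a small change in the counts for one or two years, or in the sub-interval chosen for the fit, can shift the extrapolated year considerably.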
A recent study carried out by Loet Leydesdorff10 compared measures of scientific publication output generated by web versions of Web of Science (WoS) and Scopus. While the WoS analysis showed a steady decline in US output during 2000–2010, the Scopus results suggested that the US had a constant world share of publications during 2004–2009, and increased its share in 2010. A study conducted at Elsevier replicated the findings derived from Scopus’ web version. However, Elsevier’s study also used results derived from a special bibliometric version of Scopus created at Elsevier, one that draws on the same raw data as in the web version but loads it into a different software environment and applies several data-cleaning processes.
Figure 1 shows the outcomes of this comparison. Notably, the results for the US differ considerably between the two Scopus versions. These discrepancies are due to the fact that not all author affiliations contain the name of the country in which the authors’ institutions are located. This is especially true for US affiliations: many indicate the US state, but not the country name. In Chinese publications, such a phenomenon occurs less frequently, possibly because Chinese authors find it important to highlight their country of origin.
In Elsevier’s bibliometric version of Scopus a large fraction of the missing country names has been added, which increased the measured number of publications. In the web version of Scopus, however, this data cleaning is still ongoing: at present, missing countries have been added only for affiliations from 2010. The process operates backwards in time, and by the end of the year the years 2005–2009 will also have been completed. So the increase in US world share in 2010 previously derived from the web version of Scopus is due to more complete capturing of affiliation countries in that year.
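The kind of data cleaning described here can be sketched in a few lines. This is an illustration only, not Elsevier’s actual procedure: the lookup tables `US_STATES` and `COUNTRY_ALIASES` are hypothetical and deliberately incomplete, and the function `infer_country` is an assumed name. It looks only at the last comma-separated component of an affiliation string.

```python
# Hypothetical, deliberately incomplete lookup tables (assumptions for this sketch).
US_STATES = {"CA", "MA", "NY", "TX",
             "California", "Massachusetts", "New York", "Texas"}
COUNTRY_ALIASES = {
    "usa": "United States",
    "united states": "United States",
    "china": "China",
    "p.r. china": "China",
}


def infer_country(affiliation):
    """Return a country for an affiliation string, or None if undetermined."""
    parts = [p.strip() for p in affiliation.split(",") if p.strip()]
    if not parts:
        return None
    last = parts[-1]

    # Case 1: the affiliation ends with an explicit country name.
    country = COUNTRY_ALIASES.get(last.lower())
    if country:
        return country

    # Case 2: it ends with a US state, optionally followed by a ZIP code.
    tokens = last.split()
    if tokens and tokens[-1].replace("-", "").isdigit():
        tokens = tokens[:-1]
    if " ".join(tokens) in US_STATES:
        return "United States"
    return None


examples = [
    "Dept. of Physics, MIT, Cambridge, MA 02139",
    "Tsinghua University, Beijing, China",
    "Institute of Examples, Springfield",
]
for text in examples:
    print(text, "->", infer_country(text))
```

A real cleaning step would need a complete state list and would have to handle ambiguous names (Georgia, for instance, is both a US state and a country), which is one reason the results can differ between database versions.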
What does this mean?
This case illustrates once more how careful one must be when interpreting bibliometric trend data (even at the level of countries), how outcomes can differ between one database version and another, how affiliation practices can differ among countries, and how these differences can affect both numbers and annual trends.
There is no absolute norm for what constitutes good database coverage. Scopus tends to have more comprehensive coverage, especially of Chinese journals, while WoS is more selective in the journals it covers. Each gives a specific view of US and Chinese output. Both databases show a declining trend for the US and an increasing one for China. The cross-over comes sooner in Scopus than in WoS, which is to be expected from a database with more comprehensive coverage of Chinese journals.
| Methodological factor | Questions to consider |
|---|---|
| Selection of a database | Which database does one use in the measurement? Coverage may differ substantially from one database to another. |
| Different versions of a database | Different versions of a database may exist. For instance, several groups have created their own bibliometric versions based on raw data from Scopus or Web of Science, adding information to it, performing data cleaning and so on. Results from such bibliometric versions may differ from those obtained with the web versions of the same databases. |
| Changes in database coverage | Database coverage may change over time; for instance, new journals may be added from a particular year onwards. How does one deal with these changes? |
| Adequacy of database coverage | Does a database cover the publication output of a country and/or in a research field sufficiently well? For instance, databases principally covering journals miss important output in social sciences and humanities (published in books) and in engineering (published in conference proceedings). |
| Fractional versus integer counting | How should one count a paper co-published between a US and a UK author? As one US paper and one UK paper (integer counts)? Or as 0.5 US and 0.5 UK papers (fractional counts)? More sophisticated schemes can also be explored. |
| Absolute or relative counts | Does one analyze the absolute number of published articles, or article shares (for instance, the percentage of papers published from a particular country relative to the total number of articles indexed for the database)? |
| Time period considered | To which time period does the data collection relate? This is especially important when examining longitudinal data. For instance, a country may show an increase in some years, and a steady state or even decline in a subsequent time period. |
| Document types included in the counts | Databases index many types of documents: full research articles, but also shorter letters, reviews, editorials, discussion papers, and more. Which types should be included in the counts? |
| Publication year vs. database or tape year | A paper published at the end of a calendar year (e.g., in December 2010) may be included in the database in the next year (e.g., March 2011). Is such a paper counted as a 2010 or as a 2011 paper? |
| Country delimitation | Papers are assigned to countries according to the geographical location of the institutions of publishing authors. But how precisely is this done? Does the database include the affiliations of all authors? Have variations in country names been taken into account? |
Table 1 – Methodological issues in bibliometric analysis of nations
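Two of the factors in Table 1, fractional versus integer counting and absolute versus relative counts, can be made concrete with a short sketch. It assumes that each paper is represented by the list of countries in its author affiliations and that, under fractional counting, each distinct country receives an equal share of the paper; counting per author or per address is another possible scheme. The three papers are invented for illustration.

```python
from collections import Counter


def integer_counts(papers):
    """Each paper counts as 1 for every distinct country on it."""
    counts = Counter()
    for countries in papers:
        for country in set(countries):
            counts[country] += 1
    return counts


def fractional_counts(papers):
    """Each paper is split equally among its distinct countries."""
    counts = Counter()
    for countries in papers:
        distinct = set(countries)
        for country in distinct:
            counts[country] += 1 / len(distinct)
    return counts


# Invented example: country lists taken from author affiliations.
papers = [
    ["US", "UK"],        # the US-UK co-publication from Table 1
    ["US"],
    ["US", "CN", "CN"],  # two authors from China count once per paper
]

whole = integer_counts(papers)
frac = fractional_counts(papers)

# Relative counts: share of all papers, based on the fractional counts.
shares = {country: value / len(papers) for country, value in frac.items()}

print("Integer counts:   ", dict(whole))
print("Fractional counts:", dict(frac))
print("World shares:     ", shares)
```

Under fractional counting the country totals add up to the number of papers, so the shares sum to one; under integer counting they do not, and a country with many international co-publications appears larger.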
Figure 1 – Scopus Bib V: data from bibliometric version of Scopus created at Elsevier; Scopus Web V: The Web version of Scopus. Source: Scopus.
References
1. Price, D.J.D. (1978) Towards a model for science indicators. In Toward a Metric of Science: The Advent of Science Indicators (eds Elkana, Y., Lederberg, J., Merton, R.K., Thackray, A. & Zuckerman, H.) (New York: John Wiley, pp. 69–95).
2. Narin, F. (1976) Evaluative Bibliometrics: The Use of Publication and Citation Analysis in the Evaluation of Scientific Activity. (Washington D.C.: National Science Foundation).
3. Martin, B.R., Irvine, J., Narin, F. & Sterritt, C. (1987) The continuing decline of British science. Nature, Vol. 330, pp. 123–126.
4. Callon, M. & Leydesdorff, L. (1987) La recherche française est-elle en bonne santé? La Recherche, Vol. 18, pp. 412–419.
5. Zhou, P. & Leydesdorff, L. (2006) The emergence of China as a leading nation in science. Research Policy, Vol. 35, pp. 83–104.
6. Shelton, R.D. & Foland, P. (2009) The race for world leadership of science and technology: status and forecasts. Proceedings of the 12th International Conference of the International Society for Scientometrics and Informetrics (eds Larsen, B. & Larsen, J.), Volume I, pp. 369–380 (Rio de Janeiro, Brazil, July 14–17, 2009).
7. Plume, A. (2011) Tipping the balance: The rise of China as a science superpower. Research Trends, Issue 22.
8. The Royal Society (2011) Knowledge, Networks and Nations: Global Scientific Collaboration in the 21st Century.
9. Daraio, C. & Moed, H.F. (2011) Is Italian science declining? Research Policy, Vol. 40, pp. 1380–1392.
10. Leydesdorff, L. (2011) World shares of publications of the USA, EU-27, and China compared and predicted using the new interface of the Web-of-Science versus Scopus. arXiv:1110.1802v2 [cs.DL].
Comment by Loet Leydesdorff:
When can the cross-over between China and the USA be expected using Scopus data?
Moed et al.’s article1 is a reaction to a recent paper2 in which I showed that the cross-over between China and the USA would be postponed until after 2020 when using the Science Citation Index-Expanded of Thomson Reuters. By contrast, a team at Elsevier had argued in a report of the Royal Society, and on the basis of Scopus data, for a possible cross-over as early as 2013 (Refs 3,4).
Figure 1 – Predicted cross-over between the USA and China based on the new Scopus data; confidence intervals at the 95%-level. (SPSS, v.18.) Sources: Moed et al. (2011)1; the open circles are from Leydesdorff (2011)2.
The new analysis additionally clarifies why the linear fit for the US data remains poor (R² = 0.71): there are problems with these data. However, the fit for China is no different from that in previously reported studies (R² = 0.97). Using the Science Citation Index (WoS v.5), one can find more precise fits and therefore a higher reliability for the prediction of a cross-over occurring after 2020. As noted2, these longer-term predictions are unlikely to be valid because of decreasing marginal returns in competitive markets. The metrics are embedded in a long-standing debate which I first entered in 1987 (see Refs 5,6). Given the new data, the cross-over in the Scopus database, which the report of the Royal Society predicted for as early as 2013, can be postponed by approximately two years.
References:
1. Moed, H. F., Plume, A., Aisati, M., & Berkvens, P. (2011). Is science in your country declining? Or is your country becoming a scientific super power, and how quickly? Research Trends, Issue 25.
2. Leydesdorff, L. (2011). World shares of publications of the USA, EU-27, and China compared and predicted using the new interface of the Web-of-Science versus Scopus. arXiv:1110.1802v2 [cs.DL].
3. The Royal Society (2011) Knowledge, Networks and Nations: Global Scientific Collaboration in the 21st Century.
4. Plume, A. (2011) Tipping the balance: The rise of China as a science superpower. Research Trends, Issue 22.
5. Callon, M. & Leydesdorff, L. (1987). La recherche française est-elle en bonne santé? La Recherche Vol. 18, pp. 412–419.
6. Shelton, R. D., & Leydesdorff, L. (in press). Publish or patent: bibliometric evidence for empirical trade-offs in national funding strategies. Journal of the American Society for Information Science and Technology.