Methodology

The Ranking Web of World Research Centers formally and explicitly adheres to the Berlin Principles on Ranking of Higher Education Institutions. The ultimate aim is the continuous improvement and refinement of the methodology according to a set of agreed principles of good practice.

0) Background of the project.

The “Ranking Web of World Research Centers” is an initiative of the Cybermetrics Lab, a research group of the Institute of Public Goods and Policies (IPP), part of the National Research Council (CSIC), the largest public research body in Spain.

The Cybermetrics Lab is devoted to the quantitative analysis of the Internet and of Web contents, especially those related to the processes of generation and scholarly communication of scientific knowledge. This is a new emerging discipline known as Cybermetrics (our team has developed and published the free electronic journal Cybermetrics since 1997) or Webometrics.


With these rankings we intend to provide extra motivation to researchers worldwide for publishing more and better scientific content on the Web, making it available to colleagues and people wherever they are located.

The "Ranking Web of World Research Centers" was officially launched in 2008, and it is updated every 6 months (data are collected in January and July and published one month later). The Web indicators used are based on, and correlated with, traditional scientometric and bibliometric indicators. The goal of the project is to convince academic and political communities of the importance of web publication, not only for disseminating academic knowledge but also for measuring scientific activities, performance and impact.

A) Purposes and Goals of Rankings

1. Assessment of higher education (processes and outputs) on the Web. The Web indicators are useful for such assessment, and we are already publishing comparative analyses with similar initiatives. However, the current objective of the Ranking is to promote Web publication by research centers, evaluating these organizations' commitment to electronic distribution and fighting a very concerning academic digital divide, which is evident even among institutions from developed countries. Although we do not intend to assess the performance of research centers solely on the basis of their web output, the Ranking Web measures a wider range of activities than the current generation of bibliometric indicators, which focuses only on the activities of the scientific elite.

2. Ranking purpose and target groups. The Ranking Web measures the volume, visibility and impact of the web pages published by research centers, with special emphasis on scientific output (refereed papers, conference contributions, pre-prints, monographs, theses, reports, …) but also taking into account other materials (courseware, seminar or workshop documentation, digital libraries, databases, multimedia, personal pages, …) and general information on the institution, its departments, research groups, supporting services, and the people working or attending courses there.
The direct target group for the Ranking is university authorities. If the web performance of an institution is below the position expected from its academic excellence, its leaders should reconsider their web policy, promoting substantial increases in the volume and quality of their electronic publications.
Faculty members are an indirect target group, as we expect that in the near future web information could become as important as other bibliometric and scientometric indicators for evaluating the scientific performance of scholars and their research groups.
Finally, prospective students should not use this data as the sole guide for choosing a research center, although a top position means that the institution has a policy that encourages new technologies and the resources for their adoption.

3. Diversity of institutions: Missions and goals of the institutions. Quality measures for research-oriented institutions, for example, are quite different from those that are appropriate for institutions that provide broad access to underserved communities. Institutions that are being ranked and the experts that inform the ranking process should be consulted often.

4. Information sources and interpretation of the data provided. Access to Web information is obtained mainly through search engines. These intermediaries are free, universal and very powerful, even considering their shortcomings (coverage limitations and biases, lack of transparency, commercial secrets and strategies, irregular behaviour). Search engines are key for measuring the visibility and impact of institutions' websites.
There is a limited number of sources useful for webometric purposes: seven general search engines (Google*, Yahoo Search*, Live (MSN) Search*, Exalead*, Ask (Teoma), Gigablast and Alexa) and two specialised scientific databases (Google Scholar* and Live Academic). All of them have very large independent databases, but due to the availability of their data-collection interfaces (APIs), only those marked with an asterisk are used in compiling the Ranking Web.
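As a minimal sketch of how such engines are queried, the standard `site:` operator estimates a domain's page count, and link-style operators estimate inlinks. The engine list and query syntax below are illustrative assumptions, not the Lab's actual harvesting code:

```python
# Sketch: building per-engine queries to estimate an institution's web
# presence. ASSUMPTIONS: the engine names mirror the asterisked sources in
# the text, and all of them accept Google-style 'site:'/'link:' operators.

ENGINES_WITH_API = ["Google", "Yahoo Search", "Live Search", "Exalead"]

def size_query(domain: str) -> str:
    """Query estimating the number of pages under an institutional domain."""
    return f"site:{domain}"

def visibility_query(domain: str) -> str:
    """Query estimating external inlinks: pages linking to the domain,
    excluding the domain's own pages (generic 'link:' form, an assumption)."""
    return f"link:{domain} -site:{domain}"

# One size query per engine for a sample domain:
queries = {engine: size_query("csic.es") for engine in ENGINES_WITH_API}
```

Combining the counts returned by several engines, as point 10 below recommends, then smooths out each engine's individual biases.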

5. Linguistic, cultural, economic, and historical contexts. The project intends to have true global coverage, not narrowing the analysis to a few hundred institutions (world-class research centers) but including as many organizations as possible. The only requirement in our international rankings is having an autonomous web presence with an independent web domain. This approach allows a larger number of institutions to monitor their current ranking and the evolution of this position after adopting specific policies and initiatives. Research centers in developing countries have the opportunity to know precisely the indicator thresholds that mark the limit of the elite.
Currently identified biases of the Ranking Web include the traditional linguistic one (more than half of Internet users are English speakers) and a new disciplinary one (technology, rather than biomedicine, is at the moment the hot topic). Since in most cases the infrastructure (web space) and connectivity to the Internet already exist, the economic factor is not considered a major limitation (at least for the Top 2000).

B) Design and Weighting of Indicators

6. Methodology used to create the rankings. The unit for analysis is the institutional domain, so only universities and research centres with an independent web domain are considered. If an institution has more than one main domain, two or more entries are used with the different addresses. About 5-10% of the institutions have no independent web presence, most of them located in developing countries.

See the entry for the current edition for the most up-to-date methodology information.

7. Relevance and validity of the indicators. The choice of indicators was made according to several criteria (see notes): some of them try to capture quality and academic and institutional strengths, while others aim to promote web publication and Open Access initiatives. The inclusion of the total number of pages is based on the recognition of a new global market for academic information, the web being the adequate platform for the internationalization of institutions. A strong and detailed web presence providing exact descriptions of the structure and activities of an institution can attract new students and scholars worldwide. The number of external inlinks received by a domain is a measure of the visibility and impact of the published material, and although there is a great diversity of motivations for linking, a significant fraction works in a similar way to bibliographic citation. The success of self-archiving and other repository-related initiatives can be roughly estimated from rich-file and Scholar data. The huge numbers involved with the pdf and doc formats mean that not only administrative reports and bureaucratic forms are counted; PostScript and PowerPoint files are clearly related to academic activities.
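The rich-file counts mentioned above can be obtained with per-format queries. A minimal sketch, assuming Google-style `filetype:` syntax (an assumption for the other engines) and the four formats named in the text:

```python
# Sketch: one query per rich-file format for an institutional domain.
# The format list (pdf, ps, doc, ppt) follows the text; the 'filetype:'
# operator is Google-style syntax, assumed here for illustration.

RICH_FORMATS = ["pdf", "ps", "doc", "ppt"]

def rich_file_queries(domain: str) -> list[str]:
    """Build the queries whose hit counts make up the rich-file indicator."""
    return [f"site:{domain} filetype:{fmt}" for fmt in RICH_FORMATS]

# Example for a hypothetical domain:
queries = rich_file_queries("example.edu")
```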

8. Measure outcomes in preference to inputs whenever possible. Data on inputs are relevant, as they reflect the general condition of a given establishment, and are more frequently available. Measures of outcomes provide a more accurate assessment of the standing and/or quality of a given institution or program. We expect to offer a better balance in the future, but the current edition intends to call attention to incomplete strategies, inadequate policies and bad practices in web publication before attempting a more complete picture.

9. Weighting the different indicators: current and future evolution. The current ranking rules, including the described weighting model, have been tested and published in scientific papers. More research is being done on this topic, but the final aim is to develop a model that includes additional quantitative data, especially bibliometric and scientometric indicators.
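The general shape of such a weighted model can be sketched as follows. The indicator names and weights here are placeholders for illustration, NOT the published model; the log transform is one common way to damp the huge ranges typical of web counts:

```python
import math

# Sketch of a log-normalised weighted combination of four indicator
# families. ASSUMPTION: these weights are hypothetical placeholders,
# not the values actually used by the Ranking Web.
WEIGHTS = {"size": 0.2, "visibility": 0.5, "rich_files": 0.15, "scholar": 0.15}

def combined_score(counts: dict[str, int]) -> float:
    """Combine raw hit counts into a single score.

    log10(1 + n) compresses the orders-of-magnitude spread of web data,
    so one giant indicator cannot dominate the others outright.
    """
    return sum(WEIGHTS[k] * math.log10(1 + counts.get(k, 0)) for k in WEIGHTS)

# A larger, more visible site scores higher than a smaller one:
big = combined_score({"size": 100_000, "visibility": 50_000,
                      "rich_files": 2_000, "scholar": 800})
small = combined_score({"size": 10_000, "visibility": 1_000,
                        "rich_files": 200, "scholar": 50})
```

Institutions would then be ranked by sorting on this score in descending order.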

C) Collection and Processing of Data

10. Ethical standards. We have identified some relevant biases in search-engine data, including the under-representation of some countries and languages. As the behaviour differs between engines, a good practice is to combine results from several sources. Any other mistake or error is unintentional and should not affect the credibility of the ranking. Please contact us if you think the ranking is not objective or impartial in any way.
11. Audited and verifiable data. The only source for the data of the Webometrics Ranking is a small set of globally available, free-access search engines. All the results can be replicated according to the described methodology, taking into account the explosive growth of web contents, their volatility and the irregular behaviour of the commercial engines.
12. Data collection. Data are collected during the same week, in two consecutive rounds for each strategy, with the higher value being selected. Every website under the common institutional domain is explored, but no attempt has been made to combine contents or links from different domains.
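The two-round rule can be sketched in a couple of lines (the counts in the example are illustrative numbers, not real data):

```python
# Sketch: two consecutive collection rounds per query, keeping the
# higher value, as described in point 12 above.

def select_count(round1: int, round2: int) -> int:
    """Search-engine hit counts fluctuate from day to day; taking the
    maximum of two consecutive rounds reduces accidental under-counting."""
    return max(round1, round2)

# Illustrative: the second round returned more hits, so it is kept.
kept = select_count(1200, 1350)
```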
13. Quality of the ranking processes. After the automatic collection of data, positions are checked manually and compared with previous editions. Some of the processes are duplicated, and new expertise is added from a variety of sources. Pages that link to the Webometrics Ranking are explored, and comments from blogs and other fora are taken into account. Finally, our mailbox receives many requests and suggestions, which are acknowledged individually.
14. Organizational measures to enhance credibility. The ranking results and methodologies are discussed in scientific journals, and presented in international conferences. We expect international advisory or even supervisory bodies to take part in future developments of the ranking.

D) Presentation of Ranking Results

15. Display of data and factors involved. The published tables show all the Web indicators used in a concise and visual way. Rankings are provided not only as a central Top 2000 classification but also as several regional rankings for comparative purposes.
16. Updating and error reduction. The listings are served from dynamic ASP pages built on several databases that can be corrected when errors or typos are detected.

Comments welcome

Our group welcomes comments, suggestions and proposals that can be useful for improving this website. We try to maintain an objective position on the quantitative data provided, but mistakes can occur. Please take into account that mergers, domain changes or network problems can affect the ranking of the institutions.

Currently the members of our team are Isidro F. AGUILLO, José Luis ORTEGA, Mario FERNÁNDEZ (Webmaster) and Helena ZAMORA.

For more information please contact:

Isidro F. Aguillo
CCHS - CSIC
Albasanz, 26-28
28037 Madrid. SPAIN

Notes:

- Aguillo, I. F.; Granadino, B.; Ortega, J. L.; Prieto, J. A. (2006). Scientific research activity and communication measured with cybermetric indicators. Journal of the American Society for Information Science and Technology, 57(10): 1296-1302.

- Wouters, P.; Reddy, C. & Aguillo, I. F. (2006). On the visibility of information on the Web: an exploratory experimental approach. Research Evaluation, 15(2):107-115.

- Ortega, J. L.; Aguillo, I. F.; Prieto, J. A. (2006). Longitudinal Study of Contents and Elements in the Scientific Web Environment. Journal of Information Science, 32(4): 344-351.

- Kretschmer, H. & Aguillo, I. F. (2005). New indicators for gender studies in Web networks. Information Processing & Management, 41(6): 1481-1494.

- Aguillo, I. F.; Granadino, B.; Ortega, J.L. & Prieto, J.A. (2005). What the Internet says about Science. The Scientist, 19(14):10, Jul. 18, 2005.

- Kretschmer, H. & Aguillo, I. F. (2004). Visibility of collaboration on the Web. Scientometrics, 61(3): 405-426.

- Cothey, V.; Aguillo, I. F. & Arroyo, N. (2006). Operationalising "Websites": lexically, semantically or topologically? Cybermetrics, 10(1): Paper 4. http://www.cindoc.csic.es/cybermetrics/articles/v10i1p4.html