A unified approach to mapping and clustering of bibliometric networks
Introduction
In bibliometric and scientometric research, a lot of attention is paid to the analysis of networks of, for example, documents, keywords, authors, or journals. Mapping and clustering techniques are frequently used to study such networks. The aim of these techniques is to provide insight into the structure of a network. The techniques are used to address questions such as:
- What are the main topics or the main research fields within a certain scientific domain?
- How do these topics or these fields relate to each other?
- How has a certain scientific domain developed over time?
To answer such questions satisfactorily, mapping and clustering techniques are often used in combination. Various approaches are possible. One approach is to construct a map in which the individual nodes in a network are shown and to display a clustering of the nodes on top of the map, for example by marking off areas in the map that correspond with clusters (e.g., McCain, 1990; White & Griffith, 1981) or by coloring nodes based on the cluster to which they belong (e.g., Leydesdorff & Rafols, 2009; Van Eck et al., in press). Another approach is to first cluster the nodes in a network and then construct a map in which clusters of nodes are shown. This approach is taken, for example, in the work of Small et al. (e.g., Small, Sweeney, & Greenlee, 1985) and in earlier work of our own institute (e.g., Noyons, Moed, & Van Raan, 1999). A third approach is to first construct a map in which the individual nodes in a network are shown and then cluster the nodes based on their coordinates in the map (e.g., Boyack et al., 2005; Klavans & Boyack, 2006).
In the bibliometric and scientometric literature, the most commonly used combination of a mapping and a clustering technique is multidimensional scaling together with hierarchical clustering (for early examples, see McCain, 1990; Peters & Van Raan, 1993; Small et al., 1985; White & Griffith, 1981). However, various alternatives to multidimensional scaling and hierarchical clustering have been introduced in the literature, especially in more recent work, and these alternatives are also often used in combination. A popular alternative to multidimensional scaling is the mapping technique of Kamada and Kawai (1989; see, e.g., Leydesdorff & Rafols, 2009; Noyons & Calero-Medina, 2009), which is sometimes used together with the pathfinder network technique (Schvaneveldt, Dearholt, & Durso, 1988; see, e.g., Chen, 1999; de Moya-Anegón et al., 2007; White, 2003). Two other alternatives to multidimensional scaling are the VxOrd mapping technique (e.g., Boyack et al., 2005; Klavans & Boyack, 2006) and our own VOS mapping technique (e.g., Van Eck et al., in press). Factor analysis, which has been used in a large number of studies (e.g., de Moya-Anegón et al., 2007; Leydesdorff & Rafols, 2009; Zhao & Strotmann, 2008), may be seen as a kind of clustering technique and, consequently, as an alternative to hierarchical clustering. Another alternative to hierarchical clustering is clustering based on the modularity function of Newman and Girvan (2004; see, e.g., Wallace et al., 2009; Zhang et al., 2010).
As we have discussed, mapping and clustering techniques have a similar objective, namely to provide insight into the structure of a network, and the two types of techniques are often used together in bibliometric and scientometric analyses. However, despite their close relatedness, mapping and clustering techniques have typically been developed separately from each other. This has resulted in techniques that have little in common. That is, mapping and clustering techniques are based on different ideas and rely on different assumptions. In our view, when a mapping and a clustering technique are used together in the same analysis, it is generally desirable that the techniques are based on similar principles as much as possible. This enhances the transparency of the analysis and helps to avoid unnecessary technical complexity. Moreover, by using techniques that rely on similar principles, inconsistencies between the results produced by the techniques can be avoided. In this paper, we propose a unified approach to mapping and clustering of bibliometric networks. We show how a mapping and a clustering technique can both be derived from the same underlying principle. In doing so, we establish a relation between, on the one hand, the VOS mapping technique (Van Eck & Waltman, 2007; Van Eck et al., in press) and, on the other hand, clustering based on a weighted and parameterized variant of the well-known modularity function of Newman and Girvan (2004).
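For readers unfamiliar with the modularity function mentioned here, the following minimal sketch computes the standard, unweighted modularity of Newman and Girvan (2004) for a given clustering. It is an illustration only: it does not implement the weighted and parameterized variant proposed in this paper, and the function name `modularity` and the toy network are assumptions introduced for the example.

```python
def modularity(adj, clusters):
    """Standard Newman-Girvan modularity of a clustering.

    adj      -- symmetric 0/1 adjacency matrix (list of lists), no self-loops
    clusters -- clusters[i] is the cluster label of node i
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # twice the number of edges
    q = 0.0
    for i in range(n):
        for j in range(n):
            if clusters[i] == clusters[j]:
                # observed links minus links expected from the node degrees
                q += adj[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


# Hypothetical toy network: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])  # one cluster per triangle
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])  # everything in one cluster
```

In this toy case, separating the two triangles scores higher than placing all nodes in a single cluster, which is the behavior a modularity-based clustering technique exploits.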
The paper is organized as follows. We first present our proposal for a unified approach to mapping and clustering. We then discuss how the proposed approach is related to earlier work published in the physics literature. Next, we illustrate an application of the proposed approach by producing a combined mapping and clustering of frequently cited publications in the field of information science. Finally, we summarize the conclusions of our research. Some technical issues are elaborated in appendices.
Mapping and clustering: a unified approach
Consider a network of $n$ nodes. Suppose we want to create a mapping or a clustering of these nodes. $c_{ij}$ denotes the number of links (e.g., co-occurrence links, co-citation links, or bibliographic coupling links) between nodes $i$ and $j$ ($c_{ij} = c_{ji} \geq 0$). $s_{ij}$ denotes the association strength of nodes $i$ and $j$ (Van Eck & Waltman, 2009) and is given by

$$s_{ij} = \frac{2 m c_{ij}}{c_i c_j},$$

where $c_i$ denotes the total number of links of node $i$ and $m$ denotes the total number of links in the network, that is,

$$c_i = \sum_{j \neq i} c_{ij} \quad \text{and} \quad m = \frac{1}{2} \sum_i c_i.$$

In
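The association strength can be illustrated with a small computation. The sketch below assumes the form s_ij = 2 m c_ij / (c_i c_j), with c_i the total number of links of node i and m the total number of links in the network; the function name `association_strengths` and the example link counts are hypothetical and chosen only for illustration.

```python
def association_strengths(c):
    """Return s with s[i][j] = 2 * m * c[i][j] / (c_i * c_j).

    c -- symmetric matrix of link counts (list of lists); the diagonal is ignored
    """
    n = len(c)
    # c_i: total number of links of node i
    totals = [sum(c[i][j] for j in range(n) if j != i) for i in range(n)]
    # m: total number of links; summing the c_i counts every link twice
    m = sum(totals) / 2
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # nodes without any links get strength 0 (assumption, avoids 0/0)
            if i != j and totals[i] > 0 and totals[j] > 0:
                s[i][j] = 2 * m * c[i][j] / (totals[i] * totals[j])
    return s


# Hypothetical link counts for three nodes.
links = [
    [0, 3, 1],
    [3, 0, 0],
    [1, 0, 0],
]
s = association_strengths(links)
```

Here nodes 0 and 2 share only one link, yet their association strength equals that of nodes 0 and 1, because node 2 has only one link in total. The normalization thus corrects for differences in the total number of links per node.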
Related work
Our unified approach to mapping and clustering is related to earlier work published in the physics literature. Here we summarize the most closely related work.
The above result showing how mapping and clustering can be performed in a unified and consistent way resembles to some extent a result derived by Noack (2009). Noack defined a parameterized objective function for a class of mapping techniques (referred to as force-directed layout techniques by Noack). This class of mapping techniques
Illustration of the proposed approach
We now illustrate an application of our unified approach to mapping and clustering. In Fig. 1, we show a combined mapping and clustering of the 1242 most frequently cited publications that appeared in the field of information science in the period 1999–2008. The mapping and the clustering were produced using our unified
Conclusions
Mapping and clustering are complementary to each other. Mapping can be used to obtain a fairly detailed picture of the structure of a bibliometric network. For practical purposes, however, the picture will usually be restricted to just two dimensions. Hence, relations in more than two dimensions will usually not be visible. Clustering, on the other hand, does not suffer from dimensional restrictions. However, the price to be paid is that clustering works with binary rather than continuous
References (40)
Visualising semantic spaces and author co-citation networks in digital libraries. Information Processing and Management (1999).
Community structure of the physical review citation network. Journal of Informetrics (2010).
Community detection in graphs. Physics Reports (2010).
An algorithm for drawing general undirected graphs. Information Processing Letters (1989).
Communities, knowledge creation, and information diffusion. Journal of Informetrics (2009).
Co-word-based science maps of chemical engineering. Part II: Representations by combined clustering and multidimensional scaling. Research Policy (1993).
Graph theoretic foundations of pathfinder networks. Computers and Mathematics with Applications (1988).
Subject clustering analysis based on ISI category classification. Journal of Informetrics (2010).
Changes in the LIS research front: Time-sliced cocitation analyses of LIS journal articles 1990–2004. Journal of the American Society for Information Science and Technology (2007).
Modern multidimensional scaling (2005).
Mapping the backbone of science. Scientometrics.
The structure and dynamics of cocitation clusters: A multiple-perspective cocitation analysis. Journal of the American Society for Information Science and Technology.
Visualizing the marrow of science. Journal of the American Society for Information Science and Technology.
Resolution limit in community detection. Proceedings of the National Academy of Sciences.
Graph drawing by force-directed placement. Software: Practice and Experience.
Detecting modules in dense weighted networks with the Potts method. Journal of Statistical Mechanics.
Quantitative evaluation of large maps of science. Scientometrics.
Limited resolution in complex network community detection with Potts model approach. European Physical Journal B.
A global map of science based on the ISI subject categories. Journal of the American Society for Information Science and Technology.
Mapping authors in intellectual space: A technical overview. Journal of the American Society for Information Science.