Abstract:
Data clustering is a fundamental and very popular method of data analysis. Its subjective nature, however, means that different clustering algorithms or different parameter settings can produce widely varying and sometimes conflicting results. This has led to the use of clustering comparison measures to quantify the degree of similarity between alternative clusterings. Existing measures, though, can be limited in their ability to assess similarity and sometimes generate unintuitive results. They also cannot be applied to compare clusterings that contain different data points, an activity that is important for scenarios such as data stream analysis. In this paper, we introduce a new clustering similarity measure, known as ADCO, which aims to address some limitations of existing measures by allowing greater flexibility of comparison via the use of density profiles to characterize a clustering. In particular, it adopts a 'data mining style' philosophy to clustering comparison, whereby two clusterings are considered more similar if they are likely to give rise to similar types of prediction models. Furthermore, we show that this new measure can be applied as a highly effective objective function within a new algorithm, known as MAXIMUS, for generating alternative clusterings.
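To make the notion of a clustering comparison measure concrete, the sketch below computes the classic Rand index, one of the existing pair-counting measures. It is not ADCO, whose density-profile construction the abstract does not spell out. The function name `rand_index` and the label-list representation are assumptions made for illustration only.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    Each argument is a list of cluster labels, one per data point,
    and position i in both lists must refer to the same point.
    """
    if len(labels_a) != len(labels_b):
        # Pair-counting measures need the same points in both clusterings,
        # which is the limitation the abstract mentions.
        raise ValueError("both clusterings must label the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two points")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        # A pair agrees when it is grouped together in both clusterings
        # or kept apart in both.
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)
```

Because the measure works on pairs of shared points, it raises an error as soon as the two clusterings cover different data, which is the kind of scenario the abstract says existing measures cannot handle.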
Abstract:
Meta-clustering is a popular approach for finding multiple clusterings in a dataset, taking a large number of base clusterings as input for further user navigation and refinement. However, the effectiveness of meta-clustering is highly dependent on the distribution of the base clusterings, and open challenges remain with regard to its stability and noise tolerance. In addition, the clustering views returned may not all be relevant, so ranking those clustering views is also an open challenge. In this paper we propose a simple and effective filtering algorithm that can be flexibly used in conjunction with any meta-clustering method. In addition, we propose an unsupervised method to rank the returned clustering views. We evaluate the framework (rFILTA) on both synthetic and real-world datasets, and show how its use can enhance clustering view discovery in complex scenarios.
Abstract:
Ensemble clustering combines the results of multiple individual clustering methods to obtain better results. In principle, all available clustering methods can be combined to produce the final clusters. However, selecting a subset of optimal methods can reduce the complexity and increase the efficiency of ensemble clustering. This article examines the problem of selecting individual clustering methods to produce an ensemble hierarchical clustering method. Hierarchical clustering is a technique for grouping data at different scales by creating dendrograms. The aim is to select, considering both diversity and quality, a subset of individual hierarchical clustering methods that can create an ensemble clustering method with minimal complexity. The proposed method consists of three main phases. In the first phase, a subset of individual hierarchical clustering methods is selected. In the second phase, the resulting clusters of the selected methods are re-clustered to create super-clusters, which combine the clustering knowledge of the different methods into a single clustering. In the third phase, the final clusters are formed by assigning each sample to the super-cluster at the shortest distance. Experimental results on several datasets from the University of California Irvine (UCI) repository show that the proposed method performs better than the state-of-the-art algorithms.
Abstract:
In the field of pattern recognition, combining different classifiers into a robust classifier is a common approach for improving classification accuracy. Recently, this trend has also been used to improve clustering performance, especially in non-hierarchical clustering approaches. Hierarchical clustering is generally preferred over partitional clustering in applications where the exact number of clusters is not known or where we are interested in finding the relations between clusters. To the best of our knowledge, the clustering combination methods proposed so far are based on partitional clustering, and hierarchical clustering has been ignored. In this paper, a new method for combining hierarchical clusterings is proposed. In the first step of this method, the primary hierarchical clustering dendrograms are converted to matrices. Then these matrices, which describe the dendrograms, are aggregated (using the matrix summation operator) into a final matrix from which the final clustering is formed. The effectiveness of different well-known dendrogram descriptors, and of the one proposed by us, for representing the dendrograms is evaluated and compared. The results show that all these descriptors work well and that more accurate results (hierarchies of clusters) are obtained using hierarchical combination than using a combination of partitional clusterings.
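The pipeline described above (dendrograms converted to matrices, matrices summed, final hierarchy built from the sum) can be sketched as follows. This is only an illustration under stated assumptions: the cophenetic distance (the height at which two points first merge) is used as one well-known dendrogram descriptor, dendrograms are encoded as nested tuples `(left, right, height)` with integer leaves, and the final hierarchy is recovered with single linkage. The abstract does not say which descriptor or recovery step the authors use, and all function names here are hypothetical.

```python
def cophenetic_matrix(tree, n):
    """Descriptor matrix: entry (i, j) is the height at which i and j first merge."""
    matrix = [[0.0] * n for _ in range(n)]

    def visit(node):
        if isinstance(node, int):  # a leaf is a point index
            return [node]
        left, right, height = node
        left_leaves, right_leaves = visit(left), visit(right)
        for i in left_leaves:
            for j in right_leaves:
                matrix[i][j] = matrix[j][i] = float(height)
        return left_leaves + right_leaves

    visit(tree)
    return matrix


def sum_matrices(matrices):
    """Aggregate descriptor matrices with element-wise summation."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


def single_linkage(dist):
    """Build a hierarchy from a distance matrix; returns merges in order."""
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


def combine_dendrograms(trees, n):
    """Convert each dendrogram to a matrix, sum them, and re-cluster."""
    return single_linkage(sum_matrices([cophenetic_matrix(t, n) for t in trees]))
```

For example, `combine_dendrograms([((0, 1, 1.0), (2, 3, 1.0), 3.0), (((0, 1, 2.0), 2, 3.0), 3, 4.0)], 4)` merges points 0 and 1 first, then 2 and 3, and finally joins the two pairs, so the balanced structure of the first dendrogram prevails over the chained structure of the second.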
Abstract:
This study examines how contextual, structural and functioning characteristics of industrial clusters influence their effectiveness. We develop a conceptual framework that identifies potential influencing factors, validate the factors statistically, and estimate the factors' impact on cluster effectiveness. Our results show that among the important determinants of cluster effectiveness are long-term planning security and procedural trust among the cooperating firms (contextual conditions), formalized rules and sustainable structures (structural elements), and clear goals and tasks (functioning characteristics). However, the results also reveal that some determinants assessed as important in the literature do not seem to have a positive impact on effectiveness. Our results not only modify general assumptions in cluster research concerning the drivers of cluster effectiveness, but also assist firms and policy-makers in conceptualizing successful new clusters.
Abstract:
In this article, the interdisciplinary science of clusters is discussed in general terms. Different types of clusters across vast scales of matter, energy, space, and time in the physical world are discussed. Specific examples of clusters in chemistry and physics are used to illustrate various principles or models of clustering processes of atoms and molecules as well as to demonstrate the exquisite beauty and pattern of clusters and the clustering phenomena so ubiquitous in nature. Nowadays, "designer clusters" can be made with tailorable properties and used as "building blocks" to form supermolecules, or to construct large cluster-based hierarchical materials with tunable properties, or to fabricate cluster-based devices with specific functions, etc., thereby providing a materials base for nanotechnology. Clustering is a spontaneous self-assembly process and the similarity across scales reflects the intrinsic self-organization and self-similarity principle of the physical world. Geometry and symmetry transcend all clustering processes, in ordered as well as in disordered systems.
Abstract:
The systemic development of clusters remains relevant for the Russian economy, which is in urgent need of modernisation. Despite the gap with the leaders, the process of clustering continues in certain sectors and regions of contemporary Russia. With regard to solving Russia's current problems, innovative territorial clusters (ITC), aimed at the organisational and technological modernisation of the country, are recognised as the priority. However, the overall indicators of cluster development do not show a sustainable positive trend in 2008-2016. An assessment of clustering policy in Russia in line with the evolutionary approach highlights the mechanisms of state support both for ITC already functioning and for pilot ITC projects under development, including the priority areas, dynamics and contradictions of that support. The results of the analysis point to an urgent need to correct the policy under consideration.
Abstract:
The consensus clustering technique combines multiple clustering results without accessing the original data. Consensus clustering can be used to improve the robustness of clustering results or to obtain clustering results from multiple data sources. In this paper, we propose a novel definition of the similarity between points and clusters. With an iterative process, such a definition of similarity can clearly represent how a point should join or leave a cluster, determine the number of clusters automatically, and combine partially overlapping clustering results. We also incorporate the concept of a "clustering fragment" into our method for increased speed. The experimental results show that our algorithm achieves good performance on both artificial and real data.
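As a rough illustration of combining clusterings without touching the original data, the sketch below uses the widely known co-association (evidence accumulation) idea: count how often two points share a cluster across the input partitions, then link points that co-occur in more than a chosen fraction of them. This is a swapped-in baseline, not the point-to-cluster similarity or the "clustering fragment" speed-up proposed in the abstract. The names `co_association` and `consensus` and the `threshold` parameter are assumptions.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of points shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus(partitions, threshold=0.5):
    """Consensus labels: connected components of the thresholded co-association graph.

    Only the label lists are used, never the original feature vectors,
    and the number of consensus clusters is not fixed in advance.
    """
    co = co_association(partitions)
    n = len(co)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first walk over strongly co-occurring points
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels
```

With the three input partitions `[0, 0, 0, 1, 1]`, `[0, 0, 1, 1, 1]` and `[1, 1, 1, 0, 0]`, point 2 sides with points 0 and 1 in two of the three partitions, so `consensus` groups it with them and keeps points 3 and 4 together.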
Abstract:
The aim of this paper is to determine whether there are age-dependent differences in the orientation of clusters' activities. The literature depicts different approaches to the cluster evolution process, highlighting that clusters are subject to a life cycle that emphasizes different sets of activities at various stages of their development. These activities appear to follow a certain trajectory, whereby the successful completion of initial, less intensive activities stimulates a shift in focus to more demanding, long-term projects. The presented research verifies that clusters can pass through different stages of development, and examines in detail their preferences for jointly undertaken activities. The research was conducted on a sample of clusters from different countries and of different ages, using questionnaires and structured interviews with cluster managers. The sample consists of so-called organized clusters, which have their own internal structure and are characterized by conscious development. The study identified common cluster activities in the following areas: networking, human resources, research and innovation, business cooperation and promotion, support activities, lobbying, etc. The preferences for their implementation were also ascertained. In addition, the analyzed sample was divided into two categories according to cluster age, allowing the level of implementation of joint activities to be compared between embryonic and established clusters. The evaluation demonstrated that, in the selected groups of activities, there was a statistically significant difference in the level of implementation between clusters of different ages.