Institute of Scientific and Technical Information of China (ISTIC) - National Engineering and Technology Digital Library

1. NON-MANIPULABLE PARTITIONING

[Journal] Conal Duddy, Juan Perote-Pena, Ashley Piggins. New Mathematics and Natural Computation, 2012, vol. 8, no. 2, 10 pp.

Abstract: Consider the following social choice problem. A group of individuals seek to classify the elements of X as belonging in one of two sets. The individuals may disagree as to how the elements of X should be classified, and so an aggr...

Keywords: partitions aggregation non-manipulation

2. Partition and characterization of cadmium on different particle-size aggregates in Chinese Phaeozem.

[Journal] Guo GuanLin, Zhang Yu, Zhang Chao, Wang ShiJie, Li FaSheng, Yan ZengGuang. Geoderma: An International Journal of Soil Science, 2013, vol. 200/201, 6 pp.

Abstract: Organic-mineral complexations can be isolated from bulk soil by physical disaggregation followed by density fractionation for further examination of the patchy nature of aggregates distributed in soil. Phaeozem, which is a specifi...

Keywords: Phaeozem Particle-size Aggregate Cadmium Partition Characterization

3. A comparative evaluation of aggregation methods for machine learning over vertically partitioned data

[Journal] Trevizan, Bernardo; Chamby-Diaz, Jorge; Bazzan, Ana L. C.; Recamonde-Mendoza, Mariana. Expert Systems with Applications, 2020, vol. 152, Aug. issue, 19 pp.

Abstract: Applications in which data are naturally generated in a distributed fashion are increasingly common, especially after the emergence of technologies like the Internet of Things (IoT). In sensor networks, in collaborative health or genomic projects, in credit risk analysis, among other domains, distinct features are collected from multiple sources, including the use of social media and mobile applications, and due to privacy concerns or communication costs, may not be shared among sites. This scenario of vertical data partitioning poses challenges to traditional machine learning (ML) approaches, as classical algorithms are designed to learn from the complete set of features. A common strategy is to combine predictions from local models trained at each site into a global model, and for this purpose, several aggregation methods have been proposed. In this work we tackle a gap within the related literature, performing a comparative evaluation of elementary and meta-learning-based aggregation methods to reveal their strengths and weaknesses for 46 datasets with varied characteristics. We show that no method outperforms its counterparts in all domains, emphasizing the need for experimental comparison to ensure a good choice in the domain of interest. Moreover, our experiments provide the first insights into the relations between datasets' properties and aggregators' performance. We show that for low class imbalance and a good instance-to-feature ratio, almost all aggregation methods tend to perform well. The silhouette coefficient (reflecting class separability) and class imbalance coefficient are the most influential properties on aggregators' performance, thus we recommend their analysis in the first step of the methodological design. We found that arithmetic-based methods are not suitable for datasets with poor class separability and a large number of classes, whereas meta-learning approaches are less sensitive for datasets with silhouette coefficient close to 0. Our analyses were summarized as classification and regression trees, which can serve as practical tools for future research. Taken together, our findings give rise to interesting applications in the domain of intelligent systems, especially regarding their potential to reduce the burden of vast experimental comparisons when training ML models with feature-partitioned data. © 2020 Elsevier Ltd. All rights reserved.

Keywords: Vertical data partitioning Distributed machine learning Classification Predictions aggregation Attribute-partitioned data
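
Illustrative note for entry 3: the abstract describes combining predictions from local models, each trained on a different feature subset, into a single global prediction. The sketch below shows two elementary aggregators for one instance: averaging class-probability vectors (an arithmetic-based method) and majority voting. The abstract does not name the specific aggregators the paper evaluates, so the function names, the choice of majority voting, and the sample probabilities are all assumptions made for illustration.

```python
from collections import Counter


def average_probabilities(site_probs):
    """Arithmetic aggregation: average the class-probability vectors of all sites."""
    n_sites = len(site_probs)
    n_classes = len(site_probs[0])
    return [sum(p[c] for p in site_probs) / n_sites for c in range(n_classes)]


def majority_vote(site_labels):
    """Return the label predicted by the largest number of sites."""
    return Counter(site_labels).most_common(1)[0][0]


def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical outputs of three local models for one instance (3 classes).
site_probs = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.6, 0.3, 0.1],
]

avg = average_probabilities(site_probs)
predicted_class = argmax(avg)                 # class chosen by averaging
votes = [argmax(p) for p in site_probs]       # each site's own top class
voted_class = majority_vote(votes)            # class chosen by voting
```

A meta-learning aggregator would instead train a second-level model on the stacked site outputs; that variant is not sketched here.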

4. Similarity joins for high-dimensional data using Spark

[Journal] Chuitian Rong, Xiaohai Cheng, Ziliang Chen, Na Huo. Concurrency and Computation: Practice and Experience, 2019, vol. 31, no. 20, 17 pp.

Abstract: Similarity join on high-dimensional data is a primitive operation. It is used to find all data pairs with distance no more than 𝜖 in the given data set according to a specific distance measure. As the data set scale and ...

Keywords: high-dimensional data piecewise aggregation similarity join symbolic aggregation Spark vertical partition
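
Illustrative note for entry 4: the abstract defines a similarity join as finding all pairs whose distance is at most 𝜖 under a given distance measure. The sketch below is a brute-force version of that definition, assuming Euclidean distance; it is not the Spark-based method of the paper, and the function name and sample points are hypothetical.

```python
import math
from itertools import combinations


def similarity_join(points, eps):
    """Return index pairs (i, j), i < j, whose Euclidean distance is at most eps."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if math.dist(p, q) <= eps
    ]


# Hypothetical 2-D data set; real inputs would be high-dimensional vectors.
points = [(0.0, 0.0), (0.0, 1.0), (3.0, 4.0), (3.0, 4.5)]
pairs = similarity_join(points, eps=1.0)
```

This compares every pair, so its cost grows quadratically with the number of points, which is the scaling problem that partitioned or distributed approaches aim to avoid.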

5. Power partitioned neutral aggregation operators for T-spherical fuzzy sets: An application to H_2 refuelling site selection

[Journal] Kaushik Debnath, Sankar Kumar Roy. Expert Systems with Applications, 2023, vol. 216, Apr. issue, 17 pp.

Abstract: The T-spherical fuzzy set (T-SFS) has emerged as one of the effective tools for dealing with uncertainty in the decision-making process. Power aggregation operators, meanwhile, help in normalizing the impact of extreme values and capture the in...

Keywords: T-spherical fuzzy sets Power aggregation Partitioned aggregation Neutral aggregation Score function H_2 refuelling station

6. A review of the occurrence of inter-colony segregation of seabird foraging areas and the implications for marine environmental impact assessment

[Journal] Conolly, Georgia; Bolton, Mark; Caldow, Richard; Carroll, Matthew; Wakefield, Ewan D. IBIS, 2019, vol. 161, no. 2, 19 pp.

Abstract: Understanding the determinants of species' distributions is a fundamental aim in ecology and a prerequisite for conservation but is particularly challenging in the marine environment. Advances in bio-logging technology have resulted in a rapid increase in studies of seabird movement and distribution in recent years. Multi-colony studies examining the effects of intra- and inter-colony competition on distribution have found that several species exhibit inter-colony segregation of foraging areas, rather than overlapping distributions. These findings are timely given the increasing rate of human exploitation of marine resources and the need to make robust assessments of likely impacts of proposed marine developments on biodiversity. Here we review the occurrence of foraging area segregation reported by published tracking studies in relation to the density-dependent hinterland (DDH) model, which predicts that segregation occurs in response to inter-colony competition, itself a function of colony size, distance from the colony and prey distribution. We found that inter-colony foraging area segregation occurred in 79% of 39 studies. The frequency of occurrence was similar across the four seabird orders for which data were available, and included species with both smaller (10-100 km) and larger (100-1000 km) foraging ranges. Many predictions of the DDH model were confirmed, with examples of segregation in response to high levels of inter-colony competition related to colony size and proximity, and enclosed landform restricting the extent of available habitat. Moreover, as predicted by the DDH model, inter-colony overlap tended to occur where birds aggregated in highly productive areas, often remote from all colonies. The apparent prevalence of inter-colony foraging segregation has important implications for assessment of impacts of marine development on protected seabird colonies. If a development area is accessible from multiple colonies, it may impact those colonies much more asymmetrically than previously supposed. Current impact assessment approaches that do not consider spatial inter-colony segregation will therefore be subject to error. We recommend the collection of tracking data from multiple colonies and modelling of inter-colony interactions to predict colony-specific distributions.

Keywords: aggregation central-place foraging competition space overlap partition

7. Got Loss? Get zOVN!

[Journal] Daniel Crisan, Robert Birke, Gilles Cressier, Cyriel Minkenberg, Mitch Gusat. Computer Communication Review, 2013, vol. 43, no. 4, 12 pp.

Abstract: Datacenter networking is currently dominated by two major trends. One aims toward lossless, flat layer-2 fabrics based on Converged Enhanced Ethernet or InfiniBand, with benefits in efficiency and performance. The other targets fl...

Keywords: Datacenter networking virtualization overlay networks lossless Partition-Aggregate

8. The Slavic suffix -in/-yn as partition shifter

[Journal] Olga Kagan. Natural Language Semantics, 2024, vol. 32, no. 1, 29 pp.

Abstract: This paper investigates lexical mass-to-count and count-to-mass operators in Slavic languages, primarily Russian and Ukrainian, by exploring the distribution and semantic contribution of the suffix -in/-yn. The focus is on two use...

Keywords: Mass-count distinction Singulativity Aggregate nouns Partition Material part

9. Reliability measures for two-part partition of states for aggregated Markov repairable systems

[Journal] Lirong Cui, Shijia Du, Aufu Zhang. Annals of Operations Research, 2014, vol. 212, Jan. issue, 22 pp.

Abstract: Three models for the aggregated stochastic processes based on an underlying continuous-time Markov repairable system are developed in which two-part partition of states is used. Several availability measures such as interval avail...

Keywords: Two-part partition Aggregation Repairable systems Availability measures Distributions

10. FastRAQ: A Fast Approach to Range-Aggregate Queries in Big Data Environments

[Journal] Yun, Xiaochun; Wu, Guangjun; Zhang, Guangyan; Li, Keqin; Wang, Shupeng. IEEE Transactions on Cloud Computing, 2015, vol. 3, no. 2, 13 pp.

Abstract: Range-aggregate queries apply a certain aggregate function to all tuples within given query ranges. Existing approaches to range-aggregate queries are insufficient to quickly provide accurate results in big data environment...

Keywords: Balanced partition big data multidimensional histogram range-aggregate query
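
Illustrative note for entry 10: the abstract defines a range-aggregate query as applying an aggregate function to all tuples within given query ranges. The sketch below is an exact linear-scan version of that definition over a single key; it is not the FastRAQ method, and the function name, parameters, and sample records are hypothetical.

```python
def range_aggregate(tuples, lo, hi, key, value, agg=sum):
    """Apply `agg` to value(t) for every tuple t with lo <= key(t) <= hi."""
    return agg(value(t) for t in tuples if lo <= key(t) <= hi)


# Hypothetical (timestamp, amount) records.
records = [(1, 10.0), (4, 2.5), (7, 4.0), (9, 1.5)]

# SUM of amounts for timestamps in [3, 8].
total = range_aggregate(records, 3, 8, key=lambda t: t[0], value=lambda t: t[1])

# COUNT of records in the same range, expressed as a sum of ones.
count = range_aggregate(records, 3, 8, key=lambda t: t[0], value=lambda t: 1)
```

A full scan like this touches every tuple on every query, which is why approaches for large data sets rely on partitioning and precomputed summaries instead.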