Abstract:
We present a simple method to approximate the Fisher-Rao distance between multivariate normal distributions based on discretizing curves joining normal distributions and approximating the Fisher-Rao distances between successive nearby normal distributions on the curves by the square roots of their Jeffreys divergences. We consider experimentally the linear interpolation curves in the ordinary, natural, and expectation parameterizations of the normal distributions, and compare these curves with a curve derived from Calvo and Oller's isometric embedding of the Fisher-Rao $d$-variate normal manifold into the cone of $(d+1) \times (d+1)$ symmetric positive-definite matrices. We report on our experiments and assess the quality of our approximation technique by comparing the numerical approximations with both lower and upper bounds. Finally, we present several information-geometric properties of Calvo and Oller's isometric embedding.
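A minimal numerical sketch of the approximation scheme described above, assuming the linear interpolation curve in the ordinary parameterization $(\mu(t), \Sigma(t))$; the function names and the number of discretization steps are ours, and the Jeffreys divergence used is the standard closed form for multivariate normals.

```python
import numpy as np

def jeffreys_mvn(mu1, S1, mu2, S2):
    """Jeffreys divergence J = KL(p1:p2) + KL(p2:p1) between two multivariate normals."""
    d = len(mu1)
    S1i, S2i = np.linalg.inv(S1), np.linalg.inv(S2)
    dm = mu2 - mu1
    return 0.5 * (np.trace(S2i @ S1) + np.trace(S1i @ S2) - 2 * d
                  + dm @ (S1i + S2i) @ dm)

def fisher_rao_approx(mu1, S1, mu2, S2, T=1000):
    """Sum of sqrt(Jeffreys) between successive normals on the linear
    interpolation curve (mu(t), Sigma(t)) in the ordinary parameterization."""
    total = 0.0
    for k in range(T):
        t0, t1 = k / T, (k + 1) / T
        m0, m1 = (1 - t0) * mu1 + t0 * mu2, (1 - t1) * mu1 + t1 * mu2
        C0, C1 = (1 - t0) * S1 + t0 * S2, (1 - t1) * S1 + t1 * S2
        total += np.sqrt(jeffreys_mvn(m0, C0, m1, C1))
    return total

mu1, S1 = np.zeros(2), np.eye(2)
mu2, S2 = np.array([1.0, 2.0]), np.array([[2.0, 0.5], [0.5, 1.0]])
print(fisher_rao_approx(mu1, S1, mu2, S2))
```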
Abstract:
Clustering sets of histograms has become popular thanks to the success of the generic method of bag-of-X used in text categorization and in visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retrieval systems, we symmetrize the α-divergences using the concept of mixed divergences. First, we present a novel extension of k-means clustering to mixed divergences. Second, we extend the k-means++ seeding to mixed α-divergences and report a guaranteed probabilistic bound. Finally, we describe a soft clustering technique for mixed α-divergences.
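As a rough illustration of the kind of distortion involved, here is a sketch (ours, not the paper's exact definitions) of a symmetrized α-divergence between histograms and the assignment step of a k-means-style loop; α-divergence conventions vary across papers, and the paper's mixed divergences are more general than the plain arithmetic symmetrization used here.

```python
import numpy as np

def alpha_div(p, q, alpha):
    """Amari alpha-divergence between positive histograms (alpha != +/-1).
    Note: several parameterizations exist in the literature; this is one of them."""
    a, b = (1 - alpha) / 2, (1 + alpha) / 2
    return (4 / (1 - alpha ** 2)) * np.sum(a * p + b * q - p ** a * q ** b)

def sym_alpha_div(p, q, alpha):
    """Plain arithmetic symmetrization of the alpha-divergence."""
    return 0.5 * (alpha_div(p, q, alpha) + alpha_div(q, p, alpha))

def assign(histograms, centroids, alpha):
    """Hard assignment step of a k-means-style loop under this distortion."""
    return np.array([min(range(len(centroids)),
                         key=lambda j: sym_alpha_div(h, centroids[j], alpha))
                     for h in histograms])

hists = np.random.default_rng(0).dirichlet(np.ones(8), size=20)
centroids = hists[:3]
print(assign(hists, centroids, alpha=0.5))
```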
Abstract:
We prove that all $f$-divergences between univariate Cauchy distributions are symmetric. Furthermore, those $f$-divergences can be calculated as strictly increasing scalar functions of the chi-square divergence. We report a criterion which allows one to expand $f$-divergences as converging series of power chi divergences, and exemplify the technique for some $f$-divergences between Cauchy distributions. In contrast with the univariate case, we show that the $f$-divergences between multivariate Cauchy densities are in general asymmetric, although symmetric when the Cauchy scale matrices coincide. Then we prove that the square roots of the Kullback-Leibler and Bhattacharyya divergences between univariate Cauchy distributions yield complete metric spaces. Finally, we show that the square root of the Kullback-Leibler divergence between univariate Cauchy distributions can be isometrically embedded into a Hilbert space.
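A quick numerical illustration of the symmetry property for univariate Cauchy distributions, by direct quadrature. The closed-form relation printed at the end, KL = log(1 + χ²/2) with χ² = ((l1−l2)² + (s1−s2)²)/(2·s1·s2), is our recollection of the formulas appearing in this line of work and should be treated as an assumption to be checked against the numerical values.

```python
import numpy as np
from scipy.integrate import quad

def cauchy_pdf(x, loc, scale):
    return scale / (np.pi * ((x - loc) ** 2 + scale ** 2))

def kl_cauchy_numeric(l1, s1, l2, s2):
    """KL(p1:p2) between two univariate Cauchy densities, by quadrature."""
    f = lambda x: cauchy_pdf(x, l1, s1) * np.log(cauchy_pdf(x, l1, s1) / cauchy_pdf(x, l2, s2))
    return quad(f, -np.inf, np.inf)[0]

# Symmetry check: both orders give the same value.
print(kl_cauchy_numeric(0.0, 1.0, 2.0, 3.0))
print(kl_cauchy_numeric(2.0, 3.0, 0.0, 1.0))

# Assumed closed form to compare against: KL = log(1 + chi2 / 2) with
# chi2 = ((l1 - l2)**2 + (s1 - s2)**2) / (2 * s1 * s2).
chi2 = ((0.0 - 2.0) ** 2 + (1.0 - 3.0) ** 2) / (2 * 1.0 * 3.0)
print(np.log(1 + chi2 / 2))
```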
Abstract:
Evaluating the performance of Bayesian classification in a high-dimensional random tensor is a fundamental problem, usually difficult and under-studied. In this work, we consider two Signal to Noise Ratio (SNR)-based binary classification problems of interest. Under the alternative hypothesis, i.e., for a non-zero SNR, the observed signals are either a noisy rank-$R$ tensor admitting a $Q$-order Canonical Polyadic Decomposition (CPD) with large factors of size $N_q \times R$ for $1 \leq q \leq Q$, where $R, N_q \to \infty$ with $R^{1/q}/N_q$ converging towards a finite constant, or a noisy tensor admitting a Tucker Decomposition (TKD) of multilinear $(M_1, \ldots, M_Q)$-rank with large factors of size $N_q \times M_q$ for $1 \leq q \leq Q$, where $N_q, M_q \to \infty$ with $M_q/N_q$ converging towards a finite constant. The classification of the random entries (coefficients) of the core tensor in the CPD/TKD is hard to study since the exact derivation of the minimal Bayes' error probability is mathematically intractable. To circumvent this difficulty, the Chernoff Upper Bound (CUB) for larger SNR and the Fisher information at low SNR are derived and studied, based on information geometry theory. The tightest CUB is reached for the value minimizing the error exponent, denoted by $s^\star$. In general, due to the asymmetry of the $s$-divergence, the Bhattacharyya Upper Bound (BUB), that is, the Chernoff information calculated at $s = 1/2$, cannot solve this problem effectively. As a consequence, we rely on a costly numerical optimization strategy to find $s^\star$. However, thanks to powerful random matrix theory tools, a simple analytical expression of $s^\star$ is provided with respect to the SNR in the two schemes considered. This work shows that the BUB is the tightest bound at low SNRs; however, for higher SNRs this property no longer holds.
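To make the role of $s^\star$ concrete, here is a toy sketch (unrelated to the paper's random-matrix analysis) that computes the Chernoff $s$-divergence between two univariate Gaussians by quadrature and numerically finds the skew $s^\star$ at which the bound is tightest, together with the Bhattacharyya case $s = 1/2$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def chernoff_s_divergence(p, q, s):
    """D_s(p:q) = -log of the integral of p(x)^s * q(x)^(1-s) dx."""
    integral = quad(lambda x: p.pdf(x) ** s * q.pdf(x) ** (1 - s), -np.inf, np.inf)[0]
    return -np.log(integral)

p, q = norm(0.0, 1.0), norm(3.0, 2.0)

# The Chernoff bound is tightest at the skew s* maximizing D_s over (0, 1).
res = minimize_scalar(lambda s: -chernoff_s_divergence(p, q, s),
                      bounds=(1e-3, 1 - 1e-3), method="bounded")
s_star, chernoff_information = res.x, -res.fun

# The Bhattacharyya bound instead fixes s = 1/2 rather than optimizing.
bhattacharyya = chernoff_s_divergence(p, q, 0.5)
print(s_star, chernoff_information, bhattacharyya)
```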
Abstract:
With the recent advances in computed tomography and magnetic resonance devices, cross-sectional images are now commonly used for diagnosis. However, how contours between cross-sections should be connected is often ambiguous. In this paper, we propose an algorithm that enumerates all possible cases of the correspondence of contours. This is useful for achieving fully automatic interpolation of contours, although our current implementation still requires some degree of manual interaction.
Abstract:
The informational energy of Onicescu is a positive quantity that measures the amount of uncertainty of a random variable. However, contrary to Shannon’s entropy, the informational energy is strictly convex and increases when randomness decreases. We report a closed-form formula for Onicescu’s informational energy and its associated correlation coefficient when the probability distributions belong to an exponential family. We show how to instantiate the generic formula for several common exponential families. Finally, we discuss the characterization of valid thermodynamic process trajectories on a statistical manifold by enforcing that the entropy and the informational energy shall vary in opposite directions.
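For a concrete instance, the informational energy E(p) = ∫ p(x)² dx of a univariate normal density has the well-known closed form 1/(2σ√π); the sketch below (ours) checks it by quadrature, without reproducing the paper's generic exponential-family formula.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def informational_energy(pdf):
    """Onicescu informational energy E(p) = integral of p(x)^2 dx, by quadrature."""
    return quad(lambda x: pdf(x) ** 2, -np.inf, np.inf)[0]

sigma = 1.7
p = norm(0.0, sigma)
print(informational_energy(p.pdf))            # numerical value
print(1.0 / (2.0 * sigma * np.sqrt(np.pi)))   # closed form for a normal density
```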
Abstract:
We present several efficient dynamic data structures for point-enclosure queries involving convex fat objects in ℝ² or ℝ³. Our planar structures are in fact suited to a more general class of objects, (β,δ)-covered objects, which are not necessarily convex (see definition below). These structures are more efficient than alternative known structures because they exploit the fatness of the objects. We then apply these structures to obtain efficient solutions to two problems: (i) finding a perfect containment matching between a set of points and a set of convex fat objects, and (ii) finding a piercing set for a collection of convex fat objects, whose size is optimal up to some constant factor.
Abstract:
A lattice Gaussian distribution of given mean and covariance matrix is a discrete distribution supported on a lattice maximizing Shannon's entropy under these mean and covariance constraints. Lattice Gaussian distributions find applications in cryptography and in machine learning. The set of Gaussian distributions on a given lattice can be handled as a discrete exponential family whose partition function is related to the Riemann theta function. In this paper, we first report a formula for the Kullback-Leibler divergence between two lattice Gaussian distributions and then show how to efficiently approximate it numerically, either via Rényi's α-divergences or via the projective γ-divergences. We illustrate how to use the Kullback-Leibler divergence to calculate the Chernoff information on the dually flat structure of the manifold of lattice Gaussian distributions.
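The discrete-exponential-family viewpoint suggests the following hedged sketch: on a crude finite truncation of a 1D lattice, the Kullback-Leibler divergence computed directly can be compared with the generic Bregman identity on the log-partition function. This is ours and sidesteps the Riemann theta machinery of the paper; the truncation window and natural parameters are arbitrary choices.

```python
import numpy as np

# 1D lattice Gaussians on Z, crudely truncated to a finite window for numerics.
xs = np.arange(-60, 61)
t = np.stack([xs, xs ** 2], axis=1)            # sufficient statistics (x, x^2)

def log_partition(theta):
    return np.log(np.sum(np.exp(t @ theta)))

def density(theta):
    return np.exp(t @ theta - log_partition(theta))

def kl_direct(theta1, theta2):
    p = density(theta1)
    logp = t @ theta1 - log_partition(theta1)
    logq = t @ theta2 - log_partition(theta2)
    return np.sum(p * (logp - logq))

def kl_bregman(theta1, theta2):
    """Exponential-family identity: KL(p1:p2) = F(th2) - F(th1) - <th2 - th1, grad F(th1)>,
    where F is the log-partition function and grad F(th1) = E_p1[t(x)]."""
    grad = density(theta1) @ t
    return log_partition(theta2) - log_partition(theta1) - (theta2 - theta1) @ grad

th1 = np.array([0.3, -0.10])    # second natural parameter must be negative
th2 = np.array([-0.5, -0.05])
print(kl_direct(th1, th2), kl_bregman(th1, th2))
```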
Abstract:
Given a learning sample, we focus on the hardness of finding monomials having low error, inside the interval bounded below by the smallest error achieved by a monomial (the best rule) and bounded above by the error of the default class (the poorest rule). It is well known that when this lower bound is zero, it is an easy task to find, in linear time, a monomial with zero error. We prove that when this bound is not zero, regardless of where the default-class error lies in (0,1/2), it becomes computationally very hard to beat the default class by a significant margin. In fact, under some complexity-theoretic assumptions, it may already be hard to beat the trivial approximation ratios, even when relaxing the time complexity constraint to be quasi-polynomial or sub-exponential. Our results also hold with uniform weights over the examples.
Abstract:
In this paper, we first extend the result of Ali and Silvey [J R Stat Soc Ser B, 28:131-142, 1966], who proved that any f-divergence between two isotropic multivariate Gaussian distributions amounts to a corresponding strictly increasing scalar function of their Mahalanobis distance. We give sufficient conditions on the standard probability density function generating a multivariate location family and on the function generator f in order to generalize this result. This property is useful in practice as it allows one to compare f-divergences between densities of these location families exactly via their corresponding Mahalanobis distances, even when the f-divergences are not available in closed form, as is the case, for example, for the Jensen-Shannon divergence or the total variation distance between densities of a normal location family. Second, we consider f-divergences between densities of multivariate scale families: we recall Ali and Silvey's result that for normal scale families we obtain matrix spectral divergences, and we extend this result to densities of a scale family.
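As an illustration of the property in the simplest case (ours, not the paper's general conditions): for the univariate normal location family, the total variation distance depends on the two means only through the Mahalanobis distance Δ = |μ1 − μ2|/σ, with the standard closed form 2Φ(Δ/2) − 1.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def total_variation(p, q):
    """TV(p, q) = (1/2) * integral of |p(x) - q(x)| dx, by quadrature."""
    return 0.5 * quad(lambda x: abs(p.pdf(x) - q.pdf(x)), -np.inf, np.inf)[0]

# Two pairs from the normal location family with the same Mahalanobis distance
# delta = |mu1 - mu2| / sigma = 1.5 give the same total variation ...
print(total_variation(norm(0.0, 1.0), norm(1.5, 1.0)))
print(total_variation(norm(3.0, 2.0), norm(6.0, 2.0)))
# ... equal to the strictly increasing function 2 * Phi(delta / 2) - 1 of delta.
print(2 * norm.cdf(1.5 / 2) - 1)
```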