Estimation Efficiency Under Privacy Constraints

Citation:

Shahab Asoodeh, Mario Diaz, Fady Alajaji, and Tamas Linder. 2019. “Estimation Efficiency Under Privacy Constraints.” IEEE Transactions on Information Theory, 65, 3, Pp. 1512-1534.

Abstract:

We investigate the problem of estimating a random variable Y under a privacy constraint dictated by another correlated random variable X. When X and Y are discrete, we express the underlying privacy-utility tradeoff in terms of the privacy-constrained guessing probability h(P_XY, epsilon), that is, the maximum probability P_c(Y|Z) of correctly guessing Y given an auxiliary random variable Z, where the maximization is taken over all P_{Z|Y} ensuring that P_c(X|Z) ≤ epsilon for a given privacy threshold epsilon ≥ 0. We prove that h(P_XY, epsilon) is concave and piecewise linear in epsilon, which allows us to derive its expression in closed form for any epsilon when X and Y are binary. In the non-binary case, we derive h(P_XY, epsilon) in the high-utility regime (i.e., for sufficiently large, but nontrivial, values of epsilon) under the assumption that Y and Z have the same alphabets. We also analyze the privacy-constrained guessing probability for two scenarios in which X, Y, and Z are binary vectors. When X and Y are continuous random variables, we formulate the corresponding privacy-utility tradeoff in terms of sENSR(P_XY, epsilon), the smallest normalized minimum mean squared error (MMSE) incurred in estimating Y from a Gaussian perturbation Z. Here, the minimization is taken over a family of Gaussian perturbations Z for which the MMSE of f(X) given Z is within a factor (1 - epsilon) of the variance of f(X) for any non-constant real-valued function f. We derive tight upper and lower bounds for sENSR when Y is Gaussian. For general absolutely continuous random variables, we obtain a tight lower bound for sENSR(P_XY, epsilon) in the high-privacy regime, i.e., for small epsilon.
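To make the discrete formulation concrete, the following is a minimal numerical sketch, not the closed-form result derived in the paper. It assumes the Markov chain X - Y - Z, represents the joint distribution of (X, Y) as a matrix, and evaluates the two guessing probabilities for a given channel from Y to Z. The function names, the restriction to binary Y and binary Z, and the grid search are all illustrative assumptions. Because Z is restricted and the grid is finite, the search can only return a lower approximation of the privacy-constrained guessing probability h.

```python
import numpy as np


def guessing_probability(joint):
    """P_c(U|Z) = sum_z max_u P_UZ(u, z); rows index u, columns index z."""
    return float(np.max(joint, axis=0).sum())


def privacy_utility(p_xy, channel):
    """Return (P_c(X|Z), P_c(Y|Z)) assuming the Markov chain X - Y - Z.

    p_xy[x, y] is the joint pmf of (X, Y).
    channel[y, z] is the conditional pmf of Z given Y = y.
    """
    p_y = p_xy.sum(axis=0)
    p_yz = p_y[:, None] * channel   # joint pmf of (Y, Z)
    p_xz = p_xy @ channel           # joint pmf of (X, Z), summing over y
    return guessing_probability(p_xz), guessing_probability(p_yz)


def h_grid(p_xy, eps, steps=201):
    """Grid-search lower approximation of h for binary Y and binary Z.

    Hypothetical helper: returns None when no channel on the grid
    satisfies P_c(X|Z) <= eps.
    """
    best = None
    grid = np.linspace(0.0, 1.0, steps)
    for a in grid:          # a = P(Z = 0 | Y = 0)
        for b in grid:      # b = P(Z = 0 | Y = 1)
            channel = np.array([[a, 1.0 - a], [b, 1.0 - b]])
            leak, util = privacy_utility(p_xy, channel)
            # Small tolerance guards against floating-point round-off.
            if leak <= eps + 1e-12 and (best is None or util > best):
                best = util
    return best
```

For example, calling `h_grid` on a symmetric binary joint distribution with a threshold below the largest marginal probability of X returns None, since observing Z can never make X harder to guess than guessing from its prior alone.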