FreqyWM: Frequency WaterMarking for the New Data Economy
Date
2022
Abstract
We present a novel technique for modulating the appearance
frequency of a few tokens within a dataset for encoding an
invisible watermark that can be used to protect ownership
rights over data. We develop optimal as well as fast heuristic
algorithms for creating and verifying such watermarks.
We also demonstrate the robustness of our technique against
various attacks and derive analytical bounds for the false-positive
probability of “detecting” a watermark on
a dataset that does not carry it. Our technique is applicable
to both single-dimensional and multidimensional datasets, is
independent of token type, and can be used in a variety of use
cases that involve buying and selling data in contemporary
data marketplaces.
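
The idea of encoding a watermark in token appearance frequencies can be illustrated with a toy sketch. This is not the algorithm from the paper, which the abstract does not spell out. It assumes, purely for illustration, that the owner holds a secret key, picks a few disjoint token pairs, and pads the dataset so that the frequency difference of each pair is divisible by a key-derived modulus. The names `pair_modulus`, `embed`, `verify`, and the parameter `z` are all hypothetical.

```python
import hashlib
import hmac
from collections import Counter


def pair_modulus(secret: bytes, a: str, b: str, z: int = 7) -> int:
    """Hypothetical helper: derive a modulus in [2, z + 1] for a token pair."""
    digest = hmac.new(secret, f"{a}|{b}".encode(), hashlib.sha256).digest()
    return 2 + int.from_bytes(digest[:4], "big") % z


def embed(tokens, pairs, secret, z=7):
    """Return a copy of `tokens` padded so that, for each pair (a, b),
    (freq(a) - freq(b)) is divisible by the pair's secret modulus.

    Assumption: the pairs are disjoint, so one adjustment never undoes another.
    """
    out = list(tokens)
    freq = Counter(out)
    for a, b in pairs:
        m = pair_modulus(secret, a, b, z)
        r = (freq[a] - freq[b]) % m  # remainder to cancel
        out.extend([b] * r)          # adding r copies of b makes the difference ≡ 0 (mod m)
        freq[b] += r
    return out


def verify(tokens, pairs, secret, z=7):
    """Return the fraction of pairs whose frequency difference matches the key."""
    freq = Counter(tokens)
    hits = sum(
        (freq[a] - freq[b]) % pair_modulus(secret, a, b, z) == 0
        for a, b in pairs
    )
    return hits / len(pairs)


data = ["a"] * 10 + ["b"] * 6 + ["c"] * 5 + ["d"] * 1
pairs = [("a", "b"), ("c", "d")]
marked = embed(data, pairs, b"owner-secret")
score = verify(marked, pairs, b"owner-secret")
```

In this sketch a verifier would accept the watermark when `score` exceeds some threshold. A real scheme would also need to choose which tokens to modulate, limit how much the data is distorted, and bound how often an unmarked dataset passes by chance, which are the concerns the abstract attributes to the actual algorithms.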