Try Before You Buy: a Practical Data Purchasing Algorithm for Real-World Data Marketplaces
dc.contributor.author | Andres, Santiago | |
dc.contributor.author | Laoutaris, Nikolaos | |
dc.date.accessioned | 2022-10-31T09:39:17Z | |
dc.date.available | 2022-10-31T09:39:17Z | |
dc.date.issued | 2022-11-30 | |
dc.identifier.uri | https://hdl.handle.net/20.500.12761/1640 | |
dc.description.abstract | Data trading is becoming increasingly popular, as evidenced by the appearance of scores of data marketplaces (DMs) in the last few years to satisfy the demand for third-party data. For buyers, however, deciding whether paying the requested price makes sense is only possible after testing the data on their ML model. In this paper, we propose a method for optimizing data purchasing decisions. We show that if a marketplace provides potential buyers with a measure of the performance of their models on individual datasets, then they can select which of them to buy with an efficacy that approximates that of knowing the performance of each possible combination of datasets offered by the DM. We call the resulting algorithm Try Before You Buy (TBYB) and demonstrate over synthetic and real-world datasets how TBYB can lead to near-optimal data purchasing with only O(N) instead of O(2^N) information and execution time. | es |
dc.description.sponsorship | EU Horizon 2020 | es |
dc.language.iso | eng | es |
dc.title | Try Before You Buy: a Practical Data Purchasing Algorithm for Real-World Data Marketplaces | es |
dc.type | conference object | es |
dc.conference.date | 9 December 2022 | es |
dc.conference.place | Rome, Italy | es |
dc.conference.title | ACM Data Economy Workshop | * |
dc.event.type | workshop | es |
dc.pres.type | paper | es |
dc.type.hasVersion | AM | es |
dc.rights.accessRights | open access | es |
dc.relation.projectID | https://cordis.europa.eu/project/id/101070069 | es |
dc.relation.projectName | DataBri-X (Data Process & Technological Bricks for expanding digital value creation in European Data Spaces) | es |
dc.subject.keyword | data economy | es |
dc.subject.keyword | value of data | es |
dc.subject.keyword | data purchasing | es |
dc.subject.keyword | data marketplace | es |
dc.description.refereed | TRUE | es |
dc.description.status | pub | es |
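
The abstract contrasts O(N) per-dataset information with the O(2^N) cost of evaluating every combination of datasets. The following is a minimal illustrative sketch of that idea, not the authors' TBYB algorithm, whose details this record does not give. It assumes a hypothetical marketplace that reports one score per individual dataset, a hypothetical buyer-side function `value_of` that returns the monetary value of a model trained on a set of purchased datasets, and a simple stopping rule (stop at the first unprofitable purchase). All names, prices, and the toy value function are assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, List


def greedy_purchase(
    solo_score: Dict[str, float],
    price: Dict[str, float],
    value_of: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Buy datasets in descending order of their individual score,
    stopping at the first one whose marginal value does not exceed its price.

    Assumption: this ordering and stopping rule are illustrative only.
    The loop calls value_of at most N + 1 times, instead of once per
    subset (2^N) as an exhaustive search would.
    """
    bought: List[str] = []
    current_value = value_of(frozenset())
    # Rank candidates using only the per-dataset scores (O(N) information).
    for name in sorted(solo_score, key=solo_score.get, reverse=True):
        candidate_value = value_of(frozenset(bought + [name]))
        gain = candidate_value - current_value
        if gain <= price[name]:
            break  # not worth the asking price: stop buying
        bought.append(name)
        current_value = candidate_value
    return bought


# Toy data (hypothetical): three datasets with individual scores in [0, 1].
solo = {"A": 0.9, "B": 0.7, "C": 0.4}
prices = {"A": 10.0, "B": 10.0, "C": 10.0}


def toy_value(datasets: FrozenSet[str]) -> float:
    """Toy value function with diminishing returns (an assumption)."""
    miss = 1.0
    for d in datasets:
        miss *= 1.0 - solo[d]
    return 100.0 * (1.0 - miss)


result = greedy_purchase(solo, prices, toy_value)
print(result)
```

With these toy numbers, dataset A adds far more value than its price, while adding B on top of A yields a gain below the asking price, so the sketch stops after buying A alone.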