dc.contributor.author: Dos Santos, Ricardo
dc.contributor.author: Aguilar, Jose
dc.date.accessioned: 2024-07-15T13:18:27Z
dc.date.available: 2024-07-15T13:18:27Z
dc.date.issued: 2024-06-30
dc.identifier.issn: 2192-6352 [es]
dc.identifier.uri: https://hdl.handle.net/20.500.12761/1828
dc.description.abstract: Currently, the generation of synthetic data has become very fashionable, either due to the need to create data in certain specific contexts or to study unknown scenarios among other reasons. Additionally, synthetic data is a critical component in training machine learning models in the presence of little data. This work proposes a Synthetic Data Generation System (SDGS) architecture to allow synthetic data generation to be fully automated. SDGS is based on the Variational AutoEncoders (VAE) learning technique, and has three main capabilities. The first is related to the ability to extract data samples from multiple sources using the Linked Data (LD) paradigm. The second is linked to the ability to merge data sets to increase the amount of information that can be provided to the VAE-based synthetic data generator. The last one is related to having a Feature Engineering layer to create new features by generating or extracting information from the dataset and then selecting the features that provide the best information for the VAE model. A case study is described in detail to show the new functionalities of the SDGS, such as dataset extraction from different sources using LD, dataset merging using pivots, and the application of different feature engineering methods. Finally, two metrics are used to evaluate the quality of the generated datasets in different case studies. The first one is the accuracy to analyze the performance of the models generated with the new SDGS functionalities, obtaining results above 90%. The second one is the two-Sample Hotelling's T-Squared Test to determine the quality of the synthetic data generated by the system, obtaining synthetic datasets very similar to the original datasets. [es]
dc.language.iso: eng [es]
dc.publisher: Springer [es]
dc.title: A Synthetic Data Generation System based on the Variational-Autoencoder Technique and the Linked Data Paradigm [es]
dc.type: journal article [es]
dc.journal.title: Progress in Artificial Intelligence [es]
dc.type.hasVersion: AO [es]
dc.rights.accessRights: embargoed access [es]
dc.volume.number: 13 [es]
dc.identifier.doi: 10.1007/s13748-024-00328-x [es]
dc.page.final: 163 [es]
dc.page.initial: 149 [es]
dc.subject.keyword: Synthetic Data Generator, Linked Data, Variational Autoencoders [es]
dc.description.refereed: TRUE [es]
dc.description.status: pub [es]

