TY - GEN
T1 - Missing Data Imputation Using Data Generated by GAN
AU - Hammad Alharbi, Hanan
AU - Kimura, Masaomi
N1 - Publisher Copyright: © 2020 ACM.
PY - 2020/8/5
Y1 - 2020/8/5
AB - Missing data is a common and challenging problem that arises in many research domains and complicates data analysis. Handling missing data is therefore a necessity, as many previous studies have noted. In this paper, we propose two methods, randomGAN and meshGAN, that impute missing values in numerical datasets using data generated by a GAN and determine the imputed values using Euclidean distance. We evaluated the imputation accuracy of all methods at various missing-data percentages using MAE and RMSE. The proposed methods produce the best imputation accuracy on 2 out of 4 datasets against three compared methods: SimpleImputer, IterativeImputer, and KNNimputer.
KW - GAN
KW - KNNimputer
KW - Machine learning
KW - Missing data imputation
UR - http://www.scopus.com/inward/record.url?scp=85096339994&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096339994&partnerID=8YFLogxK
U2 - 10.1145/3418688.3418701
DO - 10.1145/3418688.3418701
M3 - Conference contribution
AN - SCOPUS:85096339994
T3 - ACM International Conference Proceeding Series
SP - 73
EP - 77
BT - ICCBD 2020 - 2020 3rd International Conference on Computing and Big Data
PB - Association for Computing Machinery
T2 - 3rd International Conference on Computing and Big Data, ICCBD 2020 and its Workshop the 2020 2nd International Conference on Computer, Software Engineering and Applications, CSEA 2020
Y2 - 5 August 2020 through 7 August 2020
ER -