Journal of Frontiers of Computer Science and Technology
LI Huimin, MA Jianwei, ZANG Shaofei, SONG Yanbing
李慧敏, 马建伟, 臧绍飞, 宋彦兵
Abstract: Cross-domain mean approximation (CDMA) is an efficient measure of the distribution difference between domains. It quantifies this difference by computing the distance from each sample in one domain to the sample mean of the other domain, which in turn facilitates cross-domain knowledge transfer. In practical applications, however, the marginal and conditional distributions of the data are often unbalanced, whereas CDMA weights the marginal and conditional distribution differences equally and ignores the disparity between them, which limits its effectiveness in transfer learning. To address this, this study first improves CDMA by introducing an adaptation factor, yielding a dynamic CDMA that evaluates the marginal distribution error and the conditional distribution error between the source and target domains. Second, building on dynamic CDMA, a dynamic adaptive cross-domain mean approximation (DA-CDMA) feature extraction algorithm is proposed to extract domain-invariant features and thereby realize cross-domain knowledge transfer. In addition, to reduce the mean shift caused by individual outlying samples far from the mean center during feature extraction, a mean update mechanism is proposed that updates the mean within each class, increasing the stability and accuracy of transfer. Finally, classification experiments on publicly available transfer learning datasets verify the effectiveness of the method.
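The distance described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the squared Euclidean distance, the averaging over samples, the use of target pseudo-labels `yt_pseudo`, and the convex weighting `(1 - mu) * marginal + mu * conditional` with adaptation factor `mu` are all assumptions made here for illustration. The function names are hypothetical as well.

```python
import numpy as np


def cdma_marginal(Xs, Xt):
    """Sample-to-opposite-mean distance between two domains.

    Assumption: squared Euclidean distance, averaged over the samples
    of each domain, then summed over both directions.
    """
    mean_s = Xs.mean(axis=0)
    mean_t = Xt.mean(axis=0)
    # Source samples against the target mean, and vice versa.
    d_s = np.sum((Xs - mean_t) ** 2, axis=1).mean()
    d_t = np.sum((Xt - mean_s) ** 2, axis=1).mean()
    return float(d_s + d_t)


def cdma_conditional(Xs, ys, Xt, yt_pseudo):
    """Class-wise version of the same distance.

    Assumption: target labels are unknown, so pseudo-labels
    (for example from a classifier trained on the source) stand in.
    """
    total = 0.0
    for c in np.unique(ys):
        Xs_c = Xs[ys == c]
        Xt_c = Xt[yt_pseudo == c]
        if len(Xs_c) == 0 or len(Xt_c) == 0:
            continue  # class absent in one domain, nothing to compare
        total += cdma_marginal(Xs_c, Xt_c)
    return total


def dynamic_cdma(Xs, ys, Xt, yt_pseudo, mu):
    """Weighted combination of marginal and conditional terms.

    Assumption: mu in [0, 1] plays the role of the adaptation factor;
    mu = 0 keeps only the marginal term, mu = 1 only the conditional.
    """
    marginal = cdma_marginal(Xs, Xt)
    conditional = cdma_conditional(Xs, ys, Xt, yt_pseudo)
    return (1.0 - mu) * marginal + mu * conditional


if __name__ == "__main__":
    Xs = np.array([[0.0, 0.0], [2.0, 0.0]])
    ys = np.array([0, 1])
    Xt = np.array([[1.0, 1.0], [1.0, 3.0]])
    yt_pseudo = np.array([0, 1])
    print(dynamic_cdma(Xs, ys, Xt, yt_pseudo, mu=0.5))
```

With an equal weighting (`mu = 0.5`) the sketch treats both terms alike, as the abstract says of the original CDMA; moving `mu` toward 0 or 1 shifts the emphasis to the marginal or the conditional term. How the paper actually estimates the factor, and how the mean update mechanism is defined, is not stated in the abstract and is not modeled here.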
Key words: distribution difference, cross-domain mean approximation, dynamic adaptation, distribution difference measurement, classification, mean updating
摘要: 跨域均值逼近(CDMA)是一种高效的领域间分布差异度量方法。它通过计算一个域的样本到另一个域样本均值的距离来度量领域间的样本分布差异,进而可以促进知识的跨领域迁移。然而,在实际应用中,数据的边缘分布和条件分布往往是不平衡的,CDMA平等地追求度量边缘分布和条件分布的差异,而不考虑二者之间的差异性,导致其在迁移学习中的效率不高。为此,本研究首先对CDMA进行改进,引入适应性因子,设计动态CDMA评估源域和目标域之间的边缘分布误差和条件分布误差;其次,在动态CDMA基础上,提出动态自适应跨域均值逼近(DA-CDMA)特征提取算法,来提取领域间不变的特征,以实现知识的跨领域迁移。此外,特征提取过程中为减小个别远离均值中心的不良样本对均值造成的偏移,提出均值更新机制,在类内进行均值更新,增加迁移的稳定性和准确性。最后,在公开的迁移学习数据集上进行分类实验,验证了该方法的有效性。
关键词: 分布差异, 跨域均值逼近, 动态自适应, 分布差异度量, 分类, 均值更新
LI Huimin, MA Jianwei, ZANG Shaofei, SONG Yanbing. Dynamic adaptive cross-domain mean approximation[J]. Journal of Frontiers of Computer Science and Technology, DOI: 10.3778/j.issn.1673-9418.2405070.
李慧敏, 马建伟, 臧绍飞, 宋彦兵. 动态自适应跨域均值逼近[J]. 计算机科学与探索, DOI: 10.3778/j.issn.1673-9418.2405070.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2405070