[1] Aizawa K, Maruyama Y, Li H, et al. Food balance estimation by using personal dietary tendencies in a multimedia food log[J]. IEEE Transactions on Multimedia, 2013, 15(8): 2176-2185.
[2] Hassannejad H, Matrella G, Ciampolini P, et al. Automatic diet monitoring: a review of computer vision and wearable sensor-based methods[J]. International Journal of Food Sciences and Nutrition, 2017, 68(6): 656-670.
[3] Ortiz A, Covic A, Fliser D, et al. Epidemiology, contributors to, and clinical trials of mortality risk in chronic kidney failure[J]. The Lancet, 2014, 383(9931): 1831-1843.
[4] Zhang L X, Wang F, Wang L, et al. Prevalence of chronic kidney disease in China: a cross-sectional survey[J]. The Lancet, 2012, 379(9818): 815-822.
[5] Aizawa K, Ogawa M. FoodLog: multimedia tool for healthcare applications[J]. IEEE MultiMedia, 2015, 22(2): 4-8.
[6] Tanno R, Okamoto K, Yanai K. DeepFoodCam: a DCNN-based real-time mobile food recognition system[C]//Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management, Amsterdam, Oct 16, 2016. New York: ACM, 2016: 89.
[7] Ming Z Y, Chen J J, Cao Y, et al. Food photo recognition for dietary tracking: system and experiment[C]//LNCS 10705: Proceedings of the 2018 International Conference on MultiMedia Modeling, Bangkok, Feb 5-7, 2018. Berlin, Heidelberg: Springer, 2018: 129-141.
[8] Chen J J, Ngo C W. Deep-based ingredient recognition for cooking recipe retrieval[C]//Proceedings of the 2016 ACM Conference on Multimedia Conference, Amsterdam, Oct 15-19, 2016. New York: ACM, 2016: 32-41.
[9] Salvador A, Hynes N, Aytar Y, et al. Learning cross-modal embeddings for cooking recipes and food images[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 3020-3028.
[10] Chen J J, Ngo C W, Feng F L, et al. Deep understanding of cooking procedure for cross-modal recipe retrieval[C]//Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Oct 22-26, 2018. New York: ACM, 2018: 1020-1028.
[11] Carvalho M, Cadène R, Picard D, et al. Cross-modal retrieval in the cooking context: learning semantic text-image embeddings[C]//Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, Jul 8-12, 2018. New York: ACM, 2018: 35-44.
[12] Wang K, Yin Q, Wang W, et al. A comprehensive survey on cross-modal retrieval[J]. arXiv:1607.06215, 2016.
[13] Yamakata Y, Imahori S, Maeta H, et al. A method for extracting major workflow composed of ingredients, tools, and actions from cooking procedural text[C]//Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, Seattle, Jul 11-15, 2016. Washington: IEEE Computer Society, 2016: 1-6.
[14] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Proceedings of the 2017 Annual Conference on Neural Information Processing Systems, Long Beach, Dec 4-9, 2017. Red Hook: Curran Associates, 2017: 5998-6008.
[15] Lai P L, Fyfe C. Kernel and nonlinear canonical correlation analysis[J]. International Journal of Neural Systems, 2000, 10(5): 365-377.
[16] Andrew G, Arora R, Bilmes J, et al. Deep canonical correlation analysis[C]//Proceedings of the 30th International Conference on Machine Learning, Atlanta, Jun 16-21, 2013: 1247-1255.
[17] Feng F X, Wang X J, Li R F. Cross-modal retrieval with correspondence autoencoder[C]//Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, Nov 3-7, 2014. New York: ACM, 2014: 7-16.
[18] Socher R, Karpathy A, Le Q V, et al. Grounded compositional semantics for finding and describing images with sentences[J]. Transactions of the Association for Computational Linguistics, 2014, 2: 207-218.
[19] Zhan Y B, Yu J, Yu Z, et al. Comprehensive distance-preserving autoencoders for cross-modal retrieval[C]//Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Oct 22-26, 2018. New York: ACM, 2018: 1137-1145.
[20] Yanai K, Kawano Y. Food image recognition using deep convolutional network with pre-training and fine-tuning[C]//Proceedings of the 2015 IEEE International Conference on Multimedia & Expo Workshops, Turin, Jun 29-Jul 3, 2015. Washington: IEEE Computer Society, 2015: 1-6.
[21] Meyers A, Johnston N, Rathod V, et al. Im2Calories: towards an automated mobile vision food diary[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Dec 7-13, 2015. Washington: IEEE Computer Society, 2015: 1233-1241.
[22] Chen J J, Ngo C W, Chua T S. Cross-modal recipe retrieval with rich food attributes[C]//Proceedings of the 2017 ACM Conference on Multimedia, Mountain View, Oct 23-27, 2017. New York: ACM, 2017: 1771-1779.
[23] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 770-778.
[24] Castrejón L, Aytar Y, Vondrick C, et al. Learning aligned cross-modal representations from weakly aligned data[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 2940-2949.