[1] ZENG Z N, YAO Y, LIU Z Y, et al. A deep-learning system bridging molecule structure and biomedical text with comprehension comparable to human professionals[J]. Nature Communications, 2022, 13: 862.
[2] EDWARDS C, LAI T, ROS K, et al. Translation between molecules and natural language[C]//Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2022: 375-413.
[3] LIU Z Q, ZHANG W, XIA Y C, et al. MolXPT: wrapping molecules with text for generative pre-training[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2023: 1606-1616.
[4] WEININGER D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules[J]. Journal of Chemical Information and Computer Sciences, 1988, 28(1): 31-36.
[5] SU B, DU D Z, YANG Z, et al. A molecular multimodal foundation model associating molecule graphs with natural language[EB/OL]. [2025-08-10]. https://arxiv.org/abs/2209.05481.
[6] LIU S C, NIE W L, WANG C P, et al. Multi-modal molecule structure-text model for text-based retrieval and editing[J]. Nature Machine Intelligence, 2023, 5(12): 1447-1457.
[7] ZHAO W Y, ZHOU D, CAO B Q, et al. Adversarial modality alignment network for cross-modal molecule retrieval[J]. IEEE Transactions on Artificial Intelligence, 2024, 5(1): 278-289.
[8] SONG J, ZHUANG W R, LIN Y J, et al. Towards cross-modal text-molecule retrieval with better modality alignment[C]//Proceedings of the 2024 IEEE International Conference on Bioinformatics and Biomedicine. Piscataway: IEEE, 2024: 1161-1168.
[9] MIN Z J, LIU B S, ZHANG L, et al. Exploring optimal transport-based multi-grained alignments for text-molecule retrieval[C]//Proceedings of the 2024 IEEE International Conference on Bioinformatics and Biomedicine. Piscataway: IEEE, 2024: 2317-2324.
[10] EDWARDS C, ZHAI C X, JI H. Text2Mol: cross-modal molecule retrieval with natural language queries[C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2021: 595-607.
[11] NOMA Y. Inverse stereographic projecting hashing for fast similarity search[J]. Journal of Information Processing, 2017, 25: 366-375.
[12] NOMA Y, KONOSHIMA M. Eclipse hashing: Alexandrov compactification and hashing with hyperspheres for fast similarity search[EB/OL]. [2025-08-10]. https://arxiv.org/abs/1406.3882.
[13] WU H Y, ZENG P J, ZHENG W X, et al. CLASS: enhancing cross-modal text-molecule retrieval performance and training efficiency[EB/OL]. [2025-08-10]. https://arxiv.org/abs/2502.11633.
[14] HARDOON D R, SZEDMAK S, SHAWE-TAYLOR J. Canonical correlation analysis: an overview with application to learning methods[J]. Neural Computation, 2004, 16(12): 2639-2664.
[15] LAI P L, FYFE C. Kernel and nonlinear canonical correlation analysis[J]. International Journal of Neural Systems, 2000, 10(5): 365-377.
[16] ANDREW G, ARORA R, BILMES J, et al. Deep canonical correlation analysis[C]//Proceedings of the 30th International Conference on International Conference on Machine Learning. New York: ACM, 2013: 1247-1255.
[17] WEISS Y, TORRALBA A, FERGUS R. Spectral hashing[C]//Advances in Neural Information Processing Systems 21. Red Hook: Curran Associates, 2008: 1753-1760.
[18] ZHANG D, WANG J, CAI D, et al. Self-taught hashing for fast similarity search[C]//Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2010: 18-25.
[19] GONG Y C, LAZEBNIK S, GORDO A, et al. Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(12): 2916-2929.
[20] KIM S, THIESSEN P A, BOLTON E E, et al. PubChem substance and compound databases[J]. Nucleic Acids Research, 2016, 44(D1): D1202-D1213.
[21] HASTINGS J, OWEN G, DEKKER A, et al. ChEBI in 2016: improved services and an expanding collection of metabolites[J]. Nucleic Acids Research, 2016, 44(D1): D1214-D1219.
[22] HAN J W, PEI J. Mining frequent patterns by pattern-growth: methodology and implications[J]. ACM SIGKDD Explorations Newsletter, 2000, 2(2): 14-20.
[23] LIU Z Y, LI S H, LUO Y C, et al. MolCA: molecular graph-language modeling with cross-modal projector and uni-modal adapter[C]//Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2023: 15623-15638.
[24] ZHANG Y K, YE G Y, YUAN C H, et al. Atomas: hierarchical alignment on molecule-text for unified molecule understanding and generation[EB/OL]. [2025-08-10]. https://arxiv.org/abs/2404.16880.
[25] BELTAGY I, LO K, COHAN A. SciBERT: a pretrained language model for scientific text[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2019: 3613-3618.
[26] LUO Y Z, YANG K, HONG M, et al. MolFM: a multimodal molecular foundation model[EB/OL]. [2025-08-10]. https://arxiv.org/abs/2307.09484.
[27] JAEGER S, FULLE S, TURK S. Mol2vec: unsupervised machine learning approach with chemical intuition[J]. Journal of Chemical Information and Modeling, 2018, 58(1): 27-35.
[28] KINGMA D P, BA J. Adam: a method for stochastic optimization[C]//Proceedings of the 3rd International Conference on Learning Representations, 2015.