Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (7): 1487-1505. DOI: 10.3778/j.issn.1673-9418.2303025
WANG Yu, SUN Haichun
Online: 2023-07-01
Published: 2023-07-01
Abstract: Visual question answering (VQA) is a popular image-text cross-modal task that fuses natural language processing and computer vision techniques. Its main goal is for a computer to intelligently recognize and retrieve image content and return accurate answers. It combines techniques such as object recognition and detection, question answering, image attribute classification, and scene analysis, and can underpin many cutting-edge high-level interactive AI tasks such as visual dialogue and visual navigation, giving it broad application prospects and high practical value. In recent years, advances in AI models for computer vision, natural language processing, and image-text cross-modal learning have provided many new techniques and methods for realizing the VQA task. This paper summarizes the mainstream models and specialized datasets in the VQA field from 2019 to 2022. First, following the modular framework in which a VQA system is implemented, the mainstream techniques used at each key step are reviewed and discussed. Second, models in the field are categorized by the techniques they adopt, and the focus of their improvements and their limitations are briefly introduced. Then, the commonly used VQA datasets and evaluation metrics are surveyed, and the performance of several typical models is compared. Finally, the pressing open problems in the field are highlighted, and future applications and technical developments of VQA are forecast.
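The modular pipeline the abstract refers to — visual feature extraction, question encoding, multimodal fusion, and answer prediction — can be illustrated with a minimal sketch. The PyTorch code below is an illustrative assumption, not any specific model reviewed in this survey: the module sizes, the LSTM question encoder, and the element-wise-product fusion are placeholder choices (num_answers=3129 follows the common VQA v2 answer vocabulary). The vqa_accuracy helper implements the standard soft-accuracy metric, min(#matching human answers / 3, 1), used on the VQA benchmark.

```python
import torch
import torch.nn as nn

class MinimalVQA(nn.Module):
    """Toy VQA model: encode question, project image features, fuse, classify."""

    def __init__(self, vocab_size=10000, num_answers=3129,
                 img_dim=2048, embed_dim=300, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)      # word embeddings
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.img_proj = nn.Linear(img_dim, hidden_dim)        # project CNN features
        self.classifier = nn.Linear(hidden_dim, num_answers)  # answer logits

    def forward(self, img_feats, question_tokens):
        # question_tokens: (batch, seq_len) word ids; img_feats: (batch, img_dim)
        _, (h_n, _) = self.lstm(self.embed(question_tokens))
        q = h_n[-1]                        # last hidden state: (batch, hidden_dim)
        v = torch.relu(self.img_proj(img_feats))
        return self.classifier(q * v)      # element-wise-product fusion


def vqa_accuracy(predicted, human_answers):
    """Soft accuracy from the VQA benchmark: min(#matching humans / 3, 1)."""
    return min(sum(a == predicted for a in human_answers) / 3.0, 1.0)


# Usage sketch with random inputs (batch of 2):
model = MinimalVQA()
logits = model(torch.randn(2, 2048), torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # torch.Size([2, 3129])
print(vqa_accuracy("cat", ["cat"] * 8 + ["dog"] * 2))  # 1.0
```

The mainstream models the survey categorizes typically replace the element-wise fusion with bilinear pooling, co-attention, or Transformer-based cross-modal encoders, but the overall extract-encode-fuse-classify skeleton is the same.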
WANG Yu, SUN Haichun. Review of Visual Question Answering Technology[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(7): 1487-1505.