Few-shot learning refers to training and applying deep learning models when sample data are scarce or of low quality. To address the shortage of sample data that deep-learning-based object detection may face in practice, this survey first analyzes the mathematical model of few-shot object detection and its sources of error. Methods applicable to few-shot object detection are then reviewed in three categories, namely data, model, and algorithm, and the shortcomings of each scheme are noted. Drawing on practical explorations of few-shot object detection, recent attempts and their results are enumerated, and other deep learning settings related to few-shot learning are briefly introduced. Finally, after analyzing the open problems in few-shot detection, future development directions and research trends of few-shot object detection are discussed.
LIU Hao-yu, WANG Xiang-jun.
A Survey of Few-shot Object Detection[J]. Navigation and Control, 2021, 20(1): 1-14. https://doi.org/10.3969/j.issn.1674-5558.2021.01.001
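To make the error-source analysis mentioned in the abstract concrete, a minimal sketch of the standard empirical risk minimization decomposition is given below; the notation (R, h*, h_H, h_I, I) is the conventional one from the few-shot learning literature and is assumed here rather than taken verbatim from the paper. Let R(h) = E[l(h(x), y)] be the expected risk, h* its minimizer over all functions, h_H the best hypothesis within the chosen model family H, and h_I the hypothesis learned from I training samples. The excess risk then splits as

E[R(h_I) - R(h*)] = E[R(h_H) - R(h*)] + E[R(h_I) - R(h_H)] = e_app(H) + e_est(H, I),

where e_app(H) is the approximation error determined by the choice of model and e_est(H, I) is the estimation error, which grows as the sample count I shrinks. In few-shot object detection I is small by definition, so the estimation error dominates; the data-, model-, and algorithm-level methods surveyed can be read as three ways of reducing it (enlarging the effective I, constraining H, or improving how the search within H uses the few available samples).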