A Survey of Few-shot Object Detection

LIU Hao-yu, WANG Xiang-jun

Navigation and Control (导航与控制) ›› 2021, Vol. 20 ›› Issue (1): 1-14. DOI: 10.3969/j.issn.1674-5558.2021.01.001
Review

A Survey of Few-shot Object Detection

  • LIU Hao-yu1,2, WANG Xiang-jun1,2

Abstract

Few-shot learning refers to deep learning training and prediction methods used when sample data are insufficient or of low quality. Aiming at the problem of insufficient sample data that may arise in deep learning object detection applications, the mathematical model and error sources of few-shot object detection are first analyzed. The methods applicable to few-shot object detection are then summarized in three categories: data, model and algorithm, with the shortcomings of each scheme noted. Recent practical explorations of few-shot object detection are enumerated along with their results. Other deep learning applications similar to few-shot learning are also briefly introduced. Finally, after analyzing the existing problems in few-shot detection, the future development directions and research trends of few-shot object detection are discussed.

Key words

object detection / few-shot learning / data augmentation / incremental learning / meta-learning
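As an illustrative aside (not code from the paper), the "data" category of few-shot methods that the abstract mentions, together with the "data augmentation" keyword, amounts to enlarging a scarce labeled set with label-preserving transforms. A minimal sketch, assuming images are nested lists of pixel rows and boxes are (x1, y1, x2, y2) tuples; the `hflip` and `augment` helper names are hypothetical:

```python
# Illustrative sketch only: expand a few-shot detection dataset by
# horizontal flipping, remapping each bounding box to match the flip.

def hflip(image, box):
    """Flip an image (list of pixel rows) and its (x1, y1, x2, y2) box."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    x1, y1, x2, y2 = box
    # Mirror the box horizontally: new x1/x2 come from the old x2/x1.
    return flipped, (width - x2, y1, width - x1, y2)

def augment(samples):
    """Double a few-shot dataset of (image, box) pairs via flipping."""
    return samples + [hflip(img, box) for img, box in samples]

image = [[1, 2, 3],
         [4, 5, 6]]
augmented = augment([(image, (0, 0, 1, 2))])
# The single sample becomes two: the original, plus a mirrored copy
# whose box is remapped to (2, 0, 3, 2).
```

Real few-shot pipelines chain many such transforms (crops, color jitter, cut-and-paste synthesis as in reference [15]); the point of the sketch is only that every geometric transform must be applied consistently to image and annotation.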

Cite this article

LIU Hao-yu, WANG Xiang-jun. A Survey of Few-shot Object Detection[J]. Navigation and Control, 2021, 20(1): 1-14. https://doi.org/10.3969/j.issn.1674-5558.2021.01.001
CLC number: TP273+.2

References

[1] Liu W, Anguelov D, Erhan D, et al. SSD: single shot MultiBox detector[C]. European Conference on Computer Vision, 2016: 21-37.
[2] Pan S J, Yang Q. A survey on transfer learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10): 1345-1359.
[3] Wang Y Q, Yao Q M, Kwok J T, et al. Generalizing from a few examples: a survey on few-shot learning[J]. ACM Computing Surveys, 2020, 53(3): 1-34.
[4] Mitchell T M. Machine learning[M]. New York: McGraw-Hill, 1997.
[5] Bottou L, Bousquet O. The tradeoffs of large scale learning[C]. Proceedings of the 21st Annual Conference on Neural Information Processing Systems, 2007: 161-168.
[6] Bottou L, Curtis F E, Nocedal J. Optimization methods for large-scale machine learning[J]. SIAM Review, 2016, 60(2): 223-311.
[7] Qi H, Brown M, Lowe D G. Low-shot learning with imprinted weights[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 5822-5830.
[8] Shyam P, Gupta S, Dukkipati A. Attentive recurrent comparators[C]. Proceedings of the 34th International Conference on Machine Learning, 2017: 3173-3181.
[9] Lake B M, Salakhutdinov R, Tenenbaum J B. Human-level concept learning through probabilistic program induction[J]. Science, 2015, 350(6266): 1332-1338.
[10] Garcia V, Bruna J. Few-shot learning with graph neural networks[EB/OL]. https://arxiv.org/pdf/1711.04043.pdf.
[11] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
[12] Redmon J, Farhadi A. YOLOv3: an incremental improvement[EB/OL]. https://arxiv.org/pdf/1804.02767.pdf.
[13] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2017: 6517-6525.
[14] Hariharan B, Girshick R. Low-shot visual recognition by shrinking and hallucinating features[C]. IEEE International Conference on Computer Vision, 2017: 3037-3046.
[15] Dwibedi D, Misra I, Hebert M. Cut, paste and learn: surprisingly easy synthesis for instance detection[C]. IEEE International Conference on Computer Vision, 2017: 1310-1319.
[16] Lemley J, Bazrafkan S, Corcoran P. Smart augmentation learning an optimal data augmentation strategy[J]. IEEE Access, 2017, 5: 5858-5869.
[17] Zoph B, Cubuk E D, Ghiasi G, et al. Learning data augmentation strategies for object detection[C]. European Conference on Computer Vision, 2020: 566-583.
[18] Douze M, Szlam A, Hariharan B, et al. Low-shot learning with large-scale diffusion[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 3349-3358.
[19] Wang X L, Shrivastava A, Gupta A, et al. A-Fast-RCNN: hard positive generation via adversary for object detection[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2017: 3039-3048.
[20] Ratner A J, Ehrenberg H R, Hussain Z, et al. Learning to compose domain-specific transformations for data augmentation[C]. Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 3239-3249.
[21] Tang P, Ramaiah C, Wang Y, et al. Proposal learning for semi-supervised object detection[EB/OL]. https://arxiv.org/pdf/2001.05086.pdf.
[22] Nguyen N V, Rigaud C, Burie J C. Semi-supervised object detection with unlabeled data[C]. 14th International Conference on Computer Vision Theory and Applications, 2019: 289-296.
[23] Huang S W, Lin C T, Chen S P, et al. AugGAN: cross domain adaptation with GAN-based data augmentation[C]. European Conference on Computer Vision, 2018: 731-744.
[24] Caruana R. Multitask learning[J]. Machine Learning, 1997, 28: 41-75.
[25] Sermanet P, Eigen D, Zhang X, et al. OverFeat: integrated recognition, localization and detection using convolutional networks[EB/OL]. https://arxiv.org/pdf/1312.6229.pdf.
[26] Dong X Y, Zheng L, Ma F, et al. Few-example object detection with model communication[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(7): 1641-1654.
[27] Luo Z L, Zou Y L, Hoffman J, et al. Label efficient learning of transferable representations across domains and tasks[C]. Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 164-176.
[28] Chabot F, Chaouch M, Rabarisoa J, et al. Deep MANTA: a coarse-to-fine many-task network for joint 2D and 3D vehicle analysis from monocular image[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2017: 1827-1836.
[29] Zhang K P, Zhang Z P, Li Z F, et al. Joint face detection and alignment using multitask cascaded convolutional networks[J]. IEEE Signal Processing Letters, 2016, 23(10): 1499-1503.
[30] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]. Proceedings of the 25th International Conference on Neural Information Processing Systems, 2012: 1097-1105.
[31] Pérez-Rúa J M, Zhu X T, Hospedales T, et al. Incremental few-shot object detection[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 13843-13852.
[32] Zhou X Y, Wang D Q, Krähenbühl P. Objects as points[EB/OL]. https://arxiv.org/pdf/1904.07850.pdf.
[33] Peng C, Zhao K, Lovell B C. Faster ILOD: incremental learning for object detectors based on Faster RCNN[J]. Pattern Recognition Letters, 2020, 140: 109-115.
[34] Shmelkov K, Schmid C, Alahari K. Incremental learning of object detectors without catastrophic forgetting[C]. IEEE International Conference on Computer Vision, 2017: 3420-3429.
[35] Li D W, Tasci S, Ghosh S, et al. RILOD: near real-time incremental learning for object detection at the edge[C]. Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, 2019: 113-126.
[36] Pan S J, Tsang I W, Kwok J T, et al. Domain adaptation via transfer component analysis[J]. IEEE Transactions on Neural Networks, 2011, 22(2): 199-210.
[37] Kang B Y, Liu Z, Wang X, et al. Few-shot object detection via feature reweighting[C]. IEEE/CVF International Conference on Computer Vision, 2019: 8419-8428.
[38] Wang T, Zhang X P, Yuan L, et al. Few-shot adaptive Faster R-CNN[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 7166-7175.
[39] Zhang J Y, Chen Z L, Huang J Y, et al. Few-shot domain adaptation for semantic segmentation[C]. Proceedings of the ACM Turing Celebration Conference, 2019: 1-6.
[40] Motiian S, Jones Q, Iranmanesh S M, et al. Few-shot adversarial domain adaptation[C]. Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 6673-6683.
[41] Chen H, Wang Y, Wang G Y, et al. LSTD: a low-shot transfer detector for object detection[EB/OL]. https://arxiv.org/pdf/1803.01529.pdf.
[42] Schwartz E, Karlinsky L, Shtok J, et al. RepMet: representative-based metric learning for classification and one-shot object detection[EB/OL]. https://arxiv.org/pdf/1806.04728.pdf.
[43] Hao F S, Cheng J, Wang L, et al. Instance-level embedding adaptation for few-shot learning[J]. IEEE Access, 2019, 7: 100501-100511.
[44] Musgrave K, Belongie S, Lim S N. A metric learning reality check[C]. European Conference on Computer Vision, 2020: 681-699.
[45] Singh B, Davis L S. An analysis of scale invariance in object detection-SNIP[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 3578-3587.
[46] Wang G T, Luo C, Sun X Y, et al. Tracking by instance detection: a Meta-learning approach[EB/OL]. https://arxiv.org/pdf/2004.00830.pdf.
[47] Fu K, Zhang T F, Zhang Y, et al. Meta-SSD: towards fast adaptation for few-shot object detection with Meta-learning[J]. IEEE Access, 2019, 7: 77597-77606.
[48] Yan X P, Chen Z L, Xu A, et al. Meta R-CNN: towards general solver for instance-level low-shot learning[EB/OL]. https://arxiv.org/pdf/1909.13032.pdf.
[49] Zhu X J, Goldberg A B. Introduction to semi-supervised learning[M]. San Rafael: Morgan and Claypool Publishers, 2009.
[50] He H B, Garcia E A. Learning from imbalanced data[J]. IEEE Transactions on Knowledge and Data Engineering, 2009, 21(9): 1263-1284.
[51] Lampert C H, Nickisch H, Harmeling S. Learning to detect unseen object classes by between-class attribute transfer[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2009: 951-958.
[52] Franceschi L, Frasconi P, Salzo S, et al. Bilevel programming for hyperparameter optimization and meta-learning[C]. Proceedings of the 35th International Conference on Machine Learning, 2018: 1568-1577.
[53] Kirkpatrick J, Pascanu R, Rabinowitz N, et al. Overcoming catastrophic forgetting in neural networks[J]. Proceedings of the National Academy of Sciences of the United States of America, 2017, 114(13): 3521-3526.
[54] Kanter J M, Veeramachaneni K. Deep feature synthesis: towards automating data science endeavors[C]. IEEE International Conference on Data Science and Advanced Analytics, 2015: 717-726.
[55] Kotthoff L, Thornton C, Hoos H H, et al. Auto-WEKA 2.0: automatic model selection and hyperparameter optimization in WEKA[J]. Journal of Machine Learning Research, 2017, 18(25): 1-5.
[56] Zoph B, Le Q V. Neural architecture search with reinforcement learning[C]. Proceedings of the 34th International Conference on Machine Learning, 2017: 459-468.
[57] Kaiser Ł, Nachum O, Roy A, et al. Learning to remember rare events[EB/OL]. https://arxiv.org/pdf/1703.03129.pdf.

Funding

Independent Innovation Fund of Tianjin University (No. 202003)