Few-shot learning refers to training and applying deep learning models when sample data are scarce or of low quality. To address the shortage of sample data that deep-learning-based object detection may face in practice, this survey first analyzes the mathematical model of few-shot object detection and its sources of error. Methods applicable to few-shot object detection are then reviewed in three categories, namely data, model, and algorithm, and the shortcomings of each scheme are noted. Drawing on practical explorations of few-shot object detection, recent attempts and their results are enumerated, and other deep learning settings related to few-shot learning are briefly introduced. Finally, after analyzing the open problems in few-shot detection, future development directions and research trends of few-shot object detection are discussed.
LIU Hao-yu, WANG Xiang-jun.
A Survey of Few-shot Object Detection[J]. Navigation and Control, 2021, 20(1): 1-14. https://doi.org/10.3969/j.issn.1674-5558.2021.01.001
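To make the error-source analysis mentioned in the abstract concrete, a minimal sketch of the standard empirical risk minimization decomposition is given below; the notation (R, h*, h_H, h_I, I) is the conventional one from the few-shot learning literature and is assumed here rather than taken verbatim from the paper. Let R(h) = E[l(h(x), y)] be the expected risk, h* its minimizer over all functions, h_H the best hypothesis within the chosen model family H, and h_I the hypothesis learned from I training samples. The excess risk then splits as

E[R(h_I) - R(h*)] = E[R(h_H) - R(h*)] + E[R(h_I) - R(h_H)] = e_app(H) + e_est(H, I),

where e_app(H) is the approximation error determined by the choice of model and e_est(H, I) is the estimation error, which grows as the sample count I shrinks. In few-shot object detection I is small by definition, so the estimation error dominates; the data-, model-, and algorithm-level methods surveyed can be read as three ways of reducing it (enlarging the effective I, constraining H, or improving how the search within H uses the few available samples).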