Abstract: Learning from limited data is a challenging problem in visual recognition. When samples are scarce, the prototypes computed by metric learning methods are inaccurate and the resulting model generalizes poorly. To improve few-shot image classification, three measures are adopted. First, to address the shortage of samples, a masked autoencoder is used for data augmentation. Second, prototypes are computed from task-specific features obtained by a multi-scale attention mechanism, which makes the prototypes more accurate. Third, a domain adaptation module with a margin loss is added. The margin loss pushes different prototypes away from each other in the feature space, and a sufficient margin between them improves generalization. Experimental results show that the proposed method achieves higher few-shot classification accuracy than the compared methods on miniImageNet and tieredImageNet.
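The two ideas the abstract relies on, class prototypes and a margin between them, can be illustrated with a small sketch. It assumes the standard prototype definition (the mean of a class's support embeddings) and uses a simple hinge penalty on pairwise prototype distances as a stand-in for the margin loss. The exact loss, the feature extractor, and the names `prototype`, `classify`, and `margin_penalty` are illustrative assumptions, not taken from the paper.

```python
import math


def prototype(embeddings):
    """Class prototype: element-wise mean of the support embeddings."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[d] for e in embeddings) / n for d in range(dim)]


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query, prototypes):
    """Assign the query to the label of its nearest prototype."""
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))


def margin_penalty(prototypes, margin):
    """Assumed hinge-style penalty: mean of max(0, margin - distance)
    over all prototype pairs. It is zero once every pair is at least
    `margin` apart."""
    labels = sorted(prototypes)
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    if not pairs:
        return 0.0
    total = sum(
        max(0.0, margin - euclidean(prototypes[a], prototypes[b]))
        for a, b in pairs
    )
    return total / len(pairs)


# Toy 2-way 2-shot task with hand-written 2-D "embeddings" (hypothetical data).
support = {
    "cat": [[0.0, 0.0], [2.0, 0.0]],
    "dog": [[4.0, 3.0], [4.0, 5.0]],
}
protos = {label: prototype(vecs) for label, vecs in support.items()}

print(protos)
print(classify([1.5, 0.5], protos))
print(margin_penalty(protos, margin=6.0))
```

In this toy task the two prototypes are 5 units apart, so the penalty is positive for a margin of 6 and vanishes for any margin of 5 or less. Minimizing such a term alongside the classification loss pushes prototypes apart in the way the abstract describes.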
Table 1. Structure of the backbone networks
| Layer | Output size | ResNet-12 | Conv4 |
| --- | --- | --- | --- |
| Conv layer 1 | 42×42 | [3×3, 64] × 3 | [3×3, 64] |
| Conv layer 2 | 21×21 | [3×3, 160] × 3 | [3×3, 64] |
| Conv layer 3 | 10×10 | [3×3, 320] × 3 | [3×3, 64] |
| Conv layer 4 | 5×5 | [3×3, 640] × 3 | [3×3, 64] |
| Pooling layer | 1×1 | 5×5 Pool | 5×5 Pool |
| Parameter size | | 50 MB | 0.46 MB |
Table 2. Few-shot classification accuracies with 95% confidence intervals on the miniImageNet dataset (10000 episodes)
| Model | Backbone | 5-way 1-shot (%) | 5-way 5-shot (%) |
| --- | --- | --- | --- |
| Matching Net [12] | Conv4 | 43.56±0.84 | 55.31±0.37 |
| Proto Net [13] | Conv4 | 49.42±0.78 | 68.20±0.66 |
| Relation Net [21] | Conv4 | 50.44±0.82 | 65.32±0.70 |
| MAML [9] | Conv4 | 48.70±1.84 | 63.11±0.92 |
| DN4 [14] | Conv4 | 51.24±0.74 | 71.02±0.64 |
| DSN [23] | Conv4 | 51.78±0.96 | 68.99±0.69 |
| BOIL [25] | Conv4 | 49.61±0.16 | 66.45±0.37 |
| MADA (ours) | Conv4 | 55.27±0.20 | 72.12±0.16 |
| Matching Net [12] | ResNet-12 | 65.64±0.20 | 78.73±0.15 |
| Proto Net [13] | ResNet-12 | 60.37±0.83 | 78.02±0.75 |
| DN4 [14] | ResNet-12 | 54.37±0.36 | 74.44±0.29 |
| DSN [23] | ResNet-12 | 62.64±0.66 | 78.73±0.45 |
| SNAIL [22] | ResNet-12 | 55.71±0.99 | 68.88±0.92 |
| CTM [24] | ResNet-12 | 64.12±0.82 | 80.51±0.14 |
| MADA (ours) | ResNet-12 | 67.45±0.20 | 82.77±0.13 |
Table 3. Few-shot classification accuracies with 95% confidence intervals on the tieredImageNet dataset (10000 episodes)
| Model | Backbone | 5-way 1-shot (%) | 5-way 5-shot (%) |
| --- | --- | --- | --- |
| Matching Net [12] | ResNet-12 | 68.50±0.92 | 80.60±0.71 |
| Proto Net [13] | ResNet-12 | 65.65±0.92 | 83.40±0.65 |
| MetaOpt Net [26] | ResNet-12 | 65.99±0.72 | 81.56±0.53 |
| TPN [27] | ResNet-12 | 59.91±0.94 | 73.30±0.75 |
| CTM [24] | ResNet-12 | 68.41±0.39 | 84.28±1.74 |
| LEO [10] | ResNet-12 | 66.63±0.05 | 81.44±0.09 |
| MADA (ours) | ResNet-12 | 70.67±0.22 | 85.10±0.15 |
Table 4. Few-shot classification accuracies with 95% confidence intervals on the CUB dataset (10000 episodes)
Table 5. Ablation study of few-shot classification accuracies with 95% confidence intervals on the miniImageNet dataset (10000 episodes)
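| Network | MA | DA | DE | 5-way 1-shot (%) | 5-way 5-shot (%) |
| --- | --- | --- | --- | --- | --- |
| Baseline | × | × | × | 60.37±0.83 | 78.02±0.75 |
| MA | √ | × | × | 65.84±0.23 | 81.94±0.34 |
| MADA | √ | √ | × | 67.21±0.18 | 82.41±0.48 |
| MADA+ | √ | √ | √ | 67.45±0.20 | 82.77±0.13 |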
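The accuracy tables report each result as a mean with a 95% confidence interval over 10000 episodes. The source does not say how the intervals were computed. A common convention, assumed here, is the normal approximation: mean ± 1.96 × (sample standard deviation / √n). The sketch below applies it to synthetic per-episode accuracies. The function name `mean_and_ci95` and the simulated data are illustrative and do not reproduce any reported number.

```python
import math
import random


def mean_and_ci95(accuracies):
    """Return (mean, half_width) of a 95% confidence interval using the
    normal approximation: half_width = 1.96 * s / sqrt(n), where s is
    the sample standard deviation."""
    n = len(accuracies)
    mean = sum(accuracies) / n
    variance = sum((a - mean) ** 2 for a in accuracies) / (n - 1)
    half_width = 1.96 * math.sqrt(variance / n)
    return mean, half_width


# Synthetic per-episode accuracies in percent (hypothetical, not the paper's data).
random.seed(0)
episode_acc = [random.gauss(67.0, 10.0) for _ in range(10000)]

mean, half_width = mean_and_ci95(episode_acc)
print(f"{mean:.2f}±{half_width:.2f}")
```

Under this convention the half-width shrinks as 1/√n. With 10000 episodes and a per-episode standard deviation of about 10 percentage points, it comes to roughly 0.2, which is the same order as the smaller intervals in the tables.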
