Fast R-CNN

Plain R-CNN apparently works like this: R-CNN first fine-tunes a ConvNet on object proposals using log loss. Then, it fits SVMs to ConvNet features. These SVMs act as object detectors, replacing the softmax classifier learnt by fine-tuning. In the third training stage, bounding-box regressors are learned. … Detection with VGG16 takes 47s / image (on a GPU). Wow.

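Plain R-CNN runs a CNN forward pass for every object proposal. SPPnets^[1] improved on this by running the CNN once over the whole image up front and then extracting each proposal's features from that shared feature map.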
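
To make the difference concrete, here is a rough toy sketch, not the actual R-CNN or SPPnet code. Everything in it is an assumption for illustration: `backbone` is a hypothetical stand-in for the ConvNet (just 2x2 average pooling with stride 2), the boxes are made-up `(y0, x0, y1, x1)` proposals aligned to the stride, and the spatial pyramid pooling step that turns each region into a fixed-length vector is left out. It only shows how the number of forward passes changes.

```python
import numpy as np


def backbone(image, counter):
    """Toy stand-in for a ConvNet: 2x2 average pooling with stride 2.

    `counter` tracks how many forward passes were run.
    """
    counter["calls"] += 1
    h, w = image.shape
    h2, w2 = (h // 2) * 2, (w // 2) * 2  # drop an odd trailing row/column
    x = image[:h2, :w2]
    return x.reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))


def rcnn_style(image, boxes, counter):
    """One forward pass per proposal: crop the image, then run the backbone."""
    return [backbone(image[y0:y1, x0:x1], counter) for (y0, x0, y1, x1) in boxes]


def sppnet_style(image, boxes, counter, stride=2):
    """One forward pass total: run the backbone once, then crop the feature map."""
    fmap = backbone(image, counter)
    return [
        fmap[y0 // stride : y1 // stride, x0 // stride : x1 // stride]
        for (y0, x0, y1, x1) in boxes
    ]


# Hypothetical 8x8 "image" and three proposals aligned to the stride.
image = np.arange(64, dtype=float).reshape(8, 8)
boxes = [(0, 0, 4, 4), (2, 2, 8, 6), (4, 0, 8, 8)]

rcnn_counter = {"calls": 0}
spp_counter = {"calls": 0}
rcnn_feats = rcnn_style(image, boxes, rcnn_counter)
spp_feats = sppnet_style(image, boxes, spp_counter)

print("R-CNN-style forward passes:", rcnn_counter["calls"])
print("SPPnet-style forward passes:", spp_counter["calls"])
```

In this toy setup the two approaches produce identical per-proposal features, because the pooling windows do not overlap and the boxes line up with the stride. With a real ConvNet the receptive fields overlap and extend past the box, so the cropped features only approximate what a per-proposal forward pass would give. The saving is in the count: one backbone call instead of one per proposal.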