Revision as of 14:24, 4 August 2017 (Fri)

Fast R-CNN

Plain R-CNN^[1] apparently works like this: R-CNN first finetunes a ConvNet on object proposals using log loss. Then, it fits SVMs to ConvNet features. These SVMs act as object detectors, replacing the softmax classifier learnt by fine-tuning. In the third training stage, bounding-box regressors are learned. … Detection with VGG16 takes 47s / image (on an Nvidia K40 GPU overclocked to 875 MHz). Wow.

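Plain R-CNN runs a CNN forward pass for every object proposal, whereas SPPnets^[2] run the CNN once up front and extract each proposal's features from that shared result, which made test time 10 to 100 times faster and training time about 3 times faster. However, unlike R-CNN, SPPnets cannot update the convolutional layers that precede the spatial pyramid pooling.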
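To make the per-proposal vs. shared computation difference concrete, here is a toy sketch in plain Python. Everything in it is a hypothetical stand-in, not code from either paper: `backbone` is an identity map that only counts how often it is called, `spp` is a simplified two-level max-pooling pyramid, and boxes are given directly in feature-map coordinates. Real R-CNN warps each crop to a fixed size instead of pooling it, and a real network would not give identical features on both paths; the identity backbone is used only so the two outputs can be compared.

```python
# Toy comparison: per-proposal forward (R-CNN style) vs. one shared
# feature map plus spatial pyramid pooling (SPPnet style).
# All names below are illustrative assumptions, not the papers' code.

calls = {"backbone": 0}

def backbone(image):
    """Stand-in for the conv layers: identity map that counts its calls."""
    calls["backbone"] += 1
    return [row[:] for row in image]

def crop(grid, box):
    """Crop box = (x0, y0, x1, y1), end-exclusive, from a 2D list."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in grid[y0:y1]]

def spp(region, levels=(1, 2)):
    """Max-pool a region into n x n bins per level (fixed-length output)."""
    h, w = len(region), len(region[0])
    out = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                # Bin edges; the max(...) keeps every bin at least 1 cell wide.
                y0, y1 = i * h // n, max((i + 1) * h // n, i * h // n + 1)
                x0, x1 = j * w // n, max((j + 1) * w // n, j * w // n + 1)
                out.append(max(v for row in region[y0:y1] for v in row[x0:x1]))
    return out

def rcnn_style(image, proposals):
    # One backbone forward per proposal crop.
    return [spp(backbone(crop(image, b))) for b in proposals]

def sppnet_style(image, proposals):
    # One backbone forward per image; proposals index into the shared map.
    fmap = backbone(image)
    return [spp(crop(fmap, b)) for b in proposals]

image = [[x + 8 * y for x in range(8)] for y in range(8)]
proposals = [(0, 0, 4, 4), (2, 2, 8, 8), (1, 3, 6, 7)]

calls["backbone"] = 0
feats_rcnn = rcnn_style(image, proposals)
n_rcnn = calls["backbone"]      # grows with the number of proposals

calls["backbone"] = 0
feats_spp = sppnet_style(image, proposals)
n_spp = calls["backbone"]       # stays at one per image
```

With three proposals the R-CNN-style path calls `backbone` three times and the SPPnet-style path calls it once, so the gap widens as the number of proposals per image grows. Every proposal also comes out as a feature vector of the same length (1 + 4 = 5 bins here) regardless of its size.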

[1] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
[2] K. He, X. Zhang, S. Ren, and J. Sun. Spatial pyramid pooling in deep convolutional networks for visual recognition. In ECCV, 2014.


Revision as of 14:22, 4 August 2017 (Fri), Admin (minor edit) → Revision as of 14:24, 4 August 2017 (Fri), Admin (minor edit)
Line 2: the paragraph on plain R-CNN is unchanged.
+ Added at the end of the SPPnets paragraph: However, unlike R-CNN, SPPnets cannot update the convolutional layers that precede the spatial pyramid pooling.


Difference between revisions of "Fast RCNN"
