*더 spatial localization 이 강하게 요구되는 task 란, object detection 문제에서는 box 의 threshold 를 통해 맞췄다 못맞췄다를 판가름하게 되는데, high box overlap thresholds when training from scratch 같은 것들이 여기 속한다

**근거 : Related Work, 2p, Recent work pushes this paradigm further by pre-training on datasets that are 6× (ImageNet-5k [14]), 300× (JFT [44]), and even 3000×(Instagram [30]) larger than ImageNet. While this body of work demonstrates significant improvements on image classification transfer learning tasks, the improvements on object detection are relatively small (on the scale of +1.5 AP on COCO with 3000× larger pre-training data [30]).

***실험 :