👩💻 Join our community of thousands of amazing developers!
Posted by Xianzhi Du, Software Engineer and Jaeyoun Kim, Technical Program Manager, Google Research Convolutional neural networks created for image tasks typically encode an input image into a sequence of intermediate features that capture the semantics of an image (from local to global), where each subsequent layer has a lower spatial dimension. However, this scale-decreased model may not be able to deliver strong features for multi-scale visual recognition tasks where recognition and localizat...