Visual Grounding in Video for Unsupervised Word Translation

DeepMind · March 11, 2020

Summary

Our goal is to use visual grounding to improve unsupervised word mapping between languages. The key idea is to establish a common visual representation between two languages by learning embeddings from unpaired instructional videos narrated in the native language....

Read full post on www.deepmind.com →
