So, today we are releasing the Crossmodal 3,600 dataset, which provides 261,375 reference captions in 36 languages for a geographically diverse set of 3,600 images.