End-to-End Generative Pretraining for Multimodal Video Captioning

Source: DeepAI

The generative pretraining objective transfers effectively to multimodal video captioning and outperforms the state of the art by a margin.
