Automatic Image Caption Generation: study and implementation

korichi, Safa batoul; aimene, Karim

Files

RapportMemoire__1_.pdf (4.18 MB)

Date

2021

Authors

korichi, Safa batoul

aimene, Karim

Publisher

université Ghardaia

Abstract

Artificial Intelligence (AI) is moving increasingly towards multimodal learning, which involves building systems that can process information from multiple sources, such as text, images, or audio. Image captioning is one of the main visual-linguistic tasks; it requires generating a caption for a given image. The challenge is to create a unified Deep Learning (DL) model able to describe an image in a correct sentence. To do so, we need to understand the proper way to represent text in a given space. We use the Transformer, a recent architecture that brings a new concept to the sequence-to-sequence mechanism, and we take advantage of modern GPUs to process data faster and more efficiently. Along this path, we experimented with a Transformer-based approach and applied it to the image captioning problem using the MS COCO dataset.

Keywords

Multimodal Learning, Image captioning, Deep Learning (DL), Transformer, Sequence to sequence, MS-COCO

URI

https://dspace.univ-ghardaia.edu.dz/xmlui/handle/123456789/1055

Collections

Mémoires de Master
