Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Grab it
Learn how to build an image captioning system from scratch in this 36-minute tutorial. Explore the Flickr8k dataset and implement a model using PyTorch. Gain insights into combining convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for image captioning tasks. Follow along with code implementation, training setup, and error fixing. Discover potential improvements like using larger models, extended training, and incorporating attention mechanisms. Conclude with a brief evaluation of the implemented system.