The Multimedia Commons (MMCommons) initiative is a community formed to coordinate efforts to advance the field of multimedia. Most of our attention is currently directed at making the Yahoo-Flickr Creative Commons 100 Million (YFCC100M) dataset even more useful by offering a repository of supplemental material for this collection, such as content, features, and annotations.

The YFCC100M is the largest publicly and freely usable multimedia collection, containing around 99.2 million photos and 0.8 million videos from Flickr, all of which were shared under one of the various Creative Commons licenses. This dataset, however, includes only the metadata of the photos and videos (e.g. the photographers who captured them, the cameras that were used, and the locations where they were taken, if available) and does not include their actual content (i.e. the image and video files).

To make the dataset easier for everyone to use, we downloaded all of its photos and videos from Flickr, processed their content to generate additional data that researchers often find useful (e.g. visual features, ground truth annotations), and released utilities and tools to assist with using and visualizing the dataset. We have made all of this material available as part of the Multimedia Commons initiative.

Sample images: lanterns and people at a festival; lightning over a city; a person holding a mirror showing a tree; cosmos flowers.

What’s the difference between the YFCC100M dataset and the Multimedia Commons repository?

The Multimedia Commons contains supplemental material to the YFCC100M, including the YLI feature corpus and annotations. If you want to perform experiments or develop applications for which you only need metadata (e.g. a keyword-based search engine), then the YFCC100M is probably all you need. However, if you are planning to use the visual, aural, or temporal content of the photos or videos, then you likely also need what the MMCommons offers.
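To illustrate what a metadata-only application might look like, here is a minimal Python sketch of a keyword-based search over a few records. The records, their field names (`id`, `title`, `tags`), and the helper functions `build_index` and `search` are all hypothetical and chosen for illustration; they are not the actual YFCC100M file format or any tool we provide.

```python
from collections import defaultdict

# Hypothetical metadata records: the ids, titles, and tags below are made up
# for illustration and are not taken from the actual dataset.
RECORDS = [
    {"id": "1", "title": "Lanterns at a festival", "tags": "lantern,festival,night"},
    {"id": "2", "title": "Lightning over a city", "tags": "lightning,storm,city,night"},
    {"id": "3", "title": "Cosmos flowers", "tags": "flower,cosmos,garden"},
]


def build_index(records):
    """Map each lowercase keyword to the set of record ids that contain it."""
    index = defaultdict(set)
    for record in records:
        words = record["title"].lower().split() + record["tags"].lower().split(",")
        for word in words:
            if word:
                index[word].add(record["id"])
    return index


def search(index, query):
    """Return the ids of records that match every keyword in the query."""
    keywords = query.lower().split()
    if not keywords:
        return set()
    # Start from the matches for the first keyword, then intersect the rest.
    result = set(index.get(keywords[0], set()))
    for keyword in keywords[1:]:
        result &= index.get(keyword, set())
    return result


index = build_index(RECORDS)
print(sorted(search(index, "night")))       # records mentioning "night"
print(sorted(search(index, "city night")))  # records mentioning both keywords
```

Nothing in this sketch touches the image or video files themselves, which is why the metadata alone would suffice for this kind of application.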

How do I get access to the YFCC100M dataset and the Multimedia Commons repository?

You can read detailed instructions on how to access the YFCC100M dataset here, and on how to access the Multimedia Commons repository here.

How can I contribute to the Multimedia Commons repository?

If you are using the YFCC100M and want to help us out by sharing any features and annotations you generated, please contact us!