Tools and Demos

The Multimedia Commons effort includes interactive demos that show how the dataset is being used, along with analysis and retrieval tools to help researchers take advantage of this massive resource. Some of these tools are included in the resources hosted on Amazon Web Services, while other tools and demos arise from independent efforts and are hosted externally.

Jump to:
The YFCC100M Browser
audioCaffe: Audio Analysis With Deep Neural Nets
videosearch100M: Semantic Search
YFCC100M Visual Search Utility
MI-File CBIR: Similarity-Based Image Retrieval
Evento360: Discovering Social Events
Other Demos and Projects
Sample image: Sunset with casino sign

The YFCC100M Browser

An alternate tool for visualizing and accessing the data in the YFCC100M is the YFCC100M Browser, developed at DFKI (German Research Center for Artificial Intelligence) and University of Kaiserslautern. (Not hosted on AWS.)

Check out the YFCC100M browser here.

The YFCC100M Browser allows you to query the YFCC100M metadata in real time. You can visualize the distribution of media with particular tags across users, times, and places; view the original media with those tags on Flickr; and download the metadata for your query results.
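Since the underlying YFCC100M metadata is distributed as tab-separated text, you can also run simple tag queries offline. Below is a minimal Python sketch of that idea; the column layout (media id, user, comma-separated tags, longitude, latitude) is invented for illustration and does not match the real YFCC100M schema:

```python
import csv
import io

# Hypothetical excerpt of YFCC100M-style metadata: tab-separated lines with
# ASSUMED columns (media_id, user, user_tags, longitude, latitude).
# The real metadata file has many more columns in a different order.
sample = (
    "1001\talice\tsunset,beach\t-122.4\t37.7\n"
    "1002\tbob\tcasino,neon,sign\t-115.1\t36.1\n"
    "1003\tcarol\tsunset,casino\t-115.2\t36.2\n"
)

def records_with_tag(tsv_text, tag):
    """Yield rows whose comma-separated tag field contains `tag`."""
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        tags = row[2].split(",")
        if tag in tags:
            yield row

matches = list(records_with_tag(sample, "casino"))
print([row[0] for row in matches])  # media ids of matching records
```

The Browser does essentially this kind of filtering server-side, in real time, over the full hundred million records.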

Citation: Sebastian Kalkowski, Damian Borth, Christian Schulze, and Andreas Dengel. 2015. Real-time Analysis and Visualization of the YFCC100M Dataset. In Proceedings of the ACM Multimedia 2015 Workshop on Community-Organized Multimodal Mining: Opportunities for Novel Solutions (MMCommons ’15). PDF.

audioCaffe: Audio Analysis With Deep Neural Nets

The audioCaffe audio-content analysis tool and a demonstration experiment can be found in the tools/audioCaffe/ directory in our AWS S3 data store, and are also included in the Multimedia Commons CloudFormation Template. The demonstration experiment, "A MED-ium Cup of audioCaffe," uses data from the YLI Multimedia Event Detection (MED) subcorpus. It will give you a taste of what you can do with a big corpus of computed audio features like YLI and a flexible set of analysis tools.


The directory also includes a build of audioCaffe; you can check for updates at GitHub. audioCaffe is a deep neural net-based audio content analysis tool that leverages the deep-learning framework Caffe. audioCaffe is an open-source resource being developed as part of the SMASH project, which aims to provide a single software framework for a variety of content-analysis tasks (speech recognition, audio analysis, video event detection, etc.).
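For a flavor of the kind of preprocessing an audio DNN pipeline performs before the network ever sees the data, here is a toy Python sketch that frames a signal into overlapping windows and computes a log-energy feature per frame. This is a generic illustration, not audioCaffe's actual API:

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a 1-D sample list into overlapping frames, as a DNN audio
    front end typically does before feature extraction."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

def frame_energy(frame):
    """Log energy of one frame: one of the simplest audio features."""
    return math.log(sum(s * s for s in frame) + 1e-10)

# A toy 1 kHz sine at an assumed 8000 Hz sample rate (0.1 s of audio).
sr = 8000
signal = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(800)]
frames = frame_signal(signal, frame_len=200, hop=100)
energies = [frame_energy(f) for f in frames]
print(len(frames), round(energies[0], 2))
```

In a real pipeline, each frame would be mapped to a richer feature vector (e.g. a spectral representation) and fed to the network; the framing step itself looks much like this.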

Citation: Khalid Ashraf, Benjamin Elizalde, Forrest Iandola, Matthew Moskewicz, Gerald Friedland, Kurt Keutzer, and Julia Bernd. 2015. Audio-Based Multimedia Event Detection with DNNs and Sparse Sampling. In Proceedings of the 5th ACM International Conference on Multimedia Retrieval (ICMR ’15). New York: ACM, 611-614.

Cheers to: This cup of audioCaffe was grown, roasted, ground, and brewed by Khalid Ashraf, Kurt Keutzer, Gerald Friedland, Benjamin Elizalde, and Yangqing Jia.

Funding: audioCaffe is funded by a National Science Foundation grant for the SMASH project (grant IIS-1251276). (Any opinions, findings, and conclusions expressed on this website are those of the individual researchers and do not necessarily reflect the views of the funders.)

Contacts: Questions can be directed to ashrafkhalid[chez]berkeley[stop]edu.

videosearch100M: Semantic Search

A team at Carnegie Mellon University has been building a video search system based on matching (relatively) simple semantic concepts like "dancing" with visual and motion features (including convolutional neural network features and dense trajectories). They added features and annotations extracted from the ~800,000 YFCC100M videos to their system, and have released those resources publicly.
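The core retrieval idea, scoring each video against a set of detected concepts and ranking by the combined score, can be sketched in a few lines. Everything below (video IDs, concept names, scores) is invented for illustration and is not the CMU system's interface:

```python
# Hypothetical per-video concept detector scores (video_id -> {concept: score}).
video_scores = {
    "vid_a": {"dancing": 0.91, "crowd": 0.40, "beach": 0.05},
    "vid_b": {"dancing": 0.10, "crowd": 0.75, "beach": 0.80},
    "vid_c": {"dancing": 0.60, "crowd": 0.55, "beach": 0.02},
}

def rank_by_concepts(query_concepts, scores):
    """Rank videos by the sum of detector scores for the queried concepts."""
    return sorted(
        scores,
        key=lambda vid: sum(scores[vid].get(c, 0.0) for c in query_concepts),
        reverse=True,
    )

print(rank_by_concepts(["dancing", "crowd"], video_scores))
```

A text query is first mapped to a handful of such concepts; the heavy lifting in a real system is training the detectors that produce the scores.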

Watch this space for updates on (hopefully) getting the full set of new features on AWS!

YFCC100M Visual Search Utility

Researchers at the Information Technologies Institute in Thermi, Greece, have produced a nearest-neighbor-based visual search utility for the YFCC100M, using an IVFPQ index based on SURF+VLAD features.
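For the curious: the "PQ" in IVFPQ stands for product quantization, which compresses each feature vector into a short code of centroid indices so that billions of vectors fit in memory. The toy Python sketch below shows the idea with hand-picked two-centroid codebooks; a real system learns much larger codebooks with k-means and adds an inverted-file (IVF) stage on top:

```python
# Minimal product-quantization sketch: each vector is split into sub-vectors,
# and each sub-vector is replaced by the index of its nearest centroid in a
# small per-subspace codebook. Codebooks here are hand-picked toy values.

def split(vec, m):
    """Split a vector into m equal-length sub-vectors."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    codes = []
    for sub, book in zip(split(vec, len(codebooks)), codebooks):
        codes.append(min(range(len(book)), key=lambda j: sq_dist(sub, book[j])))
    return codes

def approx_sq_dist(query, codes, codebooks):
    """Asymmetric distance: exact query sub-vectors vs. a quantized database vector."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(sub, book[c]) for sub, book, c in zip(subs, codebooks, codes))

# Two subspaces, two centroids each (toy codebooks).
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],   # centroids for dims 0-1
    [(0.0, 1.0), (1.0, 0.0)],   # centroids for dims 2-3
]
db_vec = (0.9, 1.1, 0.1, 0.9)
codes = encode(db_vec, codebooks)
print(codes, approx_sq_dist((1.0, 1.0, 0.0, 1.0), codes, codebooks))
```

Searching then means computing these cheap approximate distances against the short codes instead of full SURF+VLAD vectors.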

Check out the YFCC100M Visual Search Utility here.

Available in the Multimedia Commons S3 data store on AWS:

  • The index file and the learning files for the Visual Search Utility: features/features/image/vgg-vlad-yfcc/vlad/ and /
  • The SURF+VLAD features used as the basis for the Utility: features/features/image/vgg-vlad-yfcc/vlad/full/ — see the Other Feature Corpora page for a description.

Citation: Adrian Popescu, Eleftherios Spyromitros-Xioufis, Symeon Papadopoulos, Hervé Le Borgne, and Yiannis Kompatsiaris. Towards an Automatic Evaluation of Retrieval Performance With Large Scale Image Collections. In Proceedings of the ACM Multimedia 2015 Workshop on Community-Organized Multimodal Mining: Opportunities for Novel Solutions (MMCommons ’15), Brisbane, Australia, October 2015. PDF of Preprint.

MI-File CBIR: Similarity-Based Image Retrieval

A research group at the Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo” (ISTI) has developed a similarity-based search engine for the YFCC100M that retrieves YFCC100M images based on visual similarity to a query image and predicts the most appropriate tag. The system uses deep neural network features and the Metric Inverted File technique.
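Approximate indexes like the Metric Inverted File exist to speed up what is, at heart, a nearest-neighbor search over feature vectors. Here is a toy Python sketch of the exact baseline such an index approximates: ranking database images by cosine similarity to a query vector (the feature values are invented for illustration):

```python
import math

# Hypothetical deep-feature vectors for a few database images (toy values).
features = {
    "img_a": [0.2, 0.9, 0.1],
    "img_b": [0.8, 0.1, 0.3],
    "img_c": [0.25, 0.85, 0.15],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def most_similar(query, db, k=2):
    """Return the k database images most similar to the query vector."""
    return sorted(db, key=lambda name: cosine(query, db[name]), reverse=True)[:k]

print(most_similar([0.2, 0.9, 0.1], features))
```

A simple tag predictor can then propagate the tags of the top-ranked neighbors to the query image, which is the flavor of approach the demo's tag prediction takes over far richer features.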

Check out the MI-File CBIR Search Engine and Tag Predictor here.

The Hybrid-CNN features that were used as the basis for the search engine are available in the Multimedia Commons S3 data store on AWS, under features/features/image/hybrid-cnn/. See the Other Feature Corpora page for a description.

For More Information: See the ISTI Deep Feature Corpus website.

Evento360: Discovering Social Events

The YFCC100M images were used for a Grand Challenge sponsored by Yahoo at ACM Multimedia 2015. Participants were asked to build systems to automatically detect particular social or cultural events in the YFCC100M dataset, analyze their structure, and summarize them. The Grand Challenge provided an example of how having a freely available web-scale multimedia dataset like YFCC100M can move research forward at the level of a whole community.

ICSI researchers participated in the Challenge, producing a retrieval system called “Evento360” that uses hierarchical clustering based on both visual and audio information, as well as clustering of metadata. Link to demo coming soon! We’re having technical difficulties.
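To illustrate the hierarchical-clustering idea (though not Evento360's actual multimodal features), here is a toy Python sketch of single-linkage clustering over one-dimensional values such as photo timestamps, where two bursts of upload activity fall out as two event clusters:

```python
# Minimal single-linkage hierarchical clustering over 1-D values.
# Evento360's real system clusters visual, audio, and metadata features;
# this only illustrates the repeated merge-the-closest-pair idea.

def single_linkage(points, stop_dist):
    """Merge the two closest clusters until the closest pair is
    farther apart than stop_dist."""
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > stop_dist:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Timestamps (in hours) from two bursts of photo activity.
times = [1.0, 1.2, 1.5, 8.0, 8.1, 8.4]
print(single_linkage(times, stop_dist=2.0))
```

Cutting the merge process at a distance threshold is what turns the hierarchy into a flat set of candidate events.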

Other Demos and Projects

Demos and Tools Using Multimedia Commons Data:

  • The YFCC100M images were used by researchers at University of North Carolina as the basis for a tool that creates 3D reconstructions of landmarks from diverse sets of user-generated images.
  • The ISI Foundation developed Flickr Cities, a tool for visualizing what types of images get captured in particular cities at different times and seasons.

The projects listed here are only a sampling! Check out lots of other research in the Multimedia Commons.