DUBLIN CITY UNIVERSITY
[ Home | Publications | Patents | Presentations | PRES ]
YouTube NBA Collection
This is a collection of 61,340 YouTube video pages about the NBA. Each video page is saved as an XML file that contains all the information about the video (title, tags, description, length, comments, ...). The video pages were crawled on 3 March 2009, so each page reflects the online information about its video up to that date.
Accompanying the data collection is a list of 40 topics with their relevance assessments. In addition, about 250,000 user profiles are provided. These profiles cover all users who interacted with the videos on the web, either by posting a video or by commenting on one.
For more information about the collection, please refer to the published paper in LREC 2010.
| Data | Link | Description |
|---|---|---|
| NBA Collection | | A collection of 61,340 XML files. Each XML file represents a YouTube video page about an NBA video. The XML file name is the video ID, and the file contains all the information about the video (title, tags, description, length, comments, ...). |
| Topics + Relevance Assessments | | A list of 40 topics on the NBA, with the relevance assessments for each topic. |
| User Profiles | | 245,545 user profiles. The profiles cover all user IDs that appear in the collection, that is, users who posted videos or users who commented on them. |
| List of Video IDs | | A text file that lists all the video IDs in the collection. The file also indicates which video pages in the collection have since been removed from the web (YouTube), either for a terms-of-use violation or by the user. The list identifies more than 12k of the 61k video pages as removed from the web. |
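Since the XML file name is the video ID, a first pass over the collection can be sketched as below. This is only an illustrative sketch, not part of the distributed data: the directory path, the `.xml` file extension, and the function names `summarize_collection` and `tag_frequencies` are assumptions, and no element names are assumed because the XML schema is not described here. The code simply reports whichever top-level elements each file contains.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def summarize_collection(xml_dir):
    """Map each video ID to the list of top-level element tags in its XML file.

    Assumes (hypothetically) that the files sit in one directory and end in
    ".xml"; the video ID is taken from the file name, as the page describes.
    Files that fail to parse are mapped to None.
    """
    summary = {}
    for path in sorted(Path(xml_dir).glob("*.xml")):
        video_id = path.stem  # file name without the ".xml" suffix
        try:
            root = ET.parse(str(path)).getroot()
        except ET.ParseError:
            summary[video_id] = None  # malformed or truncated file
            continue
        summary[video_id] = [child.tag for child in root]
    return summary


def tag_frequencies(summary):
    """Count in how many parsed files each top-level tag appears."""
    counts = Counter()
    for tags in summary.values():
        if tags is not None:
            counts.update(set(tags))
    return counts
```

Running `tag_frequencies(summarize_collection("path/to/collection"))` on the unpacked collection (the path is a placeholder) would show which metadata fields are present and how consistently, which is a reasonable check before building an index over selected fields.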
Reference
W. Magdy, J. Min, J. Leveling, and G. J. F. Jones. Building a Domain-Specific Document Collection for Evaluating Metadata Effect on Information Retrieval. LREC 2010
Last Modified: March 2010