Large-Scale TV Dataset
Partial Video Copy Detection
(STVD-PVCD)
The STVD-PVCD dataset supports the performance evaluation of partial video copy detection methods in computer vision. It is built with a protocol based on TV capture [1, 2] that ensures greater scalability, robust groundtruthing and control over degradations for a fine-grained performance evaluation. It is the largest public dataset for this task, with nearly 83k videos totaling 10,660 hours.
The dataset is composed of a reference set and six test sets A to F (presenting different categories and levels of degradation), together with the groundtruth. It is provided as different files containing:
- short reference videos with ids,
- long positive/negative videos for testing/training,
- the groundtruth with the reference ids, timestamps and durations.
The groundtruth is provided as a CSV file having the format
Ref_Video; Pos_Video; Ref_Length; Pos_Length; Start_Copy
where
- Ref_Video is the label / file name of the reference video,
- Pos_Video is the label / file name of the positive video,
- Ref_Length is the length of the reference video in number of frames with a 30 FPS rate,
- Pos_Length is the length of the positive video in number of frames with a 30 FPS rate, that is we have Pos_Length > Ref_Length,
- Start_Copy is the index of the first frame of the reference video copy appearing in the positive video such that Start_Copy ∈ [0; Ref_Length − 1] and Start_Copy < Pos_Length − Ref_Length.
e.g. ref_a; pos_a; 112; 842; 100
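The format above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the names `GroundtruthRow` and `read_groundtruth` are hypothetical, and it assumes the CSV file has no header line and uses `;` as the only delimiter. It checks the three constraints stated above and converts `Start_Copy` to seconds using the 30 FPS rate.

```python
import csv
import io
from dataclasses import dataclass

FPS = 30  # frame rate stated for Ref_Length and Pos_Length


@dataclass
class GroundtruthRow:
    """One groundtruth line (hypothetical helper, not part of the dataset)."""
    ref_video: str
    pos_video: str
    ref_length: int   # frames
    pos_length: int   # frames
    start_copy: int   # frame index

    def check(self) -> None:
        """Raise ValueError if a documented constraint is violated."""
        if not self.pos_length > self.ref_length:
            raise ValueError("expected Pos_Length > Ref_Length")
        if not 0 <= self.start_copy <= self.ref_length - 1:
            raise ValueError("expected Start_Copy in [0; Ref_Length - 1]")
        if not self.start_copy < self.pos_length - self.ref_length:
            raise ValueError("expected Start_Copy < Pos_Length - Ref_Length")

    def start_seconds(self) -> float:
        """Start_Copy expressed in seconds at 30 FPS."""
        return self.start_copy / FPS


def read_groundtruth(stream):
    """Parse ';'-separated groundtruth lines (assumes no header line)."""
    rows = []
    for fields in csv.reader(stream, delimiter=";", skipinitialspace=True):
        if not fields:
            continue  # skip blank lines
        ref, pos, ref_len, pos_len, start = (f.strip() for f in fields)
        row = GroundtruthRow(ref, pos, int(ref_len), int(pos_len), int(start))
        row.check()
        rows.append(row)
    return rows


# The sample line given above, read from memory instead of a file.
sample = io.StringIO("ref_a; pos_a; 112; 842; 100\n")
rows = read_groundtruth(sample)
print(rows[0], f"starts at {rows[0].start_seconds():.2f} s")
```

To read a real groundtruth file, pass an open file object instead of the in-memory sample; if the distributed file turns out to contain a header line, skip it before parsing.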
The test sets A to F are detailed in [1, 2] and summarized below.
- Set A: a root capture used to tune the characterization tasks.
- Set B: a "hello world" test set.
- Set C: a test set with scalability and pixel attacks.
- Set D: a test set with scalability and global transformations.
- Set E: a test set with scalability and video speeding.
- Set F: a combination of the test sets C, D and E.
For visualization and testing purposes, some samples (reference, positive and negative videos with the groundtruth) are given in the next table for the different test sets.
Sample | Reference | Positive | Groundtruth | Negative |
sample A | ref_a | pos_a | gth_a | neg_a |
sample B | ref_b | pos_b | gth_b | neg_b |
sample C | ref_c | pos_c | gth_c | neg_c |
sample D | ref_d | pos_d | gth_d | neg_d |
sample E | ref_e | pos_e | gth_e | neg_e |
sample F | ref_f | pos_f | gth_f | neg_f |
The files constituting the dataset are given below, protected with a password. The dataset is available for non-commercial research purposes. Before downloading the dataset, get the agreement (in English or French version) and sign it. Then, send the scanned version to Mathieu Delalandre. After verifying your request, we will contact you with the password to unzip the dataset.
We first provide the files for the reference videos and the groundtruth. The test sets A to F are given in the next table (STVD is still under publication; the test set F will be delivered later).
Set | Positive videos | Negative videos | Total duration (h) | Size (GiB) | Link | thumb |
set A | 3,780 | 12,165 | 1,960 | 458 | download | |
set B | 3,780 | 3,780 | 860 | 18.6 | download | |
set C | 3,780 | 12,165 | 1,960 | 6.5 | download | |
set D | 3,780 | 12,165 | 1,960 | 20.8 | download | |
set E | 3,780 | 12,165 | 1,960 | 21.8 | download | |
set F | 3,780 | 12,165 | 1,960 | 16.1 | download |
NB. Our storage service at the UT delivers downloads at 3-16 MB/s (from a low-speed to a high-speed connection, respectively) under concurrent access.
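To plan a download, the sizes in the table and the 3-16 MB/s range can be turned into a rough time estimate. The sketch below is only illustrative: the function `download_hours` is a hypothetical helper, it assumes 1 MB = 10^6 bytes and 1 GiB = 2^30 bytes, and it assumes a constant transfer rate, which concurrent access may not guarantee.

```python
GIB = 1024 ** 3  # bytes in one GiB
MB = 10 ** 6     # bytes in one MB (assumption: decimal megabytes)

# Sizes in GiB, copied from the table of test sets.
SET_SIZES_GIB = {"A": 458, "B": 18.6, "C": 6.5, "D": 20.8, "E": 21.8, "F": 16.1}


def download_hours(size_gib: float, rate_mb_per_s: float) -> float:
    """Estimated download time in hours at a constant transfer rate."""
    seconds = size_gib * GIB / (rate_mb_per_s * MB)
    return seconds / 3600


for name, size in SET_SIZES_GIB.items():
    fast = download_hours(size, 16)  # high-speed end of the stated range
    slow = download_hours(size, 3)   # low-speed end of the stated range
    print(f"set {name}: between {fast:.1f} h and {slow:.1f} h")
```

The estimate ignores the reference videos and groundtruth files, whose sizes are not listed in the table.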
As a starting point, we list here works reporting experiments on the STVD-PVCD dataset.
Set | Refs |
B | [LVH2022], ........ |
C | [TNF2022], [LVH2022] |
D | [LVH2023], ........ |
Please cite one of the following papers, in English [1] or French [2], if you use this dataset.
[1] V.H. Le, M. Delalandre and D. Conte. A Large-Scale TV Dataset for Partial Video Copy Detection. International Conference on Image Analysis and Processing (ICIAP), Lecture Notes in Computer Science (LNCS), vol. 13233, pp. 388-399, 2022.
[2] V.H. Le, M. Delalandre and D. Conte. Une large base de données pour la détection de segments de vidéos TV. Journées Francophones des Jeunes Chercheurs en Vision par Ordinateur (ORASIS), 2021.