Abstract
In real-world applications such as law enforcement and video retrieval,
one often needs to search for a certain person in long videos with
just one portrait.
This is much more challenging than the conventional settings for person
re-identification, as the search may need to be carried out in environments
different from where the portrait was taken.
In this paper, we aim to tackle this challenge and propose a novel framework,
which takes into account the identity invariance along a tracklet,
thus allowing person identities to be propagated via
both the visual and the temporal links.
We also develop a novel scheme called Progressive Propagation via Competitive Consensus,
which significantly improves the reliability of the propagation process.
To promote the study of person search,
we construct a large-scale benchmark, which contains 127K
manually annotated tracklets from 192 movies.
Experiments show that our approach remarkably outperforms mainstream person re-id
methods, raising the mAP from 42.16% to 62.27%.
Progressive Propagation via Competitive Consensus
Competitive Consensus
Competitive Consensus is a novel scheme for label propagation.
Compared with conventional linear diffusion, it improves reliability by propagating the most confident information.
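The contrast with linear diffusion can be illustrated with a small sketch. It assumes that "the most confident information" means taking, for each identity, the maximum weighted score over a node's neighbors and then renormalizing; this update rule, the function names, and the toy numbers are illustrative assumptions rather than the exact formulation in the paper.

```python
def linear_diffusion(neighbors):
    """Weighted average of the neighbors' identity distributions."""
    k = len(neighbors[0][1])
    total_w = sum(w for w, _ in neighbors)
    return [sum(w * p[c] for w, p in neighbors) / total_w for c in range(k)]


def competitive_consensus(neighbors):
    """Per identity, keep only the strongest weighted vote, then renormalize.

    Assumed update rule for illustration, not necessarily the paper's.
    """
    k = len(neighbors[0][1])
    best = [max(w * p[c] for w, p in neighbors) for c in range(k)]
    total = sum(best)
    return [b / total for b in best]


# Each neighbor is (link weight, [P(identity A), P(identity B)]).
# One strongly linked, confident neighbor votes for A;
# three weakly linked, less confident neighbors vote for B.
neighbors = [
    (0.9, [0.95, 0.05]),
    (0.5, [0.2, 0.8]),
    (0.5, [0.2, 0.8]),
    (0.5, [0.2, 0.8]),
]

linear = linear_diffusion(neighbors)
consensus = competitive_consensus(neighbors)
print("linear diffusion:     ", linear)
print("competitive consensus:", consensus)
```

In this toy case the averaged result is pulled toward identity B by the three weak neighbors, while the max-based update follows the single confident neighbor and favors identity A.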
Progressive Propagation
Progressive Propagation is a simple but effective scheme that accelerates the propagation process and reduces the effects of noise:
at each iteration, it freezes the labels of a certain fraction of nodes according to their confidence.
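The freezing idea can be sketched as a loop over a small graph. The sketch makes several assumptions that are not taken from the paper: confidence is the largest entry of a node's identity distribution, a fixed number of nodes is frozen per iteration, frozen nodes become one-hot, and the per-iteration update is a max-based rule. The function `progressive_propagation` and its parameter `freeze_fraction` are hypothetical names.

```python
import math


def progressive_propagation(weights, probs, seeds, freeze_fraction=0.5):
    """Propagate labels, freezing the most confident free nodes each round."""
    n, k = len(probs), len(probs[0])
    probs = [row[:] for row in probs]
    frozen = set(seeds)
    # Assumed schedule: freeze a fixed share of the initially free nodes per round.
    per_iter = max(1, math.ceil(freeze_fraction * (n - len(frozen))))

    while len(frozen) < n:
        # Synchronous update of the free nodes (assumed max-based rule).
        updated = [row[:] for row in probs]
        for i in range(n):
            if i in frozen:
                continue
            best = [max(weights[i][j] * probs[j][c] for j in range(n) if j != i)
                    for c in range(k)]
            total = sum(best)
            if total > 0:
                updated[i] = [b / total for b in best]
        probs = updated

        # Freeze the most confident free nodes as one-hot labels.
        free = sorted((i for i in range(n) if i not in frozen),
                      key=lambda i: max(probs[i]), reverse=True)
        for i in free[:per_iter]:
            label = probs[i].index(max(probs[i]))
            probs[i] = [1.0 if c == label else 0.0 for c in range(k)]
            frozen.add(i)

    return [row.index(max(row)) for row in probs]


# Toy chain 0 - 1 - 2 - 3 - 4 with a weak link between nodes 2 and 3.
n = 5
weights = [[0.0] * n for _ in range(n)]
for a, b, w in [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.2), (3, 4, 0.9)]:
    weights[a][b] = weights[b][a] = w

# Nodes 0 and 4 are labeled seeds; the others start undecided.
probs = [[1.0, 0.0], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.0, 1.0]]
labels = progressive_propagation(weights, probs, seeds=[0, 4])
print(labels)
```

In this example the ambiguous middle node is decided only after its more confident neighbors have been frozen, so it receives clean one-hot votes instead of partially mixed ones.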
Results
Accuracy on CSM
| Method | IN mAP | IN R@1 | IN R@3 | IN R@5 | ACROSS mAP | ACROSS R@1 | ACROSS R@3 | ACROSS R@5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FACE | 53.33 | 76.19 | 91.11 | 96.34 | 42.16 | 53.15 | 61.12 | 64.33 |
| IDE | 17.17 | 35.89 | 72.05 | 88.05 | 1.67 | 1.68 | 4.46 | 6.85 |
| FACE+IDE | 53.71 | 74.99 | 90.30 | 96.08 | 40.43 | 49.04 | 58.16 | 62.10 |
| LP | 8.19 | 39.70 | 70.11 | 87.34 | 0.37 | 0.41 | 1.60 | 5.04 |
| PPCC-v (ours) | 62.37 | 84.31 | 94.89 | 98.03 | 59.58 | 63.26 | 74.89 | 78.88 |
| PPCC-vt (ours) | 63.49 | 83.44 | 94.40 | 97.92 | 62.27 | 62.54 | 73.86 | 77.44 |
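For readers unfamiliar with the metrics, the following sketch shows how mAP and R@k values of this kind can be computed. The mAP part follows the standard definition of average precision over a ranked list. The R@k part assumes that R@k is the fraction of items whose correct identity appears among the top k ranked identities; that reading, like the function names, is an assumption and may differ from the exact evaluation protocol used for CSM.

```python
def average_precision(ranked_relevance):
    """Average precision of one ranked list of True/False relevance flags."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)


def recall_at_k(ranked_identities, true_identities, k):
    """Fraction of items whose true identity is within the top k (assumed definition)."""
    hits = sum(1 for ranked, truth in zip(ranked_identities, true_identities)
               if truth in ranked[:k])
    return hits / len(true_identities)


# Two toy queries: relevance of each retrieved tracklet, in ranked order.
queries = [[True, False, True], [False, True]]
print(mean_average_precision(queries))

# Two toy tracklets with ranked identity guesses; both truly belong to "a".
ranked = [["a", "b", "c"], ["b", "c", "a"]]
print(recall_at_k(ranked, ["a", "a"], k=1), recall_at_k(ranked, ["a", "a"], k=3))
```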
Examples of Search Results
CSM Dataset
Cast Search in Movies (CSM) contains more than 127K tracklets (with more than 11M person instances)
of 1,218 cast identities from 192 movies.
The bounding boxes and identities of key frames are manually annotated.
The bounding boxes of the other instances in each tracklet are obtained with a person detector based on Faster R-CNN.
Comparison between CSM and Related Datasets
| Dataset | CSM | MARS | iLIDS | PRID | Market | PSD | PIPA |
| --- | --- | --- | --- | --- | --- | --- | --- |
| task | search | re-id | re-id | re-id | re-id | det.+re-id | recog. |
| type | video | video | video | video | video | image | image |
| identities | 1,218 | 1,261 | 300 | 200 | 1,501 | 8,432 | 2,356 |
| tracklets | 127K | 20K | 600 | 400 | - | - | - |
| instances | 11M | 1M | 44K | 40K | 32K | 96K | 63K |
Examples of CSM
Download
Citation
@inproceedings{huang2018person,
title={Person Search in Videos with One Portrait Through Visual and Temporal Links},
author={Huang, Qingqiu and Liu, Wentao and Lin, Dahua},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
pages={425--441},
year={2018}
}
Contact