Multi-class Image Classification - Sparsity does it Better
Sean Ryan Fanello, Nicoletta Noceti, Giorgio Metta, Francesca Odone
2013
Abstract
It is well established that sparse representations improve the accuracy and overall performance of many image classification systems. This paper addresses the problem of finding sparse and discriminative representations of images in multi-class settings. We propose a new regularized functional, a modification of the standard dictionary learning problem, designed to learn one dictionary per class. In this formulation, positive examples are constrained to have sparse descriptions, while negative examples also contribute and are forced to be described in a denser and smoother way. The resulting descriptions are meaningful for a given class and highly discriminative with respect to the other classes, and at the same time they guarantee real-time performance. We also propose a new approach to the classification of single image features, based on the dictionary response. With this formulation, local features can be classified directly from their sparsity factor, without losing statistical information or spatial configuration, and with greater robustness to clutter and occlusions. We validate the proposed approach in two image classification scenarios, namely single-instance object recognition and object categorization. The experiments show the effectiveness of the method in terms of performance and support its generality.
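The abstract does not state the proposed functional, so the following is only a minimal sketch of the general idea of classifying a single local feature against per-class dictionaries. It assumes the standard lasso sparse-coding problem, solved with ISTA, and uses "lowest lasso cost" as a stand-in for the dictionary response; it is not the authors' formulation. All function names, the parameter `lam`, and the toy dictionaries are hypothetical.

```python
def soft_threshold(v, t):
    """Element-wise proximal operator of t * ||.||_1."""
    return [max(abs(a) - t, 0.0) * (1.0 if a > 0 else -1.0) for a in v]

def reconstruct(atoms, u):
    """Linear combination sum_j u[j] * atoms[j]."""
    return [sum(c * atom[i] for c, atom in zip(u, atoms))
            for i in range(len(atoms[0]))]

def lasso_cost(atoms, x, u, lam):
    """0.5 * ||x - D u||^2 + lam * ||u||_1."""
    residual = [xi - ri for xi, ri in zip(x, reconstruct(atoms, u))]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(c) for c in u)

def sparse_code(atoms, x, lam=0.1, n_iter=200):
    """ISTA iterations for the lasso problem above."""
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so its inverse is a safe (conservative) step size.
    step = 1.0 / sum(a * a for atom in atoms for a in atom)
    u = [0.0] * len(atoms)
    for _ in range(n_iter):
        residual = [ri - xi for ri, xi in zip(reconstruct(atoms, u), x)]
        grad = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        u = soft_threshold([c - step * g for c, g in zip(u, grad)], step * lam)
    return u

def classify_feature(dictionaries, x, lam=0.1):
    """Index of the class dictionary with the lowest lasso cost for x."""
    costs = [lasso_cost(D, x, sparse_code(D, x, lam), lam) for D in dictionaries]
    return costs.index(min(costs))

# Toy example (hypothetical data): each dictionary is a list of atoms.
dictionaries = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],  # class 0
    [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],  # class 1
]
x = [0.0, 0.0, 1.0, 0.5]  # lies in the span of the class-1 atoms
predicted = classify_feature(dictionaries, x)
```

In this toy setting the class-1 dictionary reconstructs the feature with a small penalized cost, while the class-0 dictionary cannot represent it at all, so the feature is assigned to class 1. A learned, overcomplete dictionary per class would take the place of the hand-written atoms.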
Paper Citation
in Harvard Style
Fanello S., Noceti N., Metta G. and Odone F. (2013). Multi-class Image Classification - Sparsity does it Better. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013) ISBN 978-989-8565-47-1, pages 800-807. DOI: 10.5220/0004295908000807
in Bibtex Style
@conference{visapp13,
author={Sean Ryan Fanello and Nicoletta Noceti and Giorgio Metta and Francesca Odone},
title={Multi-class Image Classification - Sparsity does it Better},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)},
year={2013},
pages={800-807},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004295908000807},
isbn={978-989-8565-47-1},
}
in EndNote Style
TY  - CONF 
JO  - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)
TI  - Multi-class Image Classification - Sparsity does it Better
SN  - 978-989-8565-47-1
AU  - Fanello S. 
AU  - Noceti N. 
AU  - Metta G. 
AU  - Odone F. 
PY  - 2013
SP  - 800
EP  - 807
DO  - 10.5220/0004295908000807