Image classification is intrinsically a multiclass, nonlinear classification task. Support Vector Machines (SVMs) have been successfully applied to this problem, using one-vs-one or one-vs-all learning schemes to enable multiclass classification, and kernels designed for image classification to handle nonlinearities. To classify an image at test time, an SVM matches it against a small subset of the training data, namely, its support vectors (SVs). In the multiclass case, though, the union of the sets of SVs of the individual binary SVMs may nearly coincide with the full training set, potentially yielding an unacceptable computational complexity at test time. To overcome this limitation, in this work we propose a principled reduction method that approximates the discriminant function of a multiclass SVM by jointly optimizing the full set of SVs along with their coefficients. We show that our approach can reduce test-time computational complexity by up to two orders of magnitude without significantly affecting recognition accuracy, by creating a super-sparse, budgeted set of virtual vectors.
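To make the test-time cost concrete, the following is a minimal sketch, not the method proposed in this work. It assumes a Gaussian RBF kernel and toy random coefficients, and it simplifies the reduction step: the virtual vectors are fixed to a random subset of the SVs and only their coefficients are fitted by ridge-regularized least squares, whereas the proposed approach optimizes vectors and coefficients jointly. All names (`rbf_kernel`, `discriminant`, `fit_reduced_coeffs`) and parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def discriminant(X, vectors, coeffs, bias, gamma=0.5):
    """Per-class scores f_k(x) = sum_j coeffs[j, k] * k(x, vectors[j]) + bias[k].

    The cost per test sample is one kernel evaluation per row of `vectors`.
    """
    return rbf_kernel(X, vectors, gamma) @ coeffs + bias

def fit_reduced_coeffs(X, target_scores, virtual, bias, gamma=0.5, ridge=1e-6):
    """Ridge least-squares coefficients for a FIXED set of virtual vectors.

    Simplification (assumption): the vectors themselves are not optimized here.
    """
    K = rbf_kernel(X, virtual, gamma)
    A = K.T @ K + ridge * np.eye(K.shape[1])
    return np.linalg.solve(A, K.T @ (target_scores - bias))

# Toy setup (assumed values): 200 SVs in the union, 3 classes, budget of 10.
rng = np.random.default_rng(0)
n_sv, n_classes, dim, budget = 200, 3, 2, 10
svs = rng.normal(size=(n_sv, dim))
alpha = rng.normal(size=(n_sv, n_classes))  # stand-in for trained coefficients
bias = np.zeros(n_classes)

# Full discriminant: n_sv kernel evaluations per sample.
full_scores = discriminant(svs, svs, alpha, bias)

# Reduced discriminant: only `budget` kernel evaluations per sample.
virtual = svs[rng.choice(n_sv, size=budget, replace=False)]
beta = fit_reduced_coeffs(svs, full_scores, virtual, bias)
reduced_scores = discriminant(svs, virtual, beta, bias)

speedup = n_sv / budget  # ratio of kernel evaluations per test sample
agreement = np.mean(full_scores.argmax(1) == reduced_scores.argmax(1))
```

In this sketch `speedup` only counts kernel evaluations, and `agreement` measures how often the reduced model predicts the same class as the full one on the toy data. How much accuracy survives a given budget depends on the data and on optimizing the virtual vectors themselves, which this simplified version does not do.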