When a multi-label classifier outputs a real-valued score for each class, a well-known design strategy consists of tuning the corresponding decision thresholds by optimising the performance measure of interest on validation data.
In this paper we focus on the F-measure, which is widely used in multi-label problems.
We derive two properties of the micro-averaged F-measure, viewed as a function of the threshold values, which allow its global maximum to be found by an optimisation strategy with an upper bound on computational complexity of $O(n^2 N^2)$, where $N$ and $n$ are respectively the number of classes and of validation samples.
So far, only a suboptimal threshold selection rule and a greedy algorithm without any optimality guarantee were known for this task.
We then devise a possible optimisation algorithm based on our strategy, and evaluate it on three benchmark multi-label data sets.
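
To make the objective concrete, the following minimal sketch evaluates the micro-averaged F-measure as a function of the per-class thresholds. It is an illustration only, not the optimisation algorithm proposed here. It assumes the balanced case (F1), assumes that a label is assigned whenever its score is greater than or equal to its threshold, and returns 0 when no label is either predicted or relevant. The function name `micro_f1` and the toy data are hypothetical.

```python
def micro_f1(scores, labels, thresholds):
    """Micro-averaged F1 for per-class thresholds (illustrative sketch).

    scores:     n rows of N real-valued class scores
    labels:     n rows of N binary ground-truth labels (0 or 1)
    thresholds: N decision thresholds, one per class
    """
    tp = fp = fn = 0
    # Counts are pooled over all samples and all classes (micro-averaging).
    for row_scores, row_labels in zip(scores, labels):
        for s, y, t in zip(row_scores, row_labels, thresholds):
            predicted = s >= t  # assumed decision rule
            if predicted and y:
                tp += 1
            elif predicted:
                fp += 1
            elif y:
                fn += 1
    denom = 2 * tp + fp + fn
    # Assumed convention: F1 is 0 when nothing is predicted or relevant.
    return 2 * tp / denom if denom else 0.0


# Toy validation set: n = 3 samples, N = 2 classes (hypothetical values).
scores = [[0.9, 0.2], [0.4, 0.7], [0.1, 0.6]]
labels = [[1, 0], [1, 1], [0, 0]]

f_default = micro_f1(scores, labels, [0.5, 0.5])   # 2/3
f_tuned = micro_f1(scores, labels, [0.3, 0.65])    # 1.0
```

On this toy data, moving the two thresholds from (0.5, 0.5) to (0.3, 0.65) raises the micro-averaged F1 from 2/3 to 1, which shows why the thresholds have to be tuned jointly: the pooled counts couple all classes in a single ratio.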