The inconsistency among the unlabeled examples:

$$F_u(\hat{y}, S) = \sum_{i,j=1}^{n_u} S_{i,j}\, \exp(\hat{y}_i - \hat{y}_j)$$

where $S_{i,j}$ is the similarity between examples $x_i$ and $x_j$, $n_u$ is the number of unlabeled examples, and $\hat{y}_i$ is the real-valued pseudo-label assigned to the unlabeled example $x_i$.
The inconsistency between labeled and unlabeled examples:

$$F_l(y, \hat{y}, S) = \sum_{i=1}^{n_l} \sum_{j=1}^{n_u} S_{i,j}\, \exp(-2\, y_i\, \hat{y}_j)$$

where $y_i \in \{-1, +1\}$ is the true label of the $i$-th labeled example and $n_l$ is the number of labeled examples.
Objective function:

$$F(y, \hat{y}) = F_l(y, \hat{y}, S) + C\, F_u(\hat{y}, S) \tag{1}$$

where the constant $C > 0$ trades off the two inconsistency terms.
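To make the objective concrete, here is a minimal numpy sketch of $F$. The function name `inconsistency`, the block layout of $S$ (labeled examples ordered first), and the default $C = 1$ are illustrative assumptions, not anything fixed by the text above.

```python
import numpy as np

def inconsistency(S, y_l, y_hat, C=1.0):
    """F = F_l + C * F_u for a given pseudo-label vector y_hat.

    S     : (n_l + n_u, n_l + n_u) symmetric similarity matrix,
            with the n_l labeled examples ordered first.
    y_l   : (n_l,) true labels in {-1, +1}.
    y_hat : (n_u,) real-valued pseudo-labels of the unlabeled examples.
    """
    n_l = len(y_l)
    S_lu = S[:n_l, n_l:]   # labeled-vs-unlabeled block
    S_uu = S[n_l:, n_l:]   # unlabeled-vs-unlabeled block
    # F_l = sum_{i,j} S_ij * exp(-2 * y_i * y_hat_j)
    F_l = np.sum(S_lu * np.exp(-2.0 * np.outer(y_l, y_hat)))
    # F_u = sum_{i,j} S_ij * exp(y_hat_i - y_hat_j)
    F_u = np.sum(S_uu * np.exp(y_hat[:, None] - y_hat[None, :]))
    return F_l + C * F_u
```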
The optimal pseudo-labels can be found by minimizing $F$; formally,

$$\hat{y}^{*} = \arg\min_{\hat{y}}\; F(y, \hat{y}) \tag{2}$$
This is a convex optimization problem (a sum of exponentials of linear functions of $\hat{y}$) and can be solved efficiently by standard numerical methods, which have nothing to do with your learning algorithm.
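For instance, since the problem is smooth and convex, an off-the-shelf quasi-Newton routine finds the global minimizer. A sketch with hypothetical toy data, reusing `inconsistency` from above; the RBF similarity is one assumed choice, as the text does not fix $S$:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data: 4 labeled + 6 unlabeled points in the plane.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))
y_l = np.array([1, 1, -1, -1])

# One common (assumed) choice of similarity: an RBF kernel on the inputs.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
S = np.exp(-sq_dists)

res = minimize(lambda y_hat: inconsistency(S, y_l, y_hat),
               x0=np.zeros(6), method="L-BFGS-B")
print(res.x)  # optimal real-valued pseudo-labels of the 6 unlabeled points
```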
Now we do want to involve our learning algorithm: we want to use the same idea as problem (2) to improve a classifier. In other words, we want to put problem (2) into a machine learning scenario and still find the optimal $\hat{y}$.
Suppose you were going to solve problem (2) by gradient descent. During every step of the iteration, you would update $\hat{y}$ to get a smaller $F$. In the machine learning scenario, what you update instead is a classifier $H$, which predicts the pseudo-labels $\hat{y}_i = H(x_i)$, so that each update also yields a smaller $F$.
That is to say, we are going to substitute $\hat{y}_i$ with $H(x_i)$, or even with the prediction of an ensemble of classifiers.
We further expand our machine learning scenario to involve an ensemble of classifiers:

$$H(x) = \sum_{t=1}^{T} \alpha_t\, h_t(x), \qquad h_t(x) \in \{-1, +1\}, \quad \alpha_t > 0$$
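An ensemble of this form is just a weighted vote. Here is a minimal container, assuming weak learners with a scikit-learn-style `predict` returning values in $\{-1, +1\}$; the class name `Ensemble` is illustrative:

```python
import numpy as np

class Ensemble:
    """H(x) = sum_t alpha_t * h_t(x), grown greedily one classifier per round."""

    def __init__(self):
        self.alphas = []
        self.learners = []

    def add(self, alpha, h):
        self.alphas.append(alpha)
        self.learners.append(h)

    def score(self, X):
        """Real-valued H(x) for each row of X; zero before any round has run."""
        out = np.zeros(len(X))
        for alpha, h in zip(self.alphas, self.learners):
            out += alpha * h.predict(X)
        return out
```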
At the $t$-th iteration, our goal is to find:

$$\big(h_t^{*}, \alpha_t^{*}\big) = \arg\min_{h_t,\, \alpha_t}\; F\big(y,\, H^{(t-1)} + \alpha_t h_t\big) \tag{3}$$

where $H^{(t-1)}(x) = \sum_{s=1}^{t-1} \alpha_s h_s(x)$ is the ensemble learned so far.
To simplify the notation, define $H_i \equiv H^{(t-1)}(x_i)$, $h_i \equiv h_t(x_i)$, and $\alpha \equiv \alpha_t$.
Thus $\hat{y}_i = H_i + \alpha h_i$ for every unlabeled example.
Expand $F$ by substituting $\hat{y}_i$ with $H_i + \alpha h_i$:

$$F_l = \sum_{i=1}^{n_l} \sum_{j=1}^{n_u} S_{i,j}\, \exp\!\big(-2\, y_i\, (H_j + \alpha h_j)\big)$$

$$F_u = \sum_{i,j=1}^{n_u} S_{i,j}\, \exp\!\big((H_i + \alpha h_i) - (H_j + \alpha h_j)\big)$$
Now the problem becomes:

$$\min_{h,\, \alpha}\; F_l + C\, F_u \tag{4}$$
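Note that the substitution is purely mechanical: evaluating the problem-(4) objective for a candidate pair $(h, \alpha)$ is just a call to the earlier sketch with $\hat{y}_i = H_i + \alpha h_i$. The names `H_scores` and `h_preds` are again illustrative:

```python
def boosting_objective(S, y_l, H_scores, h_preds, alpha, C=1.0):
    """Problem (4): F evaluated at y_hat_i = H_i + alpha * h_i.

    H_scores : (n_u,) current ensemble scores H_i on the unlabeled examples.
    h_preds  : (n_u,) candidate weak-learner predictions h_i in {-1, +1}.
    """
    return inconsistency(S, y_l, H_scores + alpha * h_preds, C)
```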
Problem (4) involves products of $\alpha$ and $h_i$ inside the exponentials, making it nonlinear and, hence, difficult to optimize. We are going to apply bound optimization below to solve this problem.
We first further expand $F$, factoring out the parts that do not depend on $h$ and $\alpha$:

$$F = \sum_{i=1}^{n_l} \sum_{j=1}^{n_u} S_{i,j}\, e^{-2 y_i H_j}\, e^{-2 \alpha y_i h_j} + C \sum_{i,j=1}^{n_u} S_{i,j}\, e^{H_i - H_j}\, e^{\alpha (h_i - h_j)}$$
Then we find an upper bound of $F$. By the convexity of the exponential, $e^{\alpha(h_i - h_j)} \le \frac{1}{2}\big(e^{2\alpha h_i} + e^{-2\alpha h_j}\big)$, which gives

$$F \le F_1 = \sum_{i=1}^{n_l} \sum_{j=1}^{n_u} S_{i,j}\, e^{-2 y_i H_j}\, e^{-2 \alpha y_i h_j} + \frac{C}{2} \sum_{i,j=1}^{n_u} S_{i,j}\, e^{H_i - H_j} \big(e^{2\alpha h_i} + e^{-2\alpha h_j}\big)$$
Flip $i$ and $j$ in the first term of $F_1$ (using the symmetry of $S$), so that $i$ always indexes the unlabeled examples; doing the same to the $e^{-2\alpha h_j}$ half of the second term gives

$$F_1 = \sum_{i=1}^{n_u} \left[\, \sum_{j=1}^{n_l} S_{i,j}\, e^{-2 y_j H_i}\, e^{-2 \alpha y_j h_i} + \frac{C}{2} \sum_{j=1}^{n_u} S_{i,j} \big( e^{H_i - H_j}\, e^{2\alpha h_i} + e^{H_j - H_i}\, e^{-2\alpha h_i} \big) \right]$$
Define:

$$p_i = \sum_{j=1}^{n_l} S_{i,j}\, e^{-2 H_i}\, \delta(y_j, 1) + \frac{C}{2} \sum_{j=1}^{n_u} S_{i,j}\, e^{H_j - H_i}$$

$$q_i = \sum_{j=1}^{n_l} S_{i,j}\, e^{2 H_i}\, \delta(y_j, -1) + \frac{C}{2} \sum_{j=1}^{n_u} S_{i,j}\, e^{H_i - H_j}$$

where $\delta(a, b) = 1$ if $a = b$ and $0$ otherwise.
Note that when calculating $p_i$ and $q_i$, the current ensemble $H$ is fixed, and $p_i$ and $q_i$ are functions of $H$ alone; they do not depend on $h$ or $\alpha$.
$p_i$ and $q_i$ can be interpreted as the confidence in classifying the unlabeled example $x_i$ into the positive class and the negative class, respectively.
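In code, $p$ and $q$ are two dense vector computations over the similarity blocks. A numpy sketch under the same (assumed) layout of $S$ as before:

```python
import numpy as np

def confidences(S, y_l, H_scores, C=1.0):
    """p_i and q_i for every unlabeled example, with the ensemble H held fixed."""
    n_l = len(y_l)
    S_ul = S[n_l:, :n_l]   # unlabeled-vs-labeled block
    S_uu = S[n_l:, n_l:]   # unlabeled-vs-unlabeled block
    # Labeled part: only the delta-selected (same-class) neighbors contribute.
    p = np.exp(-2.0 * H_scores) * S_ul[:, y_l == 1].sum(axis=1)
    q = np.exp(2.0 * H_scores) * S_ul[:, y_l == -1].sum(axis=1)
    # Unlabeled part: pairwise exp(H_j - H_i) and exp(H_i - H_j) terms.
    diff = H_scores[None, :] - H_scores[:, None]   # diff[i, j] = H_j - H_i
    p = p + (C / 2.0) * np.sum(S_uu * np.exp(diff), axis=1)
    q = q + (C / 2.0) * np.sum(S_uu * np.exp(-diff), axis=1)
    return p, q
```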
Problem (4) is then equivalent to

$$\min_{h,\, \alpha}\; \sum_{i=1}^{n_u} \big( e^{-2\alpha h_i}\, p_i + e^{2\alpha h_i}\, q_i \big) \tag{5}$$
The expression in (5) is difficult to optimize since the weight $\alpha$ and the classifier $h$ are still coupled together inside the exponentials. We simplify the problem further using the upper bound $e^{-2\alpha h_i} \le e^{2\alpha} + e^{-2\alpha} - 1 - 2\alpha h_i$ (and symmetrically $e^{2\alpha h_i} \le e^{2\alpha} + e^{-2\alpha} - 1 + 2\alpha h_i$), which holds because $h_i \in \{-1, +1\}$.
Problem (5) is then equivalent to

$$\min_{h,\, \alpha}\; \big(e^{2\alpha} + e^{-2\alpha} - 1\big) \sum_{i=1}^{n_u} (p_i + q_i) \;-\; 2\alpha \sum_{i=1}^{n_u} h_i\, (p_i - q_i) \tag{6}$$

Since the first term does not depend on $h$, for any fixed $\alpha > 0$ the optimal classifier maximizes $\sum_i h_i (p_i - q_i)$: train $h$ on the unlabeled examples with pseudo-labels $\mathrm{sign}(p_i - q_i)$ and sample weights $|p_i - q_i|$.
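Putting the pieces together, one round of the resulting boosting scheme might look like the following sketch. The decision stump is an assumed weak learner (any base classifier accepting sample weights would do), the name `boosting_round` and the $10^{-12}$ smoothing are illustrative, and $\alpha$ is the closed-form minimizer of (5) once $h$ is fixed:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boosting_round(S, X_u, y_l, H_scores, C=1.0):
    """One round: pseudo-label from (p, q), fit a weak learner, pick alpha.

    X_u : (n_u, d) feature matrix of the unlabeled examples.
    """
    p, q = confidences(S, y_l, H_scores, C)
    z = np.where(p > q, 1, -1)   # pseudo-label sign(p_i - q_i)
    w = np.abs(p - q)            # sample weight |p_i - q_i|
    h = DecisionTreeClassifier(max_depth=1).fit(X_u, z, sample_weight=w)
    h_i = h.predict(X_u)
    # Closed-form alpha minimizing (5) for this fixed h:
    # A is the confidence mass agreeing with h's predictions, B the rest.
    A = p[h_i == 1].sum() + q[h_i == -1].sum()
    B = p[h_i == -1].sum() + q[h_i == 1].sum()
    alpha = 0.25 * np.log((A + 1e-12) / (B + 1e-12))
    return h, alpha
```

Each round then appends `(alpha, h)` to the ensemble and recomputes `H_scores` on the unlabeled examples before the next round.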