Note for video Machine Learning and Data Mining——training vs Testing

时间:2023-03-09 00:42:14
Note for video Machine Learning and Data Mining——training vs Testing

Here is the note for lecture five.



There will be several points 



1. Training and Testing 

Both of these are about data. Training is using the data to get a fine hypothesis, and testing is not.

If we get a final hypothesis and want to test it, it turns to testing.



2. Another way to verify that learning is feasible. Firstly, let me show you an inequlity.

Note for video Machine Learning and Data Mining——training vs Testing

watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQveXVtYW8xOTkyMTAwNg==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/SouthEast" alt="" style="text-align:center">

As it mentions on note 2, in the inequlity, the complexity of your hypothesis can be reflected by M. 

However, M is almost meaningless, and because of this, your hypothesis will be useless.
If we can replace 

M with another quantity, and the quantity is not meaningless, that means not infinite, and then we can start

our learning in an actual model.(our learning is feasible)

What is M? It mentioned before that M is the maxnum of hypothesis. So can we figure number of hypothesis to 

replace M? The answer turns true.

the maxnum of hypothesis are different choice of different points. If the number of uncertain is a, and the number

of choice for uncertain is b, then the maxnum of hypothesis come out, its a^b.

But it seems not smoothly like that, there are several hypothesis could not be built up,
generlly the number of hypothesis 

that can be built are less than a^b.

Let's come back to the inequlity, we can prove it mathematically that
if M can be replaced by a polynomial, that means the number of hypothesis in a set is not infinite, then we can declare that learning is feasible using this hypothesis set. There is a new statement that wil be proved next lecture, if the maxnum of hypothesis
is less than its max-value, the number of hypothesis could be replaced by a polynimial, that is, learning is feasible using the hypothesis set.

According to above statement, if there are several hypothesis can not be built up, then set for the hypothesis will be feasible for learning.