1- Discriminant Functions: Perceptron

2- Probabilistic Generative Models: LDA, QDA

3- Probabilistic Discriminative Models: logistic regression

I’d like to add some points about the difficulty of prediction.

The text mentioned:

“The difficulty lies in the noised data where same input x may result in different outputs y. That is because x is not enough for fully determining the value of y”

1- We don’t have the full input space (or we cannot consider all of it!), so we are trying to predict y=f(x) from a subset of the input space. For example, consider trying to predict y=x^2 using only {(x=1,y=1),(x=2,y=4)}. There are lots of functions that can go through these points.
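As a small sketch of this point (the function names are my own): a straight line passes through those two points just as perfectly as y=x^2 does, so the sample alone cannot distinguish the two hypotheses.

```python
# Sketch: two different functions that agree on every sample point.
def f_true(x):   # the function we are "really" trying to learn
    return x ** 2

def f_line(x):   # a line through the same two points: y = 3x - 2
    return 3 * x - 2

samples = [(1, 1), (2, 4)]

# Both hypotheses match the data exactly...
print(all(f_true(x) == y == f_line(x) for x, y in samples))  # True
# ...but they diverge as soon as we leave the sample.
print(f_true(3), f_line(3))  # 9 7
```

With only two points, infinitely many such functions fit the data perfectly, which is exactly why a subset of the input space cannot fully determine f.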

2- We don’t know the good estimator! For example, if you want to predict the mean of a population, you can use the sample mean estimator $$\bar{X}=\frac{1}{n}\sum_{i=1}^n{x_i}$$. It is an UNBIASED (and consistent) estimator: when it is fed a very large data set, it estimates the true mean better. But with a learning model as the estimator, if you have a very large data set and your model is not powerful enough to cope with it, your estimate can be bad (underfitting, i.e. a high-bias model). On the other hand, if you have a small amount of data and your model is very powerful, your estimate can be bad again (overfitting, i.e. high variance).

I think this article gives a big picture for the rest of my study in this field.

I am new to this field (unsupervised feature learning).

I am trying to learn about sparse coding through the article “Efficient sparse coding algorithms” (NIPS 2006). Is it a good place to start?