Movie IMDb Rating Predictor

EECS 349, Machine Learning

Zhilin Chen

Conclusion

    Overall, we consider our predictor successful, for the following reasons:

         1) Simple descriptive analysis of the data already gives useful insight into the current movie industry, for example the top K genres, directors, writers, and actors in a specific year or period.

         2) Our model achieves high accuracy on our problem, well above the 38.62% obtained by the ZeroR baseline.

         3) The model appears stable: under-fitting and over-fitting do not seem to be a concern.

         4) Even when the training set is small, our model still achieves satisfactory accuracy in predicting the IMDb rating of a given movie.
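
     To make the ZeroR comparison in point 2 concrete, here is a minimal sketch of how such a baseline can be computed: ZeroR always predicts the most frequent class seen in training. The function name zero_r_accuracy and the toy rating classes are hypothetical and serve only as illustration; they are not the project's data or code.

```python
from collections import Counter


def zero_r_accuracy(train_labels, test_labels):
    """Return the majority training label and its accuracy on the test labels."""
    # ZeroR ignores all features and always predicts the most common class.
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == majority_label)
    return majority_label, hits / len(test_labels)


# Hypothetical rating classes, for illustration only.
train = ["6-7", "6-7", "7-8", "5-6", "6-7", "7-8"]
test = ["6-7", "7-8", "6-7", "5-6"]

majority, baseline = zero_r_accuracy(train, test)
print(f"ZeroR always predicts {majority}; baseline accuracy = {baseline:.2%}")
```

     Any trained model should beat this number to be considered useful, which is why the baseline is reported alongside the model's accuracy.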

Suggestions for future work

     We should keep enlarging the dataset for our predictor. We believe that, with a large enough dataset, we could achieve satisfactory results without any pre-processing, and we might uncover relationships between specific writers, actors, or directors and IMDb ratings.

     We currently treat the task as a classification problem. It would be better to treat it as a regression problem and develop a more specific model that predicts the exact IMDb rating of a movie.
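
     The difference between the two framings can be sketched as follows. The bin width of 1.0, the helper names rating_to_class and mean_absolute_error, and the toy ratings are all assumptions made for illustration; the binning actually used in the project may differ. The point is that class accuracy cannot distinguish predictions that fall in the same bin, while a regression error such as mean absolute error measures how far each prediction is from the exact rating.

```python
import math


def rating_to_class(rating, bin_width=1.0):
    """Classification view: map a rating to the index of its bin (assumed width)."""
    return int(math.floor(rating / bin_width))


def mean_absolute_error(y_true, y_pred):
    """Regression view: average distance between exact and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ratings, for illustration only.
true_ratings = [7.9, 7.1, 6.4]
predicted = [7.1, 7.9, 6.6]

# Every prediction lands in the correct bin, so class accuracy looks perfect.
matches = sum(
    rating_to_class(t) == rating_to_class(p)
    for t, p in zip(true_ratings, predicted)
)
class_accuracy = matches / len(true_ratings)

# The regression metric still reveals the remaining error in rating points.
mae = mean_absolute_error(true_ratings, predicted)
print(f"class accuracy = {class_accuracy:.2f}, MAE = {mae:.2f}")
```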