
So "bag of words" = "count/tf-idf vectorizer + logistic/ridge/lasso regression"?

Also: a vector space is a set of things that can be added and multiplied by a scalar. So a vector space model should be the proverbial representation where "queen - woman + man = king".

Am I being an insufferable pedant? I follow text analysis only very lightly and keep losing the thread.



Yes. In the context of text classification, a bag-of-words model usually refers to exactly that, or to the same count/tf-idf features combined with some other linear model such as a linear SVM or Naive Bayes.

The "queen - woman + man = king" arithmetic comes up when you try to model word semantics, such as with word2vec, where each vector represents a word. In a document classification task, each vector represents a whole document instead.
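
To make the "vectors represent documents" point concrete, here is a rough sketch in plain Python, not anything from a particular library. The helper names (build_vocabulary, count_vector, add, scale) and the two toy documents are made up for illustration, and tokenization is assumed to be simple whitespace splitting. A real pipeline would feed vectors like these into a linear classifier.

```python
from collections import Counter

def build_vocabulary(documents):
    """Sorted list of every distinct lowercase token across the documents."""
    return sorted({token for doc in documents for token in doc.lower().split()})

def count_vector(document, vocabulary):
    """Bag-of-words vector: one count per vocabulary word, word order discarded."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

def add(u, v):
    """Component-wise vector addition."""
    return [a + b for a, b in zip(u, v)]

def scale(c, v):
    """Multiply every component by the scalar c."""
    return [c * a for a in v]

# Hypothetical toy corpus, purely for illustration.
docs = ["the cat sat", "the dog sat on the cat"]
vocab = build_vocabulary(docs)
vectors = [count_vector(d, vocab) for d in docs]

# Adding two document vectors gives the vector of the concatenated document.
combined = count_vector(docs[0] + " " + docs[1], vocab)
same_as_sum = add(vectors[0], vectors[1]) == combined
```

So these do live in a vector space in the pedantic sense: each document is a point in a space with one dimension per vocabulary word, and addition and scalar multiplication are well defined. What they lack is the semantic structure that word embeddings are trained to have.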



