Saving A Feature Vector For New Data In Scikit-learn
To create a machine learning algorithm I made a list of dictionaries and used scikit's DictVectorizer to make a feature vector for each item. I then created an SVM model from a dat
Solution 1:
How do I get new data to conform to the dimensions of the training vectors?
By using the transform method instead of fit_transform. The latter learns a new vocabulary from the data set you feed it, so the columns it produces will not line up with the ones the model was trained on.
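A minimal sketch of the difference, using made-up feature dictionaries (the names color, size, and shape are placeholders, not taken from the question). The vectorizer is fitted once on the training data and then only transform is called on new data:

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical training and new samples, for illustration only.
train = [{"color": "red", "size": 1.0}, {"color": "blue", "size": 2.0}]
new = [{"color": "red", "size": 3.0, "shape": "round"}]  # "shape" was never seen in training

vec = DictVectorizer()
X_train = vec.fit_transform(train)  # learns the feature-to-column mapping
X_new = vec.transform(new)          # reuses that mapping; unseen features are ignored

print(X_train.shape, X_new.shape)   # both have the same number of columns
```

Because transform reuses the mapping learned during fitting, the new vectors have the same width as the training vectors and can be passed straight to the trained SVM.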
Is there a way to use pickle to save the feature vector?
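Pickle the trained vectorizer. Even better, make a Pipeline of the vectorizer and the SVM and pickle that. You can use joblib.dump for efficient pickling (older scikit-learn releases exposed it as sklearn.externals.joblib.dump, but that alias has since been removed, so import joblib directly).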
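One way this could look, as a sketch rather than the asker's actual setup: the data, labels, linear kernel, and file name below are all assumptions, and the standalone joblib package (installed alongside scikit-learn) is imported directly.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical training data and labels, for illustration only.
train = [
    {"color": "red", "size": 1.0},
    {"color": "blue", "size": 5.0},
    {"color": "red", "size": 1.5},
    {"color": "blue", "size": 4.5},
]
labels = [0, 1, 0, 1]

# The pipeline vectorizes the dictionaries and then fits the SVM.
model = make_pipeline(DictVectorizer(), SVC(kernel="linear"))
model.fit(train, labels)

# Persist the vectorizer and classifier together in one file.
path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, path)

# Later (for example in another process): load and predict on raw dictionaries.
restored = joblib.load(path)
predictions = restored.predict([{"color": "red", "size": 1.2}])
```

Saving the whole pipeline means the loaded object accepts raw dictionaries, so there is no separate vectorizer file to keep in sync with the model.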
(Aside: the vectorizer is faster if you pass it the boolean True rather than the string "True".)
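The two forms are also encoded differently, which the following sketch shows with a hypothetical feature named flag. A string value is treated as a category and gets a one-hot column named after the value, while a boolean is treated as a number and stored under the feature name itself. The sketch does not measure speed.

```python
from sklearn.feature_extraction import DictVectorizer

# "flag" is a placeholder feature name, used only for illustration.
bool_vec = DictVectorizer().fit([{"flag": True}])
str_vec = DictVectorizer().fit([{"flag": "True"}])

print(bool_vec.vocabulary_)  # numeric feature keyed by its name
print(str_vec.vocabulary_)   # categorical feature keyed by name=value
```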