How To Seed The Random Number Generator For Scikit-learn?
I'm trying to write a unit test for some of my code that uses scikit-learn. However, my unit tests seem to be non-deterministic. AFAIK, the only places in my code where scikit-lear
Solution 1:
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data, iris.target
RANDOM_SEED = 5
# Seed both the estimator and the train/test split
lr = linear_model.LogisticRegression(random_state=RANDOM_SEED)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=RANDOM_SEED)
lr.fit(X_train, y_train)
lr.score(X_test, y_test)
This produced 0.93333333333333335 on several runs now, so the way you did it seems OK. Another way is to call np.random.seed() before running the code, or to use Sacred for documented randomness. Passing random_state is what the docs describe:
If your code relies on a random number generator, it should never use functions like numpy.random.random or numpy.random.normal. This approach can lead to repeatability issues in unit tests. Instead, a numpy.random.RandomState object should be used, which is built from a random_state argument passed to the class or function.
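To make the quoted advice concrete, here is a minimal sketch of a function that takes a random_state argument and builds its own numpy.random.RandomState from it instead of touching the global generator. The names make_rng and shuffled_indices are hypothetical and only for illustration; scikit-learn ships a similar helper, sklearn.utils.check_random_state, which is probably the better choice in real code.

```python
import numpy as np


def make_rng(random_state=None):
    """Hypothetical helper: turn a seed or a RandomState into a RandomState."""
    if isinstance(random_state, np.random.RandomState):
        return random_state
    # None seeds from OS entropy; an int gives a reproducible stream.
    return np.random.RandomState(random_state)


def shuffled_indices(n, random_state=None):
    """Hypothetical function that needs randomness but never uses the global RNG."""
    rng = make_rng(random_state)
    return rng.permutation(n)


# Two calls with the same seed give the same permutation.
a = shuffled_indices(10, random_state=5)
b = shuffled_indices(10, random_state=5)
same_seed_matches = bool(np.array_equal(a, b))
print(same_seed_matches)
```

In a unit test you would pass a fixed integer seed, as above, and leave random_state=None in production code if you want a different stream on each run.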