Importance
PermutationImportance(scoring=None, n_repeats=5, n_jobs=None, random_state=None, sample_weight=None, max_samples=1.0)
Wrapper around sklearn.inspection.permutation_importance.
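To make the wrapped computation concrete, here is a minimal pure-Python sketch of the idea behind permutation importance: score a fitted model once, shuffle one column at a time, and record the average drop in score. The names toy_predict, accuracy and permutation_importance_sketch are hypothetical and exist only for illustration; the scikit-learn function additionally handles scorers, parallelism and subsampling.

```python
import random


def toy_predict(rows):
    # Hypothetical "fitted model": predicts the label from feature 0 only.
    return [row[0] for row in rows]


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the targets.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance_sketch(predict, X, y, n_repeats=5, seed=0):
    rng = random.Random(seed)
    baseline = accuracy(y, predict(X))
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle a single column, leaving all other columns untouched.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(y, predict(X_perm)))
        # Importance is the mean score drop over the repeats.
        importances.append(sum(drops) / n_repeats)
    return importances


# Feature 0 determines the label; feature 1 is an unrelated counter.
X = [[i % 2, i] for i in range(20)]
y = [row[0] for row in X]
importances = permutation_importance_sketch(toy_predict, X, y)
```

Because the toy model ignores feature 1, shuffling that column never changes the score, so its importance is zero, while shuffling feature 0 degrades accuracy.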
Parameters:
- scoring (str, callable, list, tuple, or dict, default: None) – Scorer to use. If None, the estimator's default scorer is used.
  If scoring represents a single score, one can use:
  - a single string;
  - a callable that returns a single value.
  If scoring represents multiple scores, one can use:
  - a list or tuple of unique strings;
  - a callable returning a dictionary where the keys are the metric names and the values are the metric scores;
  - a dictionary with metric names as keys and callables as values.
  Passing multiple scores to scoring is more efficient than calling permutation_importance for each of the scores, as it reuses predictions to avoid redundant computation.
- n_repeats (int, default: 5) – Number of times to permute a feature.
- n_jobs (int or None, default: None) – Number of jobs to run in parallel. The permutation score is computed for each column, and the computation is parallelized over the columns. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
- random_state (int or RandomState instance, default: None) – Pseudo-random number generator to control the permutations of each feature. Pass an int to get reproducible results across function calls.
- sample_weight (array-like of shape (n_samples,), default: None) – Sample weights used in scoring.
- max_samples (int or float, default: 1.0) – The number of samples to draw from X to compute feature importance in each repeat (without replacement).
  - If int, then draw max_samples samples.
  - If float, then draw max_samples * X.shape[0] samples.
  - If max_samples is equal to 1.0 or X.shape[0], all samples will be used.
  Drawing fewer samples may give less accurate importance estimates, but it keeps the method tractable when evaluating feature importance on large datasets. In combination with n_repeats, this makes it possible to control the trade-off between computational speed and statistical accuracy.
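The max_samples rules and the cost trade-off with n_repeats can be sketched with a little arithmetic. The helpers resolve_max_samples and scoring_passes below are hypothetical and not part of felimination or scikit-learn; truncating the float case with int() is an assumption about how fractional row counts are handled.

```python
import numbers


def resolve_max_samples(max_samples, n_samples):
    """Return how many rows are drawn per repeat (hypothetical helper)."""
    if isinstance(max_samples, numbers.Integral):
        # Int: an absolute number of rows, which cannot exceed the data size.
        if max_samples > n_samples:
            raise ValueError("max_samples must be <= n_samples")
        return int(max_samples)
    # Float: a fraction of the rows; truncation here is an assumption.
    return int(max_samples * n_samples)


def scoring_passes(n_features, n_repeats):
    # One scoring pass per (feature, repeat) pair, plus one baseline pass.
    return n_features * n_repeats + 1


rows_full = resolve_max_samples(1.0, 1000)      # all rows
rows_quarter = resolve_max_samples(0.25, 1000)  # a quarter of the rows
rows_fixed = resolve_max_samples(200, 1000)     # an absolute count
passes = scoring_passes(n_features=30, n_repeats=5)
```

Under these assumptions, lowering max_samples shrinks the cost of every scoring pass, while lowering n_repeats reduces the number of passes; both make the estimate noisier.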
Source code in felimination/importance.py
__call__(estimator, X, y)
Computes the permutation importance.
Parameters:
- estimator (object) – An estimator that has already been fitted and is compatible with scorer.
- X (ndarray or DataFrame of shape (n_samples, n_features)) – Data on which permutation importance will be computed.
- y (array-like of shape (n_samples,) or (n_samples, n_classes), or None) – Targets for supervised or None for unsupervised.
Returns:
- importances_mean (ndarray of shape (n_features,)) – Mean of feature importance over n_repeats.