Feature Selectors
DiffVariableSelector
class bcselector.variable_selection.DiffVariableSelector
Bases: bcselector.variable_selection._VariableSelector
Ranks all features in the dataset using the difference cost filter method.
Methods Summary
fit(data, target_variable, costs, lamb[, …]) – Ranks all features in the dataset using the difference cost filter method.
score(model, scoring_function, **kwargs)
plot_scores([budget, …])
Methods Documentation
fit(data, target_variable, costs, lamb, j_criterion_func='cife', number_of_features=None, budget=None, stop_budget=False, **kwargs)
Ranks all features in the dataset using the difference cost filter method.
Parameters
data (np.ndarray or pd.) – Matrix or data frame containing the features to rank.
target_variable (np.ndarray or pd.core.series.Series) – Vector or series holding the target variable. The number of rows in data must equal the length of target_variable.
costs (list or dict) – Costs of the features, one per column of data. When data is an np.ndarray, provide costs as a list of floats or integers. When data is a pd.DataFrame, provide costs either as a list of floats or integers or as a dict {'col_1': cost_1, …}.
lamb (int or float) – Cost scaling parameter. The higher lamb is, the greater the impact of cost on the selection.
j_criterion_func (str) – Method used to approximate the conditional mutual information. Must be one of ['mim', 'mifs', 'mrmr', 'jmi', 'cife']. All methods can be listed by running:
>>> from bcselector.information_theory import j_criterion_approximations
>>> j_criterion_approximations.__all__
number_of_features (int) – Optional. Maximum number of features to select.
budget (int or float) – Optional. Maximum total cost of the selected features.
stop_budget (bool) – Optional argument, TODO - must delete this argument
**kwargs – Arguments passed to difference_find_best_feature() function and then to j_criterion_func.
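The interplay of lamb, costs, number_of_features and budget can be illustrated with a small standalone sketch. It is not bcselector's implementation: it assumes that the "difference" method scores each candidate as relevance minus lamb times cost, it uses plain mutual information as the relevance term (the idea behind 'mim'), and it assumes selection stops as soon as the best candidate no longer fits the budget. The helpers mutual_information and rank_by_difference are hypothetical names, and the data is a list of rows rather than an np.ndarray to keep the sketch dependency-free.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (xv, yv), c in count_xy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[xv] * count_y[yv]))
    return mi


def rank_by_difference(rows, target, costs, lamb,
                       number_of_features=None, budget=None):
    """Greedy ranking by relevance - lamb * cost (hypothetical helper)."""
    columns = list(zip(*rows))
    if len(costs) != len(columns):
        raise ValueError("costs must have one entry per column of data")
    limit = len(columns)
    if number_of_features is not None:
        limit = min(limit, number_of_features)
    relevance = [mutual_information(col, target) for col in columns]
    remaining = list(range(len(columns)))
    selected, total_cost = [], 0.0
    while remaining and len(selected) < limit:
        best = max(remaining, key=lambda j: relevance[j] - lamb * costs[j])
        if budget is not None and total_cost + costs[best] > budget:
            break  # assumption: stop once the best candidate exceeds the budget
        selected.append(best)
        total_cost += costs[best]
        remaining.remove(best)
    return selected, total_cost


# Toy data: column 0 copies the target, column 1 is a noisy copy,
# column 2 is independent of the target.
y = [0, 0, 0, 0, 1, 1, 1, 1]
col0 = list(y)
col1 = [0, 0, 0, 1, 1, 1, 1, 0]
col2 = [0, 1, 0, 1, 0, 1, 0, 1]
X = [list(row) for row in zip(col0, col1, col2)]
costs = [5.0, 1.0, 1.0]

order_no_cost, _ = rank_by_difference(X, y, costs, lamb=0)
order_with_cost, _ = rank_by_difference(X, y, costs, lamb=1)
within_budget, spent = rank_by_difference(X, y, costs, lamb=0, budget=5)
```

With lamb=0 the expensive but perfectly informative column 0 is ranked first. With lamb=1 its cost of 5 outweighs its relevance, so the cheap columns come first. With budget=5 and lamb=0, only column 0 is selected, because adding any further feature would exceed the budget.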
Examples
>>> from bcselector.variable_selection import DiffVariableSelector
>>> dvs = DiffVariableSelector()
>>> dvs.fit(X, y, costs, lamb=1, j_criterion_func='mim')
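The example above takes X, y and costs as given. Since costs may be supplied either as a list or as a dict keyed by column name, it can help to check that the costs line up with the columns before calling fit. The sketch below shows one way to do that in plain Python. The helper costs_as_list and the column names are hypothetical and are not part of bcselector, which may validate its inputs differently.

```python
def costs_as_list(columns, costs):
    """Return costs in column order, accepting a list or a {column: cost} dict."""
    if isinstance(costs, dict):
        missing = [name for name in columns if name not in costs]
        if missing:
            raise KeyError(f"no cost given for columns: {missing}")
        return [costs[name] for name in columns]
    if len(costs) != len(columns):
        raise ValueError("costs must have one entry per column of data")
    return list(costs)


# Hypothetical column names; the dict order does not need to match the columns.
columns = ["age", "blood_test", "mri"]
cost_dict = {"mri": 50.0, "age": 1.0, "blood_test": 10.0}

aligned = costs_as_list(columns, cost_dict)    # costs reordered to match columns
unchanged = costs_as_list(columns, [1, 10, 50])  # a list is only length-checked
```

A list is taken to be in column order already, so only its length can be checked. A dict can be checked by name, which makes mismatches easier to catch.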
score(model, scoring_function, **kwargs)
plot_scores(budget=None, compare_no_cost_method=False, savefig=False, annotate=False, annotate_box=False, figsize=(12, 8), bbox_pos=(0.72, 0.6), plot_title=None, x_axis_title=None, y_axis_title=None, **kwargs)
get_cost_results()
get_no_cost_results()