Code documentation¶
- class rulexai.explainer.RuleExplainer(model, X: DataFrame, y: Union[DataFrame, Series], type: str = 'classification')¶
- Parameters
model (Model = Union[RuleClassifier, RuleRegressor, SurvivalRules, CN2UnorderedClassifier, CN2SDUnorderedClassifier, DecisionTreeClassifier, DecisionTreeRegressor, SurvivalTree, List[str]]) –
- Model to be analyzed. RuleXai supports the following Rule models:
RuleKit(https://adaa-polsl.github.io/RuleKit-python/): RuleClassifier, RuleRegressor, SurvivalRules
Orange (https://orangedatamining.com/): CN2UnorderedClassifier, CN2SDUnorderedClassifier
- It can also extract rules from decision trees:
scikit-learn (https://scikit-learn.org/stable/): DecisionTreeClassifier, DecisionTreeRegressor
scikit-survival (https://scikit-survival.readthedocs.io/en/stable/): SurvivalTree
- Or you can provide a list of rules as:
- classification:
IF attribute1 = (-inf, value) AND … AND attribute2 = <value1, value2) THEN label_attribute = {class_name}
- regression:
IF attribute1 = (-inf, value) AND … AND attribute2 = <value1, value2) THEN target_attribute = {value}
- survival:
IF attribute1 = (-inf, value) AND … AND attribute2 = <value1, value2) THEN survival_status_attribute = {survival_status}
X (pd.DataFrame) – The training dataset used to train the provided model
y (Union[pd.DataFrame, pd.Series]) – The target values (class labels, real numbers, survival statuses) used to train the provided model
type (str) –
- The type of problem that the provided model solves. You can choose between:
“classification”
“regression”
“survival”
default: “classification”
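For instance, a list of classification rules in the format described above could look as follows (attribute names, cut points and class names are purely illustrative):

```python
# A list of rules following the classification rule format.
# Attribute names, cut points and class names are made up for illustration.
rules = [
    "IF age = (-inf, 30) THEN class = {young}",
    "IF age = <30, 60) AND bmi = <18.5, 25) THEN class = {adult_normal}",
    "IF age = <60, inf) THEN class = {senior}",
]
```

Such a list can be passed directly as the model argument.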
- condition_importances_¶
Computed condition importances
- Type
pd.DataFrame
- feature_importances_¶
Feature importances computed based on condition importances
- Type
pd.DataFrame
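The relationship between the two attributes above can be sketched with plain pandas: feature importances are obtained by aggregating condition importances per attribute. The exact aggregation RuleXai uses is not shown here; summing, as below, is only one plausible choice, and all names and values are made up:

```python
import pandas as pd

# Hypothetical condition importances, shaped like condition_importances_
condition_importances = pd.DataFrame({
    "condition": ["age < 30", "age >= 60", "bmi in <18.5, 25)"],
    "attribute": ["age", "age", "bmi"],
    "importance": [0.40, 0.25, 0.35],
})

# Aggregate per attribute (summation is an assumption, for illustration only)
feature_importances = (
    condition_importances.groupby("attribute")["importance"]
    .sum()
    .sort_values(ascending=False)
)
print(feature_importances)
```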
- explain(measure: str = 'C2', basic_conditions: bool = False)¶
Computes condition importances. The importances of the conditions are computed based on:
Marek Sikora: Redefinition of Decision Rules Based on the Importance of Elementary Conditions Evaluation. Fundam. Informaticae 123(2): 171-197 (2013)
https://dblp.org/rec/journals/fuin/Sikora13.html
- Parameters
measure (str) – Specifies the measure used to evaluate the quality of the rules. Possible measures for classification and regression problems are: C2, Lift, Correlation. Default: C2. It is not possible to select a measure for the survival problem; the LogRank test is used by default
basic_conditions (bool) – Specifies whether to evaluate the conditions contained in the input rules, or to break the conditions in the rules into base conditions so that individual conditions do not overlap
- Returns
self – Fitted explainer with computed condition importances
- Return type
RuleExplainer
- fit_transform(X: DataFrame, selector=None, y=None, POS=None) DataFrame ¶
Creates a dataset in which the examples, instead of being described by the original attributes, are described by the specified conditions: the result is a set of binary attributes, each indicating whether a given example meets a given condition. This can be viewed as a kind of dummification. It also lets you discretize data and get rid of missing values, so it can serve as a preprocessing step for other algorithms.
- Parameters
X (pd.DataFrame) – The input samples from which to create the binary dataset. Should have the same columns, in the same order, as the X specified when creating the Explainer
selector (string/float) – Specifies how to select the conditions from the rules that will be included as attributes in the transformed set. If None, all conditions are included. If a number between 0 and 1, that fraction of the most important conditions is selected based on the condition importance ranking. If “reduct”, a reduct of the condition set is selected. The fraction-based option is generally preferable.
y (Union[pd.DataFrame, pd.Series]) – Only if selector = “reduct”. The target values for the input samples, used to determine the reduct
POS (float) – Only if selector = “reduct”. Target reduct POS
- Returns
X_transformed – Transformed dataset
- Return type
pd.DataFrame
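The shape of the transformed output can be pictured with plain pandas. This is only a conceptual sketch, not the rulexai implementation, and the conditions below are made up:

```python
import pandas as pd

X = pd.DataFrame({"age": [25, 42, 67], "bmi": [21.0, 27.5, 24.0]})

# Hypothetical conditions taken from rules, as (column name, predicate) pairs
conditions = [
    ("age = (-inf, 30)", lambda df: df["age"] < 30),
    ("age = <30, inf)",  lambda df: df["age"] >= 30),
    ("bmi = <18.5, 25)", lambda df: (df["bmi"] >= 18.5) & (df["bmi"] < 25.0)),
]

# Each example becomes a row of binary flags: does it satisfy each condition?
X_transformed = pd.DataFrame({name: pred(X).astype(int) for name, pred in conditions})
print(X_transformed)
```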
- get_rules()¶
Returns the rules from the model
- Returns
rules – Rules from the model
- Return type
List[str]
- get_rules_covering_example(x: DataFrame, y: Union[DataFrame, Series]) List[str] ¶
Returns the rules that cover the given example
- Parameters
x (pd.DataFrame) – The input sample.
y (Union[pd.DataFrame, pd.Series]) – The target values for input sample.
- Returns
rules – Rules that cover the given example
- Return type
List[str]
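The notion of a rule covering an example can be sketched as follows; the premise representation and all names here are hypothetical, not the rulexai internals:

```python
import math

def covers(premise, example):
    """premise: {attribute: (low, high)}, meaning low <= value < high."""
    return all(lo <= example[attr] < hi for attr, (lo, hi) in premise.items())

rules = [
    {"age": (-math.inf, 30.0)},                      # IF age = (-inf, 30) THEN ...
    {"age": (30.0, math.inf), "bmi": (18.5, 25.0)},  # IF age = <30, inf) AND bmi = <18.5, 25) THEN ...
]
example = {"age": 42.0, "bmi": 21.0}

covering = [i for i, premise in enumerate(rules) if covers(premise, example)]
print(covering)  # only the second rule covers the example
```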
- get_rules_with_basic_conditions()¶
Returns the rules from the model with conditions broken down into base conditions so that individual conditions do not overlap
- Returns
rules – Rules from the model containing the base conditions
- Return type
List[str]
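The idea of breaking overlapping conditions into non-overlapping base conditions can be sketched for a single numeric attribute (a simplified illustration, not the rulexai algorithm):

```python
import math

def base_intervals(conditions):
    """conditions: half-open intervals <low, high) on one attribute.
    Returns the non-overlapping intervals induced by all cut points."""
    cuts = sorted({c for lo, hi in conditions for c in (lo, hi)})
    return list(zip(cuts, cuts[1:]))

# Two overlapping conditions on "age": age < 60 and age >= 30
overlapping = [(-math.inf, 60.0), (30.0, math.inf)]
print(base_intervals(overlapping))
# three disjoint intervals: (-inf, 30), <30, 60), <60, inf)
```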
- local_explainability(x: DataFrame, y: Union[DataFrame, Series], plot: bool = False)¶
Displays the local explanation of the example: the rules that cover the given example and the importances of the conditions contained in these rules
- Parameters
x (pd.DataFrame) – The input sample.
y (Union[pd.DataFrame, pd.Series]) – The target values for input sample.
plot (bool) – If True the importance of the conditions will also be shown in the chart. Default: False
- plot_importances(importances: DataFrame)¶
Plots the given importances
- Parameters
importances (pd.DataFrame) – Feature/Condition importances to plot.
- transform(X: DataFrame) DataFrame ¶
Creates a dataset based on given dataset in which the examples, instead of being described by the original attributes, will be described with the specified conditions - it will be a set with binary attributes determining whether a given example meets a given condition. It can be considered as kind of dummification. Thanks to this function you can discretize data and get rid of missing values. It can be used as prestep for others algorithms.
- Parameters
X (pd.DataFrame) – The input samples from which to create the binary dataset. Should have the same columns, in the same order, as the X given to fit_transform
- Returns
X_transformed – Transformed dataset
- Return type
pd.DataFrame
- class rulexai.explainer.Explainer(X: DataFrame, model_predictions: Union[DataFrame, Series], type: str = 'classification')¶
- Parameters
X (pd.DataFrame) – The training dataset used to train the provided model
model_predictions (Union[pd.DataFrame, pd.Series]) – The predictions of the provided model on the training dataset
type (str) –
- The type of problem that the provided model solves. You can choose between:
“classification”
“regression”
default: “classification”
- condition_importances_¶
Computed condition importances on the given dataset
- Type
pd.DataFrame
- feature_importances_¶
Feature importances computed based on condition importances
- Type
pd.DataFrame
- explain(measure: str = 'C2', basic_conditions: bool = False, X_org=None)¶
Computes condition importances. The importances of the conditions are computed based on:
Marek Sikora: Redefinition of Decision Rules Based on the Importance of Elementary Conditions Evaluation. Fundam. Informaticae 123(2): 171-197 (2013)
https://dblp.org/rec/journals/fuin/Sikora13.html
- Parameters
measure (str) – Specifies the measure used to evaluate the quality of the rules. Possible measures for classification and regression problems are: C2, Lift, Correlation. Default: C2. It is not possible to select a measure for the survival problem; the LogRank test is used by default
basic_conditions (bool) – Specifies whether to evaluate the conditions contained in the input rules, or to break the conditions in the rules into base conditions so that individual conditions do not overlap
X_org – The dataset on which the rule-based model should be built. It can be the set the black-box model was trained on, or that set before preprocessing (imputation of missing values, dummification, scaling), because the rule model can handle such a set
- Returns
self – Fitted explainer with computed condition importances
- Return type
Explainer
- fit_transform(X: DataFrame, selector=None, y=None, POS=None) DataFrame ¶
Creates a dataset in which the examples, instead of being described by the original attributes, are described by the specified conditions: the result is a set of binary attributes, each indicating whether a given example meets a given condition. This can be viewed as a kind of dummification. It also lets you discretize data and get rid of missing values, so it can serve as a preprocessing step for other algorithms.
- Parameters
X (pd.DataFrame) – The input samples from which to create the binary dataset. Should have the same columns, in the same order, as the X specified when creating the Explainer
selector (string/float) – Specifies how to select the conditions from the rules that will be included as attributes in the transformed set. If None, all conditions are included. If a number between 0 and 1, that fraction of the most important conditions is selected based on the condition importance ranking. If “reduct”, a reduct of the condition set is selected. The fraction-based option is generally preferable.
y (Union[pd.DataFrame, pd.Series]) – Only if selector = “reduct”. The target values for the input samples, used to determine the reduct
POS (float) – Only if selector = “reduct”. Target reduct POS
- Returns
X_transformed – Transformed dataset
- Return type
pd.DataFrame
- get_rules()¶
Returns the rules from the model
- Returns
rules – Rules from the model
- Return type
List[str]
- get_rules_covering_example(x: DataFrame, y: Union[DataFrame, Series]) List[str] ¶
Returns the rules that cover the given example
- Parameters
x (pd.DataFrame) – The input sample.
y (Union[pd.DataFrame, pd.Series]) – The target values for input sample.
- Returns
rules – Rules that cover the given example
- Return type
List[str]
- get_rules_with_basic_conditions()¶
Returns the rules from the model with conditions broken down into base conditions so that individual conditions do not overlap
- Returns
rules – Rules from the model containing the base conditions
- Return type
List[str]
- local_explainability(x: DataFrame, y: Union[DataFrame, Series], plot: bool = False)¶
Displays the local explanation of the example: the rules that cover the given example and the importances of the conditions contained in these rules
- Parameters
x (pd.DataFrame) – The input sample.
y (Union[pd.DataFrame, pd.Series]) – The target values for input sample.
plot (bool) – If True the importance of the conditions will also be shown in the chart. Default: False
- plot_importances(importances: DataFrame)¶
Plots the given importances
- Parameters
importances (pd.DataFrame) – Feature/Condition importances to plot.
- transform(X: DataFrame) DataFrame ¶
Creates a dataset in which the examples, instead of being described by the original attributes, are described by the specified conditions: the result is a set of binary attributes, each indicating whether a given example meets a given condition. This can be viewed as a kind of dummification. It also lets you discretize data and get rid of missing values, so it can serve as a preprocessing step for other algorithms.
- Parameters
X (pd.DataFrame) – The input samples from which to create the binary dataset. Should have the same columns, in the same order, as the X given to fit_transform
- Returns
X_transformed – Transformed dataset
- Return type
pd.DataFrame