Code documentation

class rulexai.explainer.RuleExplainer(model, X: DataFrame, y: Union[DataFrame, Series], type: str = 'classification')
Parameters
  • model (Model = Union[RuleClassifier, RuleRegressor, SurvivalRules, CN2UnorderedClassifier, CN2SDUnorderedClassifier, DecisionTreeClassifier, DecisionTreeRegressor, SurvivalTree, List[str]]) –

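    Model to be analyzed. RuleXai supports the following rule models: RuleClassifier, RuleRegressor, SurvivalRules, CN2UnorderedClassifier and CN2SDUnorderedClassifier.
    It can also extract rules from decision trees: DecisionTreeClassifier, DecisionTreeRegressor and SurvivalTree.
    Alternatively, you can provide a list of rules in the following form: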
    • classification:

      IF attribute1 = (-inf, value) AND … AND attribute2 = <value1, value2) THEN label_attribute = {class_name}

    • regression:

      IF attribute1 = (-inf, value) AND … AND attribute2 = <value1, value2) THEN target_attribute = {value}

    • survival:

      IF attribute1 = (-inf, value) AND … AND attribute2 = <value1, value2) THEN survival_status_attribute = {survival_status}

  • X (pd.DataFrame) – The training dataset used to train the provided model

  • y (Union[pd.DataFrame, pd.Series]) – The target values (class labels, real numbers or survival status) used to train the provided model

  • type (str) –

    The type of problem that the provided model solves. You can choose between:
    • “classification”

    • “regression”

    • “survival”

    default: “classification”
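The rule format listed for the model parameter can be illustrated with a small parser. This is a minimal sketch and not part of RuleXai: parse_rule and both regular expressions are hypothetical helpers. It assumes that "(" and ")" mark open bounds, that "<" and ">" mark closed bounds, and that every condition is a numeric interval.

```python
import re
from typing import List, Tuple

# Hypothetical helper patterns (not part of RuleXai) for rules of the form
# "IF a = (-inf, v) AND b = <v1, v2) THEN target = {value}".
RULE_PATTERN = re.compile(
    r"^IF\s+(?P<premise>.+?)\s+THEN\s+(?P<target>\w+)\s*=\s*\{(?P<value>[^}]*)\}$"
)
CONDITION_PATTERN = re.compile(
    r"^(?P<attr>\w+)\s*=\s*(?P<left>[(<])\s*(?P<low>[^,]+),\s*"
    r"(?P<high>[^)>]+)(?P<right>[)>])$"
)

# (attribute, left bracket, lower bound, upper bound, right bracket)
Condition = Tuple[str, str, float, float, str]


def parse_rule(rule: str) -> Tuple[List[Condition], str, str]:
    """Split a rule string into its interval conditions and its conclusion."""
    match = RULE_PATTERN.match(rule.strip())
    if match is None:
        raise ValueError(f"Not a rule of the expected form: {rule!r}")
    conditions: List[Condition] = []
    for part in match["premise"].split(" AND "):
        cond = CONDITION_PATTERN.match(part.strip())
        if cond is None:
            raise ValueError(f"Not an interval condition: {part!r}")
        # float() accepts "-inf" and "inf", so unbounded intervals work too.
        conditions.append(
            (cond["attr"], cond["left"], float(cond["low"]),
             float(cond["high"]), cond["right"])
        )
    return conditions, match["target"], match["value"]


conditions, target, value = parse_rule(
    "IF age = (-inf, 30) AND income = <1000, 2500) THEN label = {yes}"
)
```

Rule lists passed to RuleExplainer are plain strings, so a check of this kind can catch malformed rules before they reach the explainer.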

condition_importances_

Computed condition importances

Type

pd.DataFrame

feature_importances_

Feature importances computed based on the condition importances

Type

pd.DataFrame
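How condition importances could be rolled up into feature importances can be sketched as follows. The aggregation used here (summing the importances of all conditions that test the same attribute) is an assumption made for illustration; the library may aggregate differently. aggregate_by_feature is a hypothetical helper and plain dictionaries stand in for the pd.DataFrame attributes.

```python
from collections import defaultdict
from typing import Dict


def aggregate_by_feature(condition_importances: Dict[str, float]) -> Dict[str, float]:
    """Sum condition importances per attribute (assumed aggregation)."""
    totals: Dict[str, float] = defaultdict(float)
    for condition, importance in condition_importances.items():
        # The attribute name is the part of the condition before " = ".
        attribute = condition.split(" = ", 1)[0].strip()
        totals[attribute] += importance
    # Most important feature first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


feature_importances = aggregate_by_feature({
    "age = (-inf, 30)": 0.4,
    "age = <30, 50)": 0.1,
    "income = <1000, inf)": 0.3,
})
```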

explain(measure: str = 'C2', basic_conditions: bool = False)

Compute condition importances. The importance of a condition is computed based on:

Marek Sikora: Redefinition of Decision Rules Based on the Importance of Elementary Conditions Evaluation. Fundam. Informaticae 123(2): 171-197 (2013)

https://dblp.org/rec/journals/fuin/Sikora13.html

Parameters
  • measure (str) – Specifies the measure used to evaluate the quality of the rules. Possible measures for classification and regression problems are: C2, Lift, Correlation. Default: C2. A measure cannot be selected for the survival problem; the LogRank test is used there

  • basic_conditions (bool) – Specifies whether to evaluate the conditions contained in the input rules, or to break the conditions in the rules into base conditions so that individual conditions do not overlap

Returns

self – Fitted explainer with calculated condition importances

Return type

Explainer
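The three measures named for the measure parameter can be written down from their usual definitions in the rule-induction literature. This is a sketch under that assumption; whether RuleXai computes them in exactly this form is not stated above. Here p and n are the positive and negative examples covered by a rule, and P and N are all positive and negative examples in the dataset; the formulas require p + n > 0.

```python
import math


def c2(p: float, n: float, P: float, N: float) -> float:
    """C2: Coleman's measure scaled by a coverage term."""
    coleman = ((P + N) * p / (p + n) - P) / N
    return coleman * (1 + p / P) / 2


def lift(p: float, n: float, P: float, N: float) -> float:
    """Lift: rule precision relative to the prior of the positive class."""
    return (p / (p + n)) / (P / (P + N))


def correlation(p: float, n: float, P: float, N: float) -> float:
    """Correlation between rule coverage and class membership."""
    return (p * N - P * n) / math.sqrt(P * N * (p + n) * (P - p + N - n))


# A rule covering 40 of 50 positives and 10 of 50 negatives.
scores = {
    "C2": c2(40, 10, 50, 50),
    "Lift": lift(40, 10, 50, 50),
    "Correlation": correlation(40, 10, 50, 50),
}
```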

fit_transform(X: DataFrame, selector=None, y=None, POS=None) DataFrame

Creates a dataset from the given one in which the examples are described by the specified conditions instead of the original attributes. The result is a set of binary attributes, each indicating whether a given example satisfies a given condition. This can be considered a kind of dummification. With this function you can discretize data and get rid of missing values, and it can be used as a preprocessing step for other algorithms.

Parameters
  • X (pd.DataFrame) – The input samples from which to create the binary dataset. Should have the same columns, in the same order, as the X specified when creating the Explainer

  • selector (string/float) – Specifies how to select the conditions from the rules that will be included as attributes in the transformed set. If None, all conditions are included. If a number between 0 and 1, that fraction of the most important conditions is selected, based on the condition importance ranking. If “reduct”, a reduct of the condition set is selected. The option with the fraction of most important conditions is preferred.

  • y (Union[pd.DataFrame, pd.Series]) – Only if selector = “reduct”. The target values for the input samples, used to determine the reduct

  • POS (float) – Only if selector = “reduct”. Target reduct POS

Returns

X_transformed – Transformed dataset

Return type

pd.DataFrame
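The numeric selector option can be pictured as keeping the top fraction of a ranking. The sketch below is illustrative only: select_top_conditions is a hypothetical helper, a plain dictionary stands in for condition_importances_, and rounding the number of kept conditions up is an assumption.

```python
import math
from typing import Dict, List


def select_top_conditions(condition_importances: Dict[str, float],
                          fraction: float) -> List[str]:
    """Keep the given fraction of conditions, most important first."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    ranked = sorted(condition_importances,
                    key=condition_importances.get, reverse=True)
    # Assumption: round up, and always keep at least one condition.
    keep = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:keep]


selected = select_top_conditions(
    {"age = (-inf, 30)": 0.4, "income = <1000, 2500)": 0.3,
     "age = <30, 50)": 0.1, "height = <150, 180)": 0.05},
    fraction=0.5,
)
```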

get_rules()

Return rules from model

Returns

rules – Rules from model

Return type

List[str]

get_rules_covering_example(x: DataFrame, y: Union[DataFrame, Series]) List[str]

Return rules that cover the given example

Parameters
  • x (pd.DataFrame) – The input sample.

  • y (Union[pd.DataFrame, pd.Series]) – The target values for input sample.

Returns

rules – Rules that cover the given example

Return type

List[str]
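What it means for a rule to cover an example can be sketched in a few lines. This is a simplified illustration, not the library's implementation: conditions are reduced to half-open numeric intervals, the example is a plain dictionary instead of a pd.DataFrame row, the target value y is ignored, and covers and rules_covering are hypothetical names.

```python
from typing import Dict, List, Tuple

# (attribute, lower bound inclusive, upper bound exclusive)
Condition = Tuple[str, float, float]


def covers(conditions: List[Condition], example: Dict[str, float]) -> bool:
    """A rule covers an example when every one of its conditions holds."""
    return all(low <= example[attr] < high for attr, low, high in conditions)


def rules_covering(rules: Dict[str, List[Condition]],
                   example: Dict[str, float]) -> List[str]:
    """Names of all rules whose premise is satisfied by the example."""
    return [name for name, conditions in rules.items()
            if covers(conditions, example)]


rules = {
    "r1": [("age", float("-inf"), 30.0), ("income", 1000.0, 2500.0)],
    "r2": [("age", 20.0, 50.0)],
    "r3": [("income", 2500.0, float("inf"))],
}
matching = rules_covering(rules, {"age": 25.0, "income": 1500.0})
```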

get_rules_with_basic_conditions()

Return rules from model with conditions broken down into base conditions so that individual conditions do not overlap

Returns

rules – Rules from the model containing the base conditions

Return type

List[str]
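Breaking overlapping conditions into base conditions can be illustrated for numeric intervals on a single attribute. The sketch assumes the split is made at every interval boundary; the library's actual procedure may differ. Both function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def split_into_base_intervals(intervals: List[Interval]) -> List[Interval]:
    """Cut the attribute range at every bound so that no two pieces overlap."""
    cuts = sorted({bound for interval in intervals for bound in interval})
    return list(zip(cuts, cuts[1:]))


def decompose(interval: Interval, base: List[Interval]) -> List[Interval]:
    """Express one original interval as the base intervals it contains."""
    low, high = interval
    return [(a, b) for a, b in base if low <= a and b <= high]


inf = float("inf")
# Two overlapping conditions on the same attribute: (-inf, 30) and <20, 50).
base = split_into_base_intervals([(-inf, 30.0), (20.0, 50.0)])
first = decompose((-inf, 30.0), base)
second = decompose((20.0, 50.0), base)
```

After the split, the two original conditions share the base interval from 20 to 30, so its importance can be assessed once instead of being hidden inside two overlapping conditions.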

local_explainability(x: DataFrame, y: Union[DataFrame, Series], plot: bool = False)

Displays information about the local explanation of the example: the rules that cover the given example and the importance of the conditions contained in these rules

Parameters
  • x (pd.DataFrame) – The input sample.

  • y (Union[pd.DataFrame, pd.Series]) – The target values for input sample.

  • plot (bool) – If True the importance of the conditions will also be shown in the chart. Default: False

plot_importances(importances: DataFrame)

Plot importances

Parameters

importances (pd.DataFrame) – Feature/Condition importances to plot.

transform(X: DataFrame) DataFrame

Creates a dataset from the given one in which the examples are described by the specified conditions instead of the original attributes. The result is a set of binary attributes, each indicating whether a given example satisfies a given condition. This can be considered a kind of dummification. With this function you can discretize data and get rid of missing values, and it can be used as a preprocessing step for other algorithms.

Parameters

X (pd.DataFrame) – The input samples from which to create the binary dataset. Should have the same columns, in the same order, as the X given to fit_transform

Returns

X_transformed – Transformed dataset

Return type

pd.DataFrame
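The binary dataset described for transform can be pictured with plain Python structures. This is a minimal sketch, not the library's implementation: to_binary is a hypothetical helper, rows are dictionaries instead of a pd.DataFrame, conditions are half-open numeric intervals, and a missing value is assumed to mean that the condition is not satisfied.

```python
from typing import Dict, List, Optional, Tuple

# condition name -> (attribute, lower bound inclusive, upper bound exclusive)
Conditions = Dict[str, Tuple[str, float, float]]
Row = Dict[str, Optional[float]]


def to_binary(rows: List[Row], conditions: Conditions) -> List[Dict[str, int]]:
    """Describe every example by 0/1 flags, one flag per condition."""
    transformed = []
    for row in rows:
        flags = {}
        for name, (attr, low, high) in conditions.items():
            value = row.get(attr)
            # Assumption: a missing value never satisfies a condition, so the
            # output contains no missing values.
            flags[name] = int(value is not None and low <= value < high)
        transformed.append(flags)
    return transformed


conditions = {
    "age = (-inf, 30)": ("age", float("-inf"), 30.0),
    "income = <1000, 2500)": ("income", 1000.0, 2500.0),
}
binary = to_binary(
    [{"age": 25.0, "income": 1500.0}, {"age": 41.0, "income": None}],
    conditions,
)
```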

class rulexai.explainer.Explainer(X: DataFrame, model_predictions: Union[DataFrame, Series], type: str = 'classification')
Parameters
  • X (pd.DataFrame) – The training dataset used to train the provided model

  • model_predictions (Union[pd.DataFrame, pd.Series]) – The predictions of the provided model for the examples in X

  • type (str) –

    The type of problem that the provided model solves. You can choose between:
    • “classification”

    • “regression”

    default: “classification”

condition_importances_

Computed condition importances on the given dataset

Type

pd.DataFrame

feature_importances_

Feature importances computed based on the condition importances

Type

pd.DataFrame

explain(measure: str = 'C2', basic_conditions: bool = False, X_org=None)

Compute condition importances. The importance of a condition is computed based on:

Marek Sikora: Redefinition of Decision Rules Based on the Importance of Elementary Conditions Evaluation. Fundam. Informaticae 123(2): 171-197 (2013)

https://dblp.org/rec/journals/fuin/Sikora13.html

Parameters
  • measure (str) – Specifies the measure used to evaluate the quality of the rules. Possible measures for classification and regression problems are: C2, Lift, Correlation. Default: C2. A measure cannot be selected for the survival problem; the LogRank test is used there

  • basic_conditions (bool) – Specifies whether to evaluate the conditions contained in the input rules, or to break the conditions in the rules into base conditions so that individual conditions do not overlap

  • X_org – The dataset on which the rule-based model should be built. It can be the set on which the black-box model was trained, or that set before preprocessing (imputation of missing values, dummification, scaling), since the rule model can handle such data directly

Returns

self – Fitted explainer with calculated condition importances

Return type

Explainer
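The idea of scoring a condition by its effect on rule quality can be sketched with a leave-one-out comparison. This is a deliberately simplified illustration and not the index defined in the cited paper: it uses precision as the quality measure instead of C2, Lift or Correlation, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Condition = Tuple[str, float, float]  # (attribute, low inclusive, high exclusive)
Example = Tuple[Dict[str, float], str]  # (attribute values, class label)


def covered_counts(conditions: List[Condition], data: List[Example],
                   target: str) -> Tuple[int, int]:
    """Count covered examples of the target class (p) and of other classes (n)."""
    p = n = 0
    for values, label in data:
        if all(low <= values[attr] < high for attr, low, high in conditions):
            if label == target:
                p += 1
            else:
                n += 1
    return p, n


def precision(p: int, n: int) -> float:
    return p / (p + n) if p + n else 0.0


def leave_one_out_importances(conditions: List[Condition], data: List[Example],
                              target: str) -> Dict[Condition, float]:
    """Quality of the full rule minus quality of the rule without each condition."""
    full = precision(*covered_counts(conditions, data, target))
    importances = {}
    for i, condition in enumerate(conditions):
        reduced = conditions[:i] + conditions[i + 1:]
        importances[condition] = full - precision(
            *covered_counts(reduced, data, target))
    return importances


data = [
    ({"age": 25.0, "income": 1500.0}, "yes"),
    ({"age": 28.0, "income": 3000.0}, "no"),
    ({"age": 45.0, "income": 1200.0}, "yes"),
    ({"age": 50.0, "income": 4000.0}, "no"),
]
rule = [("age", float("-inf"), 30.0), ("income", 1000.0, 2500.0)]
importances = leave_one_out_importances(rule, data, target="yes")
```

In this toy dataset, dropping the income condition lets a "no" example into the rule's coverage, so that condition receives a positive importance, while dropping the age condition does not change precision.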

fit_transform(X: DataFrame, selector=None, y=None, POS=None) DataFrame

Creates a dataset from the given one in which the examples are described by the specified conditions instead of the original attributes. The result is a set of binary attributes, each indicating whether a given example satisfies a given condition. This can be considered a kind of dummification. With this function you can discretize data and get rid of missing values, and it can be used as a preprocessing step for other algorithms.

Parameters
  • X (pd.DataFrame) – The input samples from which to create the binary dataset. Should have the same columns, in the same order, as the X specified when creating the Explainer

  • selector (string/float) – Specifies how to select the conditions from the rules that will be included as attributes in the transformed set. If None, all conditions are included. If a number between 0 and 1, that fraction of the most important conditions is selected, based on the condition importance ranking. If “reduct”, a reduct of the condition set is selected. The option with the fraction of most important conditions is preferred.

  • y (Union[pd.DataFrame, pd.Series]) – Only if selector = “reduct”. The target values for the input samples, used to determine the reduct

  • POS (float) – Only if selector = “reduct”. Target reduct POS

Returns

X_transformed – Transformed dataset

Return type

pd.DataFrame
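For the “reduct” selector, the POS parameter is not defined further above. Assuming it refers to the positive region known from rough set theory, the quantity can be sketched as the fraction of examples whose binary condition vector determines the target value unambiguously. The helper name and data layout below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def positive_region_fraction(rows: List[Dict[str, int]], labels: Sequence[str],
                             attributes: Sequence[str]) -> float:
    """Share of examples that are consistent with respect to the attributes."""
    decisions = defaultdict(set)
    sizes = defaultdict(int)
    for row, label in zip(rows, labels):
        # Examples with identical values on the attributes are indiscernible.
        key = tuple(row[a] for a in attributes)
        decisions[key].add(label)
        sizes[key] += 1
    # A group belongs to the positive region if it has a single target value.
    consistent = sum(size for key, size in sizes.items()
                     if len(decisions[key]) == 1)
    return consistent / len(rows)


rows = [
    {"c1": 1, "c2": 0},
    {"c1": 1, "c2": 1},
    {"c1": 0, "c2": 0},
    {"c1": 0, "c2": 0},
]
labels = ["a", "b", "a", "a"]
pos_all = positive_region_fraction(rows, labels, ["c1", "c2"])
pos_c1 = positive_region_fraction(rows, labels, ["c1"])
```

Under this reading, a reduct would be a smaller set of conditions whose positive region fraction still reaches the requested target value.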

get_rules()

Return rules from model

Returns

rules – Rules from model

Return type

List[str]

get_rules_covering_example(x: DataFrame, y: Union[DataFrame, Series]) List[str]

Return rules that cover the given example

Parameters
  • x (pd.DataFrame) – The input sample.

  • y (Union[pd.DataFrame, pd.Series]) – The target values for input sample.

Returns

rules – Rules that cover the given example

Return type

List[str]

get_rules_with_basic_conditions()

Return rules from model with conditions broken down into base conditions so that individual conditions do not overlap

Returns

rules – Rules from the model containing the base conditions

Return type

List[str]

local_explainability(x: DataFrame, y: Union[DataFrame, Series], plot: bool = False)

Displays information about the local explanation of the example: the rules that cover the given example and the importance of the conditions contained in these rules

Parameters
  • x (pd.DataFrame) – The input sample.

  • y (Union[pd.DataFrame, pd.Series]) – The target values for input sample.

  • plot (bool) – If True the importance of the conditions will also be shown in the chart. Default: False

plot_importances(importances: DataFrame)

Plot importances

Parameters

importances (pd.DataFrame) – Feature/Condition importances to plot.

transform(X: DataFrame) DataFrame

Creates a dataset from the given one in which the examples are described by the specified conditions instead of the original attributes. The result is a set of binary attributes, each indicating whether a given example satisfies a given condition. This can be considered a kind of dummification. With this function you can discretize data and get rid of missing values, and it can be used as a preprocessing step for other algorithms.

Parameters

X (pd.DataFrame) – The input samples from which to create the binary dataset. Should have the same columns, in the same order, as the X given to fit_transform

Returns

X_transformed – Transformed dataset

Return type

pd.DataFrame