Documentation for AnalysisBox
This class inherits from the FeaturesSet class and supports preliminary statistical analysis of numerical features.
calculate_basic_stats(self, volume_feature='')
For each feature, calculate basic statistics and save them to a .csv file: number of missing values, mean, standard deviation, minimum, maximum, Mann-Whitney test p-values (binary classes), univariate ROC AUC (binary classes), Shapiro-Wilk test p-values, and Spearman's correlation with volume if the name of a volumetric feature is passed as volume_feature.
| Parameters: |
|
|---|
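As an illustration of the simplest scores listed above, the following sketch summarises a single feature column using only the Python standard library. It is not the AnalysisBox implementation: the helper name `basic_stats`, the treatment of `None` and NaN as missing, and the use of the sample standard deviation are all assumptions.

```python
import math
import statistics


def basic_stats(values):
    """Summarise one numerical feature; None and NaN count as missing (assumption)."""
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    return {
        "n_missing": len(values) - len(present),
        "mean": statistics.fmean(present),
        "std": statistics.stdev(present),  # sample std (n - 1); an assumption
        "min": min(present),
        "max": max(present),
    }


# Hypothetical feature column with one missing value.
stats = basic_stats([2.0, 4.0, float("nan"), 6.0])
print(stats)
```

The remaining scores (Mann-Whitney, ROC AUC, Spearman, Shapiro-Wilk) would each add one more key per feature to the same record before it is written out as a row of the .csv file.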
handle_constant(self)
Drop features whose values are constant.
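The idea can be sketched on a plain dictionary of columns. The helper `drop_constant_features` and the sample feature names are hypothetical; the actual method operates on the object's own feature table.

```python
def drop_constant_features(table):
    """Return a copy of `table` without columns that hold a single distinct value."""
    return {name: col for name, col in table.items() if len(set(col)) > 1}


# Hypothetical feature table: one column never varies.
features = {
    "energy": [1.2, 3.4, 2.2],
    "bin_width": [25, 25, 25],
    "entropy": [0.1, 0.1, 0.3],
}
kept = drop_constant_features(features)
print(list(kept))  # ['energy', 'entropy']
```

Constant features carry no information for the tests and correlations described in this section, and a zero-variance column makes a correlation coefficient undefined.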
handle_nan(self, axis=1, how='any', mode='delete')
Handle the missing values.
| Parameters: |
|
|---|
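The parameter descriptions are not available here, so the sketch below rests on an assumption: that `axis` and `how` follow the pandas `dropna` convention (axis=1 refers to columns, `how='any'` drops a column if any value is missing, `how='all'` only if every value is missing) and that `mode='delete'` removes the affected features. The helper names are hypothetical.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_features_with_missing(table, how="any"):
    """Delete columns containing missing values ('any') or only missing values ('all')."""
    check = any if how == "any" else all
    return {
        name: col for name, col in table.items()
        if not check(is_missing(v) for v in col)
    }


nan = float("nan")
table = {"a": [1.0, nan], "b": [1.0, 2.0], "c": [nan, nan]}
print(list(drop_features_with_missing(table, how="any")))  # ['b']
print(list(drop_features_with_missing(table, how="all")))  # ['a', 'b']
```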
normality_check(self, features_to_plot=[], p_thresh=0.05)
Perform the Shapiro-Wilk normality check for all features.
| Parameters: |
|
|---|
| Returns: |
|
|---|
plot_MW_p(self, features_to_plot=[], binary_classes_to_plot=[], p_threshold=0.05)
Plot two-sided Mann-Whitney U test p-values, corrected for multiple testing, comparing the distributions of feature values in two classes, in an interactive .html report.
| Parameters: |
|
|---|
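A minimal sketch of the underlying computation, under stated assumptions: the p-value uses the normal approximation with tie correction and no continuity correction (a statistics library may use an exact method for small samples), and the multiple-testing correction shown is Bonferroni, chosen only for illustration because the correction method is not named above. All function names and data are hypothetical.

```python
import math
from collections import Counter


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # U counts the pairs where the value from x exceeds the one from y (ties count 0.5).
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    ties = sum(t ** 3 - t for t in Counter(list(x) + list(y)).values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1))))
    if sigma == 0:  # every value identical: no evidence of a difference
        return 1.0
    z = (u - n1 * n2 / 2) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1 (illustrative choice)."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


# Hypothetical data: f1 separates the classes completely, f2 does not.
class_0 = {"f1": [1, 2, 3, 4, 5, 6, 7, 8], "f2": [1, 3, 5, 7, 9, 11, 13, 15]}
class_1 = {"f1": [9, 10, 11, 12, 13, 14, 15, 16], "f2": [2, 4, 6, 8, 10, 12, 14, 16]}

raw = {name: mann_whitney_p(class_0[name], class_1[name]) for name in class_0}
adjusted = bonferroni(raw)
significant = [name for name, p in adjusted.items() if p < 0.05]
print(significant)  # ['f1']
```

Here 0.05 plays the role of `p_threshold`: only features whose corrected p-value falls below it are flagged.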
plot_correlation_matrix(self, features_to_plot=[], corr_method='spearman', save_to_csv=False)
Plot the correlation matrix (Spearman's by default) for the features in an interactive .html report.
| Parameters: |
|
|---|
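Spearman's correlation is Pearson's correlation computed on ranks, with tied values sharing their average rank. The sketch below builds the full matrix that way on a hypothetical table; the helper names are not part of AnalysisBox.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman_matrix(table):
    """Nested dict with Spearman's rho for every pair of columns."""
    ranked = {name: average_ranks(col) for name, col in table.items()}
    return {a: {b: pearson(ranked[a], ranked[b]) for b in ranked} for a in ranked}


# Hypothetical features: b grows with a (non-linearly), c decreases with a.
table = {"a": [1, 2, 3, 4], "b": [10, 100, 1000, 10000], "c": [4, 3, 2, 1]}
matrix = spearman_matrix(table)
print(matrix["a"]["b"], matrix["a"]["c"])
```

Because only ranks matter, the monotonic but non-linear relation between `a` and `b` still yields a coefficient of 1. A constant column has zero variance, so this sketch would raise a division error on it.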
plot_distribution(self, features_to_plot=[], binary_classes_to_plot=[])
Plot the distribution of feature values in each class in an interactive .html report.
| Parameters: |
|
|---|
plot_univariate_roc(self, features_to_plot=[], binary_classes_to_plot=[], auc_threshold=0.75)
Plot univariate ROC curves (with AUC calculation) for a threshold-based binary classifier built on each feature separately, in an interactive .html report.
| Parameters: |
|
|---|
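The AUC of a single-feature threshold classifier equals the probability that a randomly chosen positive sample has a higher feature value than a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way. It assumes labels coded as 0 and 1 and that higher values indicate the positive class; whether AnalysisBox flips features with the opposite direction is not stated here. The helper name and data are hypothetical.

```python
def univariate_auc(values, labels):
    """ROC AUC of the classifier 'predict 1 when the feature exceeds a threshold'."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical feature values with binary labels.
auc = univariate_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(auc)  # 0.75

# Keep only features that reach the threshold (the role of auc_threshold).
features = {"f1": [0.1, 0.4, 0.35, 0.8], "f2": [0.5, 0.2, 0.3, 0.1]}
labels = [0, 0, 1, 1]
selected = [name for name, col in features.items()
            if univariate_auc(col, labels) >= 0.75]
print(selected)  # ['f1']
```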
volume_analysis(self, volume_feature='', auc_threshold=0.75, features_to_plot=[], corr_threshold=0.75)
Calculate the Spearman's correlation of each feature with volume and plot a volume-based precision-recall curve.
| Parameters: |
|
|---|
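One reading of "volume-based precision-recall curve" is a curve obtained by using the volume itself as the classification score. That reading is an assumption, as are the 0/1 label coding and the helper name. The sketch below computes the points of such a curve for every distinct volume threshold.

```python
def precision_recall_points(volumes, labels):
    """Return (threshold, precision, recall) for each distinct volume, largest first."""
    n_pos = sum(labels)  # labels are assumed to be 0 or 1
    points = []
    for t in sorted(set(volumes), reverse=True):
        predicted = [v >= t for v in volumes]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        points.append((t, tp / (tp + fp), tp / n_pos))
    return points


# Hypothetical volumes and binary labels.
points = precision_recall_points([10, 20, 30, 40], [0, 1, 0, 1])
for threshold, precision, recall in points:
    print(threshold, round(precision, 3), recall)
```

A curve like this serves as a baseline: a feature that correlates strongly with volume (above `corr_threshold`) may add little beyond what volume alone already predicts.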