Module saePisan.service.exploration.NormalityTest
Functions
def run_normality_test(parent)-
Expand source code
def run_normality_test(parent): """ Runs a normality test on the combined data from two models in the parent object. This function performs the following steps: 1. Activates the R environment. 2. Retrieves data from two models in the parent object and concatenates them horizontally. 3. Drops any rows with null values. 4. Converts the concatenated data to an R dataframe. 5. Executes an R script provided by the parent object to perform normality tests. 6. Collects the results of the normality tests and any generated plots. Parameters: parent (object): An object that contains the following attributes: - activate_R(): A method to activate the R environment. - model1: An object with a get_data() method that returns a dataframe. - model2: An object with a get_data() method that returns a dataframe. - r_script (str): An R script to be executed. - selected_columns (list): A list of column names to be tested for normality. - result (str): A string to store the results of the normality tests. - plot (list): A list to store the paths of generated plots. - error (bool): A boolean to indicate if an error occurred. Raises: Exception: If any error occurs during the execution of the R script or data processing. """ parent.activate_R() df1 = parent.model1.get_data() df2 = parent.model2.get_data() df = pl.concat([df1, df2], how="horizontal") df = df.filter(~pl.all_horizontal(pl.all().is_null())) get_data(parent,df) try: ro.r('rm(list=ls()[ls() != "r_df"])') ro.r('suppressMessages(library(nortest))') ro.r('suppressMessages(library(tseries))') ro.r('suppressMessages(library(ggplot2))') ro.r('data <- as.data.frame(r_df)') script = parent.r_script ro.r(script) selected_vars = parent.selected_columns plot_paths = [] test_names = ["shapiro", "jarque", "lilliefors"] # Prepare data containers shapiro_data = [] jarque_data = [] lillie_data = [] test_map = { "shapiro": shapiro_data, "jarque": jarque_data, "lilliefors": lillie_data } for var in selected_vars: safe_var = var.replace(" ", "_") for test in test_names: result_key = f"normality_results_{safe_var}_{test}" if ro.r(f"exists('{result_key}')")[0]: stat = ro.r(f"{result_key}$statistic")[0] pval = ro.r(f"{result_key}$p.value")[0] test_map[test].append({ "Variable Name": var, "Statistic": stat, "P-Value": pval }) # Save plots if available for plot_type in ["histogram", "qqplot"]: plot_name = f"{plot_type}_{safe_var}" if ro.r(f"exists('{plot_name}')")[0]: plot_path = f"temp_{plot_name}.png" grdevices.png(file=plot_path, width=800, height=600) ro.r(f"print({plot_name})") grdevices.dev_off() plot_paths.append(plot_path) # Build parent.result only with non-empty tables result_dict = {} if shapiro_data: result_dict["Shapiro Test Results"] = pl.DataFrame(shapiro_data) if jarque_data: result_dict["Jarque Test Results"] = pl.DataFrame(jarque_data) if lillie_data: result_dict["Lilliefors Test Results"] = pl.DataFrame(lillie_data) parent.result = result_dict parent.plot = plot_paths if plot_paths else None except Exception as e: parent.result = str(e) parent.error = TrueRuns a normality test on the combined data from two models in the parent object. This function performs the following steps: 1. Activates the R environment. 2. Retrieves data from two models in the parent object and concatenates them horizontally. 3. Drops any rows with null values. 4. Converts the concatenated data to an R dataframe. 5. Executes an R script provided by the parent object to perform normality tests. 6. Collects the results of the normality tests and any generated plots. Parameters: parent (object): An object that contains the following attributes: - activate_R(): A method to activate the R environment. - model1: An object with a get_data() method that returns a dataframe. - model2: An object with a get_data() method that returns a dataframe. - r_script (str): An R script to be executed. - selected_columns (list): A list of column names to be tested for normality. - result (str): A string to store the results of the normality tests. - plot (list): A list to store the paths of generated plots. - error (bool): A boolean to indicate if an error occurred. Raises: Exception: If any error occurs during the execution of the R script or data processing.