Machine Learning Output to Create SHAP Waterfall Plots

I have output from an ML model. It gives me an ID and a prediction for that ID, along with all the feature names and values. I also get an Analysis JSON that contains the global SHAP values, the expected SHAP value, and PDP (partial dependence plot) data with the data distribution and model predictions. What is the best way to normalize global SHAP values to local SHAP values? I do not have access to the model, only its output.

Hi,

What you can do first is determine the format in which you are receiving all of those values (which, from your post, you have already done). However, we don’t know that format; we only know from your post that you can obtain the values from an ML model. Is the model exposed through a Python function? Does it have to be converted to Python code?

For simplicity, let’s assume that it is either already in Python or has been converted to Python. Let’s also assume that you receive the information all at once as an array of some type (you wrote ‘an output’, which implies singular rather than plural), and that it is returned by a function. If so, we can assign the result to a local variable. Then, after determining which array index holds which value, we can assign the individual ML model values to local variables for local access.

Is this what you are asking?

Here is a simple example. It may not be totally correct, but it shows the general idea (I hope):

def ml_model_values():
    """
    Assume these are the values being returned by the ML model of which 
    we have no additional information.  We also assume that you have imported
    this function from an external module. 
    """
    # Miscellaneous ML model values
    ID_num = 10
    pred_ID = 20
    feat_name1 = 'Sam'
    feat_name2 = 'John'
    feat_val1 = 30
    feat_val2 = 40
    shap1 = 'shap value 1'
    shap2 = 'shap value 2'

    # Pack everything into a single list, in a fixed order
    info_array = [
        ID_num, pred_ID,
        feat_name1, feat_name2,
        feat_val1, feat_val2,
        shap1, shap2,
    ]

    return info_array

local_var = ml_model_values()
print('Read array returned from ml model: \n', local_var, sep='')  # Print just to verify info that was received

# Begin assigning to local variables (only some, not all, for brevity).
ID_num = local_var[0]
pred_ID = local_var[1]
feat_name1 = local_var[2]
feat_name2 = local_var[3]
feat_val1 = local_var[4]
feat_val2 = local_var[5]

# Print each value to verify that it is available locally
print('\nVerify ML model values are available locally.')
print('ID_num :    ', ID_num)
print('pred_ID:    ', pred_ID)
print('feat_name1: ', feat_name1)
print('feat_name2: ', feat_name2)
print('feat_val1:  ', feat_val1)
print('feat_val2:  ', feat_val2)
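
Positional indices are easy to get wrong once the record grows, so a variation on the example above is to pair each position with a name once and then look values up by name. This is only a sketch: the `record` list and the `field_names` order are hypothetical and simply mirror the made-up values used above.

```python
# Hypothetical flat record, in the same order as the example above.
record = [10, 20, 'Sam', 'John', 30, 40]

# Name each position once instead of scattering index numbers around.
field_names = ['ID_num', 'pred_ID', 'feat_name1', 'feat_name2',
               'feat_val1', 'feat_val2']
values = dict(zip(field_names, record))

# Group the feature names with their values for later use.
features = {
    values['feat_name1']: values['feat_val1'],
    values['feat_name2']: values['feat_val2'],
}

print('Prediction:', values['pred_ID'])
print('Features:  ', features)
```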

I do not have access to the model code. I get it from another source. I am just looking for a way to normalize global SHAP values to local SHAP values for each individual in my output.
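
One possible direction, sketched under strong assumptions: global SHAP values are aggregates over many rows (typically the mean absolute SHAP value per feature), so on their own they cannot be converted back into per-row values. What can be used instead is the PDP data. If the model is roughly additive, a feature's local contribution is approximately its partial-dependence value at the row's feature value minus the expected value. The sketch below interpolates PDP curves at one row's feature values and then spreads any leftover difference across features so the contributions sum to the prediction minus the expected value, which is the additivity property that real local SHAP values satisfy. The JSON layout (`expected_value`, `pdp`, `grid`, `avg_prediction`), the feature names, and all numbers are invented for illustration, and the result is an approximation, not true SHAP values.

```python
import json
from bisect import bisect_left

# Hypothetical Analysis JSON: key names, layout, and numbers are invented.
analysis = json.loads("""
{
  "expected_value": 0.30,
  "pdp": {
    "age":    {"grid": [20, 40, 60], "avg_prediction": [0.20, 0.30, 0.50]},
    "income": {"grid": [10, 50, 90], "avg_prediction": [0.40, 0.30, 0.10]}
  }
}
""")


def interpolate(grid, values, x):
    """Linearly interpolate a PDP curve at x, clamping outside the grid."""
    if x <= grid[0]:
        return values[0]
    if x >= grid[-1]:
        return values[-1]
    i = bisect_left(grid, x)
    x0, x1 = grid[i - 1], grid[i]
    y0, y1 = values[i - 1], values[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def approximate_local_contributions(row, prediction, analysis):
    """Approximate per-feature contributions for one row from PDP curves."""
    base = analysis["expected_value"]
    # Centered PDP value: matches the SHAP value only for an additive model.
    raw = {
        name: interpolate(curve["grid"], curve["avg_prediction"], row[name]) - base
        for name, curve in analysis["pdp"].items()
    }
    # Spread the leftover so that base + sum(contributions) == prediction.
    residual = (prediction - base) - sum(raw.values())
    weight_total = sum(abs(v) for v in raw.values())
    if weight_total == 0:
        return {name: residual / len(raw) for name in raw}
    return {name: v + residual * abs(v) / weight_total for name, v in raw.items()}


row = {"age": 50, "income": 30}  # feature values for one ID (made up)
contributions = approximate_local_contributions(row, prediction=0.50, analysis=analysis)
for name, value in contributions.items():
    print(f"{name}: {value:+.4f}")
```

A waterfall plot for one ID would then start at the expected value and add each contribution in turn until it reaches that ID's prediction. Spreading the leftover in proportion to each feature's absolute contribution is a heuristic choice made here for illustration; interaction effects in the real model are not recoverable this way.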