Adding plot functionality to model_performance #34

Closed
patri01u opened this issue Jul 29, 2018 · 4 comments
Labels
feature 💡 New feature or enhancement request

Comments

patri01u commented Jul 29, 2018

When plotting the output of the model_performance function, would it be possible to limit the x-axis range, and to facet by model factors so that one can drill down into the specific factors that drive the overall residuals?

Apologies in advance if this functionality already exists. #Beginnerhere

pbiecek (Member) commented Jul 31, 2018

The result of the plot function is a ggplot2 object, so you can use xlim(), ylim() or other functions to zoom in on part of the plot.

But I see your point: for model_performance it would be nice to have annotations for the top k residuals. Will add to TODO.
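A minimal sketch of the zooming and faceting idea, using plain ggplot2 on a hypothetical residual table rather than a real model_performance object. The data frame `resid_df` and its columns `label`, `diff` and `segment` are assumptions made for illustration only. One caveat worth knowing: in ggplot2, `xlim()` drops observations outside the range before statistics are computed, while `coord_cartesian()` only changes the visible window, so the two can give different curves for an ECDF.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for per-observation residuals of two models;
# 'segment' is an illustrative factor one might want to facet by.
resid_df <- data.frame(
  label   = rep(c("lm", "rf"), each = 200),
  diff    = c(rnorm(200, sd = 3), rnorm(200, sd = 2)),
  segment = rep(c("A", "B"), times = 200)
)

# ECDF of absolute residuals, one curve per model
p <- ggplot(resid_df, aes(x = abs(diff), colour = label)) +
  stat_ecdf(geom = "step")

# Zoom only: every observation still contributes to the ECDF
p_zoom <- p + coord_cartesian(xlim = c(0, 5))

# Filter then plot: rows with abs(diff) > 5 are removed (with a warning)
# before the ECDF is computed
p_cut <- p + xlim(0, 5)

# Facet the zoomed plot by the illustrative factor
p_facet <- p_zoom + facet_wrap(~ segment)
```

Faceting only works when the faceting variable is present in the data behind the plot. Whether the object returned by DALEX carries such a column is not stated in this thread, which is why the sketch builds its own data frame.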

pbiecek added the feature 💡 New feature or enhancement request label Jul 31, 2018
pbiecek added a commit that referenced this issue Aug 5, 2018
pbiecek (Member) commented Aug 5, 2018

I've added a show_outliers parameter to plot.model_performance(); you can now plot the names of the points with the largest residuals.
See an example here: https://pbiecek.github.io/DALEX/reference/plot.model_performance_explainer.html
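A possible usage sketch, assuming the DALEX package is installed and that `explain()`, `model_performance()` and the `geom` and `show_outliers` arguments behave as in the linked reference page. The `lm` model on the built-in `mtcars` data is only an illustration and does not come from this thread.

```r
library(DALEX)

# Illustrative model on a built-in data set (not the example from the docs)
model <- lm(mpg ~ wt + hp, data = mtcars)

# Wrap the model together with the data and observed response
explainer <- explain(model,
                     data  = mtcars[, c("wt", "hp")],
                     y     = mtcars$mpg,
                     label = "lm")

# Residual-based performance summary
mp <- model_performance(explainer)

# Boxplot of residuals; show_outliers = 1 asks for a label on the
# single point with the largest residual
p <- plot(mp, geom = "boxplot", show_outliers = 1)
```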

12tafran (Contributor) commented Oct 1, 2018

Would it be more meaningful for the names of the points with the largest residuals to correspond to the row index of the observed data? That would make it easier for users to identify which observation has the worst prediction.

Take https://pbiecek.github.io/DALEX/reference/plot.model_performance_explainer.html as an example. The largest residual in the glm model is labelled 100,110. This is very confusing to users, since the validation dataset only has 14,999 rows.

Would it be possible to add an index column to the model_performance() output, so the boxplot can use that index number to identify the largest residual instead of the number shown now?
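To make the difference between the two labelling schemes concrete, here is a small base R sketch. It shows one way a label can differ from the position of a row: a subset of a data frame keeps the row names of the original. This is offered as an illustration of the distinction, not as a diagnosis of the 100,110 label, and all objects (`full`, `valid`, `pred`) are hypothetical.

```r
# Hypothetical full data set with default row names "1" to "20"
full <- data.frame(y = seq_len(20), x = seq_len(20) * 2)

# Validation subset: rows keep their original row names "3", "12", "18", "20"
valid <- full[c(3, 12, 18, 20), ]

# Hypothetical predictions for the four validation rows
pred  <- c(3.5, 12.2, 10, 19)
resid <- valid$y - pred

# Position (within 'valid') of the k largest absolute residuals
k   <- 1
top <- order(abs(resid), decreasing = TRUE)[seq_len(k)]

label_by_name  <- rownames(valid)[top]  # row name carried over from 'full'
label_by_index <- top                   # position within 'valid'
```

Here the worst prediction is the third row of `valid`, but its row name is "18", so a plot labelled by row name and one labelled by row index would show different numbers for the same point.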

pbiecek (Member) commented Oct 1, 2018

Makes sense, will add an option to select whether row names or row indexes should be shown.

12tafran pushed a commit to 12tafran/DALEX that referenced this issue Oct 1, 2018
12tafran pushed a commit to 12tafran/DALEX that referenced this issue Oct 1, 2018
pbiecek added a commit that referenced this issue Oct 7, 2018
Adding plot functionality to model_performance #34
pbiecek closed this as completed Dec 18, 2018

3 participants