add MAPE to regression metrics (fixes #691) #822
TomAugspurger merged 4 commits into dask:main from jameslamb:feat/mape
Conversation
tests/metrics/test_regression.py
```diff
-@pytest.fixture(params=["mean_squared_error", "mean_absolute_error", "r2_score"])
+@pytest.fixture(params=["mean_squared_error", "mean_absolute_error", "mean_absolute_percentage_error", "r2_score"])
```
```diff
-@pytest.fixture(params=["mean_squared_error", "mean_absolute_error", "mean_absolute_percentage_error", "r2_score"])
+@pytest.fixture(
+    params=[
+        "mean_squared_error",
+        "mean_absolute_error",
+        "mean_absolute_percentage_error",
+        "r2_score",
+    ]
+)
```
Looks like black==19.10b0 isn't happy about the line length here.
Also, perhaps, it'd be nice if the correctness of the method was sanity-checked against its sklearn counterpart, just as it's done for some of the other metrics a bit further down in the same test file.
> Looks like black==19.10b0 isn't happy about the line length here.
Ah ok. If dask-ml has chosen to pin to older versions of linters, then I think the non-conda option documented at https://ml.dask.org/contributing.html#style will be unreliable, since it references `black` without a pin (line 29 in f5e5bb4).

Once I switched to the conda instructions there, I got the expected diff. Updated in 1142fcc.
> Also, perhaps, it'd be nice if the correctness of the method was sanity-checked against its sklearn counterpart, just as it's done for some of the other metrics a bit further down in the same test file.
Can you clarify what you want me to change? As far as I can tell, that is exactly what happens by adding `mean_absolute_percentage_error` to the `metric_pairs` fixture. Every metric in that fixture is tested against its scikit-learn equivalent by the check at `dask-ml/tests/metrics/test_regression.py`, line 37 in f5e5bb4.
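To illustrate the pattern being discussed, here is a minimal, self-contained sketch of how a `metric_pairs`-style fixture checks an implementation against a trusted reference. The function names and data are hypothetical stand-ins (using a plain NumPy MAE for both sides), not dask-ml's actual test code:

```python
import numpy as np

def mae_impl(y_true, y_pred):
    # stand-in for the metric implementation under test
    return float(np.mean(np.abs(y_true - y_pred)))

def mae_reference(y_true, y_pred):
    # stand-in for the trusted reference, e.g. the sklearn counterpart
    return float(np.abs(y_true - y_pred).mean())

# each pair couples an implementation with its reference
metric_pairs = [(mae_impl, mae_reference)]

rng = np.random.default_rng(0)
y_true, y_pred = rng.random(50), rng.random(50)

# parity check: every implementation must agree with its reference
for impl, reference in metric_pairs:
    assert abs(impl(y_true, y_pred) - reference(y_true, y_pred)) < 1e-12
```

Adding a new metric to such a fixture automatically subjects it to the same parity check, which is why no separate test was needed here.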
> Ah ok. If dask-ml has chosen to pin to older versions of linters, then I think the non-conda option documented at https://ml.dask.org/contributing.html#style will be unreliable
You're absolutely right! I've got a PR over at #813 that has been waiting to be reviewed (for a couple of weeks now) and then merged. It should improve the static-checking situation.
> Every metric in that fixture is tested against its scikit-learn equivalent by
Indeed - ignore me about this one, please! I got confused and thought we should probably introduce extra tests like the `test_mean_squared_log_error` one.
I'll bring up the question of whether the setup.py versions of the linters should be pinned, too, in #813.
I'm grateful for the feedback. I just pushed b05213b to attempt to address it.
Looks good, thanks!
This PR proposes adding `mean_absolute_percentage_error()` ("MAPE"), as originally suggested in #691. It follows the implementation from scikit-learn (https://github.com/scikit-learn/scikit-learn/blob/9cfacf1540a991461b91617c779c69753a1ee4c0/sklearn/metrics/_regression.py#L280), including the use of `np.finfo(np.float64).eps` in the denominator to prevent divide-by-zero errors.

**Notes for reviewers**

This PR adds a bit of test coverage by adding `mean_absolute_percentage_error()` to the `metric_pairs` fixture in tests. It would automatically get more specific coverage (like for combinations of `multioutput` and `compute`) if #820 is accepted.

Thanks for your time and consideration.
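The epsilon-clamped denominator mentioned above can be sketched as follows. This is a hypothetical NumPy-only illustration of the approach, not dask-ml's or scikit-learn's actual code:

```python
import numpy as np

def mape(y_true, y_pred, eps=np.finfo(np.float64).eps):
    """Mean absolute percentage error.

    The denominator is clamped to machine epsilon so that zeros in
    y_true do not cause a divide-by-zero (sketch of the approach
    described in the PR, not the actual implementation).
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    return float(np.mean(np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), eps)))

# errors of 0%, 0%, and 25% average to 1/12
print(mape([1.0, 2.0, 4.0], [1.0, 2.0, 3.0]))
```

Note that when `y_true` contains zeros, the clamped denominator makes the result finite but extremely large, so MAPE remains a poor choice for targets near zero.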