[ENH] Rayleigh Distribution with analytical energy calculation #778

Open
KaranSinghDev wants to merge 3 commits into sktime:main from KaranSinghDev:distributions

@KaranSinghDev KaranSinghDev commented Feb 28, 2026

Reference Issues/PRs

#22

What does this implement/fix? Explain your changes.

This PR implements a native, high-performance version of the Rayleigh Distribution (skpro.distributions.Rayleigh).

I have implemented the following analytical methods to bypass Monte Carlo and _approx_derivative fallbacks:

  • _pdf and _log_pdf: $$f(x) = \frac{x}{\sigma^2} e^{-x^2 / (2\sigma^2)}, \quad x \ge 0$$
  • _cdf and _ppf (inverse CDF): $$F(x) = 1 - e^{-x^2 / (2\sigma^2)}$$
  • _mean: $$\sigma \sqrt{\frac{\pi}{2}}$$
  • _var: $$\frac{4-\pi}{2} \sigma^2$$
  • _energy_self: $\mathbb{E}[|X - Y|] = \sigma \sqrt{\pi} (\sqrt{2} - 1)$
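
For readers who want to check the formulas listed above, here is a minimal standard-library sketch. It is not the code in this PR: the function names (`rayleigh_pdf`, `rayleigh_ppf`, `midpoint_integral`, and so on) are hypothetical, the inverse CDF $F^{-1}(p) = \sigma\sqrt{-2\ln(1-p)}$ is obtained by solving the CDF above for $x$, and a plain midpoint rule stands in for a proper quadrature routine.

```python
import math


def rayleigh_pdf(x, sigma):
    """Density x / sigma^2 * exp(-x^2 / (2 sigma^2)) for x >= 0, else 0."""
    if x < 0:
        return 0.0
    return x / sigma**2 * math.exp(-(x**2) / (2 * sigma**2))


def rayleigh_cdf(x, sigma):
    """CDF 1 - exp(-x^2 / (2 sigma^2)) for x >= 0, else 0."""
    if x < 0:
        return 0.0
    # -expm1(-a) equals 1 - exp(-a) but is more accurate for small a.
    return -math.expm1(-(x**2) / (2 * sigma**2))


def rayleigh_ppf(p, sigma):
    """Inverse CDF for 0 <= p < 1, obtained by solving F(x) = p for x."""
    return sigma * math.sqrt(-2.0 * math.log1p(-p))


def rayleigh_mean(sigma):
    return sigma * math.sqrt(math.pi / 2)


def rayleigh_var(sigma):
    return (4 - math.pi) / 2 * sigma**2


def rayleigh_energy_self(sigma):
    """E|X - Y| for independent X, Y ~ Rayleigh(sigma)."""
    return sigma * math.sqrt(math.pi) * (math.sqrt(2) - 1)


def midpoint_integral(f, a, b, n=20000):
    """Hypothetical helper: midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


sigma = 1.5
upper = 12.0 * sigma  # the tail mass beyond 12 sigma is negligible

mean_numeric = midpoint_integral(lambda t: t * rayleigh_pdf(t, sigma), 0.0, upper)
second_moment = midpoint_integral(lambda t: t * t * rayleigh_pdf(t, sigma), 0.0, upper)
var_numeric = second_moment - mean_numeric**2
# For i.i.d. X, Y with CDF F: E|X - Y| = 2 * integral of F(t) * (1 - F(t)) dt.
energy_numeric = midpoint_integral(
    lambda t: 2.0 * rayleigh_cdf(t, sigma) * (1.0 - rayleigh_cdf(t, sigma)),
    0.0,
    upper,
)

print("mean  :", rayleigh_mean(sigma), mean_numeric)
print("var   :", rayleigh_var(sigma), var_numeric)
print("energy:", rayleigh_energy_self(sigma), energy_numeric)
print("ppf(cdf(2.0)):", rayleigh_ppf(rayleigh_cdf(2.0, sigma), sigma))
```

Each printed pair should agree to several decimal places, and the last line should return approximately 2.0, which is the round trip through the CDF and its inverse.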

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

  • Formula Consistency: I used the σ parameterization to align exactly with scipy.stats.
  • Performance: I implemented the exact analytical formula for _energy_self() to avoid the slow Monte Carlo default.
  • Broadcasting: Verified that the implementation handles scalar, 1D, and 2D parameter inputs via the _bc_params broadcasting logic.
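
To illustrate the performance point, the sketch below compares the closed-form self-energy with a naive Monte Carlo estimate. It is only an illustration under stated assumptions: the sampler and estimator are hypothetical stand-ins written with the standard library, not skpro's actual default, and the sample size and seed are arbitrary.

```python
import math
import random


def sample_rayleigh(rng, sigma):
    """Inverse-transform sample: sigma * sqrt(-2 ln(1 - U)) with U uniform on [0, 1)."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def energy_self_monte_carlo(sigma, n_pairs, seed=0):
    """Hypothetical estimator: average |X - Y| over independent sample pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x = sample_rayleigh(rng, sigma)
        y = sample_rayleigh(rng, sigma)
        total += abs(x - y)
    return total / n_pairs


def energy_self_exact(sigma):
    """Closed form from the PR description: sigma * sqrt(pi) * (sqrt(2) - 1)."""
    return sigma * math.sqrt(math.pi) * (math.sqrt(2) - 1)


sigma = 2.0
exact = energy_self_exact(sigma)
estimate = energy_self_monte_carlo(sigma, n_pairs=50000)

print("exact      :", exact)
print("monte carlo:", estimate)
```

The closed form is a single arithmetic expression, while the sampled estimate needs tens of thousands of draws and still carries sampling noise that shrinks only with the square root of the sample size.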

Did you add any tests for the change?

I relied on the existing distributions test suite. I implemented get_test_params with multiple parameter sets (scalar, array, and DataFrame inputs). The Rayleigh class was picked up by the suite and passed 194 automated tests locally.

Any other comments?

I chose this distribution because I have used it before in signal processing, and I believe it will be a useful addition to the current list of distributions (even though it is low priority). I have also verified the PDF, CDF, and mean/variance calculations numerically against scipy.stats.rayleigh.

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the skpro root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

@KaranSinghDev
Author

I have updated the PR to fix the workflow that was failing earlier.

Collaborator

@fkiraly fkiraly left a comment

Nice!

For the exact energy tag, the cross-term _energy_x is still missing. Do you also have that, or are you leaving it for now?

@KaranSinghDev
Author

KaranSinghDev commented Mar 2, 2026

@fkiraly Yes, I was working on the cross-term to make the energy tag fully exact.

I have added the closed-form analytical solution for _energy_x. For $X \sim \text{Rayleigh}(\sigma)$, the piecewise solution I implemented is:

  • For $x \le 0$: $\mathbb{E}[|X - x|] = \sigma \sqrt{\frac{\pi}{2}} - x$
  • For $x > 0$: $\mathbb{E}[|X - x|] = x + \sigma \sqrt{\frac{\pi}{2}} - \sigma \sqrt{2\pi} \cdot \text{erf}\left(\frac{x}{\sigma \sqrt{2}}\right)$
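
As a sketch of where the piecewise form comes from (a reconstruction from the CDF $F(t) = 1 - e^{-t^2 / (2\sigma^2)}$, not a quotation from the PR's code): for $x > 0$, split the absolute value into its lower and upper parts,

$$\mathbb{E}[|X - x|] = \int_0^x F(t)\,dt + \int_x^\infty \bigl(1 - F(t)\bigr)\,dt$$

and use the Gaussian integral $\int_0^x e^{-t^2 / (2\sigma^2)}\,dt = \sigma \sqrt{\frac{\pi}{2}} \cdot \text{erf}\left(\frac{x}{\sigma \sqrt{2}}\right)$ to get

$$\int_0^x F(t)\,dt = x - \sigma \sqrt{\frac{\pi}{2}} \cdot \text{erf}\left(\frac{x}{\sigma \sqrt{2}}\right), \qquad \int_x^\infty \bigl(1 - F(t)\bigr)\,dt = \sigma \sqrt{\frac{\pi}{2}} \left(1 - \text{erf}\left(\frac{x}{\sigma \sqrt{2}}\right)\right)$$

Adding the two terms and using $2\sqrt{\pi/2} = \sqrt{2\pi}$ gives $x + \sigma \sqrt{\frac{\pi}{2}} - \sigma \sqrt{2\pi} \cdot \text{erf}\left(\frac{x}{\sigma \sqrt{2}}\right)$. For $x \le 0$, $X \ge 0 \ge x$ almost surely, so $|X - x| = X - x$ and the expectation is the mean minus $x$.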

I verified these formulas against scipy.integrate.quad for both negative and positive values of x. The analytical implementation overlays the numerical integration.
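
A check of this kind can be sketched with the standard library alone. The version below is an assumption-laden stand-in for the verification described above: it uses a simple midpoint rule instead of scipy.integrate.quad, and the names `rayleigh_energy_x` and `energy_x_numeric` are hypothetical rather than taken from the PR.

```python
import math


def rayleigh_pdf(t, sigma):
    """Density t / sigma^2 * exp(-t^2 / (2 sigma^2)) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return t / sigma**2 * math.exp(-(t**2) / (2 * sigma**2))


def rayleigh_energy_x(x, sigma):
    """Piecewise closed form for E|X - x| with X ~ Rayleigh(sigma)."""
    mean = sigma * math.sqrt(math.pi / 2)
    if x <= 0:
        return mean - x
    erf_term = math.erf(x / (sigma * math.sqrt(2)))
    return x + mean - sigma * math.sqrt(2 * math.pi) * erf_term


def energy_x_numeric(x, sigma, n=100000):
    """Midpoint-rule approximation of the integral of |t - x| * pdf(t) over [0, 12 sigma]."""
    upper = 12.0 * sigma  # the tail mass beyond 12 sigma is negligible
    h = upper / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += abs(t - x) * rayleigh_pdf(t, sigma)
    return total * h


sigma = 1.5
test_points = [-2.0, -0.5, 0.0, 0.3, 1.0, 4.0]
max_error = 0.0
for x in test_points:
    exact = rayleigh_energy_x(x, sigma)
    numeric = energy_x_numeric(x, sigma)
    max_error = max(max_error, abs(exact - numeric))
    print(f"x = {x:5.2f}  closed form = {exact:.8f}  numeric = {numeric:.8f}")

print("largest absolute difference:", max_error)
```

The two branches meet at $x = 0$, where both reduce to the mean $\sigma \sqrt{\pi/2}$, so the closed form is continuous across the boundary.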

@KaranSinghDev KaranSinghDev requested a review from fkiraly March 3, 2026 04:43
Collaborator

@fkiraly fkiraly left a comment

Nice!

I think there does not exist a reference for these formulae, so it would be great if we could display them in docstrings.

For that, this PR (abandoned) needs to be fixed. Maybe if you have time, you could have a look, and then add your explicit formulae?
#698

Alternatively, there is already a markdown where we are collecting these, in the distributions folder, where you can add them.

@KaranSinghDev
Author

@fkiraly Sure, I will do it.

Labels

enhancement module:probability&simulation probability distributions and simulators
