GitHub - Resende-Lab/EnvKernel: EnvKernel repo

Package EnvKernel: Fast Rcpp Kernels

This repository hosts the R package EnvKernel. We wrote it because we needed several non-linear kernels for building relationship matrices from environmental covariates and could not find a package that provided them. The kernels are implemented in Rcpp (Eddelbuettel & Balamuta, 2018) for speed. Feel free to contact me to report issues or suggest improvements.

Marco Peixoto and João Paulo Gusmão

Install it using this token

library(devtools)
devtools::install_github(repo = "Resende-Lab/EnvKernel")

Citation

If you use EnvKernel, please cite:

Peixoto, M., & Gusmão, J. P. (2026). EnvKernel: Environmental Kernel Methods in R. https://github.com/Resende-Lab/EnvKernel

Vignette

Example

# Load data and the package
library(EnvKernel)
data("envMarks")

# Pick the kernel and generate the matrix using one of the following methods:
meth <- c("ETK", "ELK", "EEK", "EGK", "EDK")

EnvCov <- EnvKernel::getKernel(envMarks, scale = TRUE, method=meth[1])

# Matrix
EnvCov[1:5,1:5]

# Plot
heatmap(EnvCov)

Kernel Equation Implementations

There are five kernels in the package so far. All of them operate on a scaled data matrix
$W \in \mathbb{R}^{n \times p}$ (rows = observations/environments, columns = variables/environmental covariates/markers). I recommend scaling the data through the scale argument of the main function, getKernel().
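To make the role of $W$ concrete, here is a minimal base R sketch of what a scaled matrix looks like. It assumes that "scaled" means column-wise centering and unit standard deviation, as done by base R's scale(); whether getKernel(scale = TRUE) does exactly this is an assumption, and the toy values in X are made up.

```r
# Toy matrix: 4 environments (rows) x 3 covariates (columns); values are invented.
X <- matrix(c(20, 22, 25, 30,
              100, 80, 120, 90,
              5.5, 6.0, 6.5, 7.0),
            nrow = 4, ncol = 3,
            dimnames = list(paste0("env", 1:4), c("temp", "rain", "pH")))

# Assumption: scaling = centering each column and dividing by its standard deviation.
W <- scale(X, center = TRUE, scale = TRUE)

colMeans(W)      # every column mean is 0 (up to floating point)
apply(W, 2, sd)  # every column standard deviation is 1
```

Scaling puts covariates measured in different units on a comparable footing before distances or inner products are computed.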

1. ETK: Transposed (Gram) Kernel

$$ K =\ \frac{WW^T}{p} $$

Normalized by the number of columns $p$.
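The formula above can be sketched in a few lines of base R. This is an illustration of the equation only, not the package's Rcpp code; the random matrix X and the use of scale() to build $W$ are assumptions made for the example.

```r
set.seed(1)
X <- matrix(rnorm(5 * 3), nrow = 5, ncol = 3)  # 5 environments, 3 covariates (simulated)
W <- scale(X)                                   # assumed scaling: centered, unit-variance columns

p <- ncol(W)
K_etk <- tcrossprod(W) / p   # W %*% t(W) / p

dim(K_etk)          # n x n
isSymmetric(K_etk)  # Gram matrices are symmetric
sum(diag(K_etk))    # equals n - 1 here, because scale() uses the n - 1 standard deviation
```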

2. ELK: Linear Kernel (Jarquín et al., 2014; Sorensen et al., 2001)

$$ K_{ij} =\ \frac{(WW^T)_{ij}} {\displaystyle \frac{1}{n}\sum_{k=1}^{n}(WW^T)_{kk}} $$

$WW^T$ is the Gram matrix (inner products of the rows of $W$).
The denominator is the mean of its diagonal elements, i.e. the trace divided by $n$.
$n$ is the number of rows of $W$.
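As a rough base R sketch of this normalization (again an illustration of the formula, not the package's implementation; X is simulated and scale() is an assumed way to obtain $W$):

```r
set.seed(1)
X <- matrix(rnorm(5 * 3), nrow = 5, ncol = 3)  # simulated covariates
W <- scale(X)                                   # assumed scaling

G <- tcrossprod(W)           # Gram matrix W %*% t(W)
K_elk <- G / mean(diag(G))   # divide by trace(G) / n

mean(diag(K_elk))  # 1 by construction (up to floating point)
```

Dividing by the mean diagonal puts the kernel on a scale where the average self-similarity is 1, which makes variance components estimated with it easier to interpret.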

3. EGK: Gaussian / Radial Basis Function Kernel (Schölkopf & Smola, 2002)

$$ K_{ij} = \exp\Bigl( -\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2} \Bigr) $$

$x_i$, $x_j$ are the $i$-th and $j$-th rows of $W$.
$\sigma$ is the positive bandwidth parameter.
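A minimal base R sketch of the Gaussian kernel formula follows. The value sigma = 1 is a hypothetical choice for illustration (the package default is not stated here), and X is simulated.

```r
set.seed(1)
X <- matrix(rnorm(5 * 3), nrow = 5, ncol = 3)  # simulated covariates
W <- scale(X)                                   # assumed scaling

sigma <- 1                     # hypothetical bandwidth
D2 <- as.matrix(dist(W))^2     # squared Euclidean distances between rows
K_egk <- exp(-D2 / (2 * sigma^2))

diag(K_egk)   # all 1: each row is at distance 0 from itself
range(K_egk)  # values lie in (0, 1]
```

A larger sigma makes the kernel decay more slowly with distance, so more environments are treated as similar.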

4. EEK: Exponential / Laplacian Kernel (Schölkopf & Smola, 2002; Genton, 2001)

$$ K_{ij} = \exp\Bigl( -\frac{\phi \cdot \lVert x_i - x_j \rVert}{\bar{D}} \Bigr) $$

$x_i$ and $x_j$ are the $i$-th and $j$-th rows of $W$.
$\lVert x_i - x_j \rVert$ is the Euclidean distance (not squared).
$\bar{D} = \frac{1}{n(n-1)}\sum_{k \neq l} \lVert x_k - x_l \rVert$ is the mean pairwise distance.
$\phi$ is the positive bandwidth scaling parameter (default = 1.0).
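The definitions above translate to base R roughly as follows. This is a sketch of the formula under the stated definitions, not the package's Rcpp code; X is simulated and phi is set to the documented default of 1.

```r
set.seed(1)
X <- matrix(rnorm(5 * 3), nrow = 5, ncol = 3)  # simulated covariates
W <- scale(X)                                   # assumed scaling

phi <- 1                          # bandwidth scaling parameter
D <- as.matrix(dist(W))           # Euclidean distances between rows (not squared)
n <- nrow(D)
D_bar <- sum(D) / (n * (n - 1))   # mean over all pairs k != l (the diagonal is 0)

K_eek <- exp(-phi * D / D_bar)

diag(K_eek)   # all 1
```

Dividing by the mean pairwise distance makes phi unitless, so the same value of phi behaves comparably across data sets with different scales.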

5. EDK: Arc-Cosine (Deep) Kernel – First Order (Cho & Saul, 2009)

$$ K_{ij} = \frac{\lVert x_i \rVert \lVert x_j \rVert}{\pi} \Bigl[ \sin\theta_{ij} + \bigl(\pi-\theta_{ij}\bigr)\cos\theta_{ij} \Bigr] $$

where

$$ \theta_{ij} = \cos^{-1}\Bigl( \frac{x_i^{\top}x_j}{\lVert x_i \rVert \lVert x_j \rVert} \Bigr) $$

Mimics the computation of a neural network with one infinitely wide hidden layer of ReLU units.
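A base R sketch of the first-order arc-cosine formula is given below. It is an illustration only, not the package's implementation; X is simulated, and the clamping of the cosine to [-1, 1] is a numerical safeguard added for this example.

```r
set.seed(1)
X <- matrix(rnorm(5 * 3), nrow = 5, ncol = 3)  # simulated covariates
W <- scale(X)                                   # assumed scaling

norms <- sqrt(rowSums(W^2))       # Euclidean norm of each row
norm_prod <- outer(norms, norms)  # ||x_i|| * ||x_j|| for every pair

# Cosine of the angle between rows; clamp to [-1, 1] to guard against rounding.
# Note: a row of all zeros would give 0/0 here and needs separate handling.
cos_theta <- pmin(pmax(tcrossprod(W) / norm_prod, -1), 1)
theta <- acos(cos_theta)

K_edk <- (norm_prod / pi) * (sin(theta) + (pi - theta) * cos(theta))

# On the diagonal theta = 0, so K_ii reduces to the squared norm of row i.
all.equal(diag(K_edk), norms^2)
```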

Notation Summary

| Symbol | Meaning |
| --- | --- |
| $W$ | Data matrix ($n \times p$) |
| $n$ | Number of rows (observations/environments) |
| $p$ | Number of columns (variables) |
| $x_i$ | $i$-th row vector of $W$ |
| $\sigma$ | Bandwidth/hyperparameter |
| $\phi$ | Bandwidth scaling parameter |
| $\Sigma$ | Empirical covariance matrix |
| $\bar{D}$ | Mean pairwise distance |

References

Cho, Y., & Saul, L. K. (2009). Kernel methods for deep learning. Advances in Neural Information Processing Systems, 22.
Bishop, C. M., & Nasrabadi, N. M. (2006). Pattern recognition and machine learning (Vol. 4). New York: Springer.
Eddelbuettel, D., & Balamuta, J. (2018). Extending R with C++: A Brief Introduction to Rcpp. The American Statistician, 72(1), 28-36. https://doi.org/10.1080/00031305.2017.1375990
Genton, M. G. (2001). Classes of kernels for machine learning: A statistics perspective. Journal of Machine Learning Research, 2, 299-312.
Jarquín, D., Crossa, J., Lacaze, X., Du Cheyron, P., Daucourt, J., Lorgeou, J., ... & de los Campos, G. (2014). A reaction norm model for genomic selection using high-dimensional genomic and environmental data. Theoretical and Applied Genetics, 127, 595-607.
Schölkopf, B., & Smola, A. J. (2002). Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
Sorensen, D., Fernando, R., & Gianola, D. (2001). Inferring the trajectory of genetic variance in the course of artificial selection. Genetical Research, 77(1), 83-94.

