This repository hosts the R package EnvKernel. We created it when we needed several non-linear kernels for building relationship matrices from environmental covariates and could not find a package that provided them. The kernels are implemented with Rcpp (Eddelbuettel & Balamuta, 2018) for speed. Feel free to contact us to report issues or suggest improvements.
Marco Peixoto and João Paulo Gusmão
```r
# install.packages("devtools")  # run once if devtools is not installed
library(devtools)
devtools::install_github(repo = "Resende-Lab/EnvKernel")
```
If you use EnvKernel, please cite:
Peixoto, M., & Gusmão, J. P. (2026). EnvKernel: Environmental Kernel Methods in R. https://github.com/Resende-Lab/EnvKernel
```r
# Load the package and the example data
library(EnvKernel)
data("envMarks")

# Available kernel methods; pick one by index
meth <- c("ETK", "ELK", "EEK", "EGK", "EDK")

# Build the kernel matrix from the environmental covariates
EnvCov <- EnvKernel::getKernel(envMarks, scale = TRUE, method = meth[1])

# Inspect the first five rows and columns of the matrix
EnvCov[1:5, 1:5]

# Plot the matrix as a heatmap
heatmap(EnvCov)
```
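The `scale = TRUE` argument above suggests that the covariates are standardized before the kernel is computed. As a rough illustration only, the sketch below shows what column scaling looks like with base R's `scale()` on a simulated matrix. It assumes the package centers and standardizes columns in the same way, which this README does not confirm; `W_raw` and `W` are hypothetical names.

```r
# Simulated covariate matrix: 6 environments x 4 covariates (toy data,
# not the envMarks data set shipped with the package).
set.seed(1)
W_raw <- matrix(rnorm(24, mean = 10, sd = 3), nrow = 6, ncol = 4,
                dimnames = list(paste0("env", 1:6), paste0("cov", 1:4)))

# Base R scale(): subtract column means, then divide by column
# standard deviations. ASSUMPTION: scale = TRUE in getKernel() does
# something equivalent.
W <- scale(W_raw, center = TRUE, scale = TRUE)

round(colMeans(W), 10)  # column means after scaling
apply(W, 2, sd)         # column standard deviations after scaling
```

Scaling puts covariates measured in different units on a comparable footing, so no single variable dominates the inner products or distances between environments.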
There are five implementations in the package so far. All kernels operate on a scaled data matrix $W$ with $n$ rows and $p$ columns.

For the kernel normalized by the number of columns $p$:

- $WW^T$ is the Gram matrix (inner products of rows).

For the kernel normalized by the mean trace:

- $n$ is the number of rows in the matrix $W$.

For the kernel with bandwidth $\sigma$:

- $x_i$ and $x_j$ are the $i$-th and $j$-th rows of $W$.
- $\sigma$ is the positive bandwidth parameter.

For the kernel with bandwidth scaling $\phi$:

- $x_i$ and $x_j$ are the $i$-th and $j$-th rows of $W$.
- $\lVert x_i - x_j \rVert$ is the Euclidean distance (not squared).
- $\bar{D} = \frac{1}{n(n-1)}\sum_{k \neq l} \lVert x_k - x_l \rVert$ is the mean pairwise distance.
- $\phi$ is the positive bandwidth scaling parameter (default = 1.0).

The remaining kernel mimics one hidden layer of ReLU units in a neural network.
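To make the descriptions above concrete, here is a minimal base R sketch of kernels of this kind on a simulated matrix. It is not the package's Rcpp code, and it does not claim which method name (`"ETK"`, `"ELK"`, and so on) corresponds to which formula. The two Gram-based kernels follow the stated normalizations directly. The Gaussian form with $\sigma$, the exponential form with $\phi \bar{D}$ as the bandwidth, and the use of the degree-1 arc-cosine kernel of Cho & Saul (2009) for the ReLU case are assumptions, and all variable names are hypothetical.

```r
# Simulated, column-scaled data matrix W: n environments x p covariates.
set.seed(42)
W <- scale(matrix(rnorm(8 * 5), nrow = 8, ncol = 5))
n <- nrow(W)
p <- ncol(W)

# Gram matrix WW^T (inner products of rows).
G <- tcrossprod(W)

# Gram matrix normalized by the number of columns p.
K_p <- G / p

# Gram matrix normalized by the mean trace, so the average diagonal is 1.
K_trace <- G / (sum(diag(G)) / n)

# Euclidean distances between rows (not squared).
D <- as.matrix(dist(W, method = "euclidean"))

# ASSUMED Gaussian form: exp(-||x_i - x_j||^2 / (2 * sigma^2)).
sigma <- 1
K_gauss <- exp(-D^2 / (2 * sigma^2))

# Mean pairwise distance over all pairs k != l.
D_bar <- sum(D) / (n * (n - 1))

# ASSUMED exponential form: exp(-||x_i - x_j|| / (phi * D_bar)).
phi <- 1.0
K_exp <- exp(-D / (phi * D_bar))

# ASSUMED ReLU-type kernel: degree-1 arc-cosine kernel (Cho & Saul, 2009),
# k(x, y) = ||x|| ||y|| (sin(theta) + (pi - theta) cos(theta)) / pi.
norms <- sqrt(diag(G))
norm_prod <- outer(norms, norms)
cos_theta <- pmin(pmax(G / norm_prod, -1), 1)  # clamp before acos()
theta <- acos(cos_theta)
K_arc <- norm_prod * (sin(theta) + (pi - theta) * cos_theta) / pi
```

Each result is a symmetric $n \times n$ matrix. The distance-based kernels have ones on the diagonal, while the arc-cosine kernel reproduces the squared row norms there.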
| Symbol | Meaning |
|---|---|
| $W$ | Data matrix ($n \times p$) |
| $n$ | Number of rows (observations/environments) |
| $p$ | Number of columns (variables) |
| $\sigma$ | Bandwidth/hyperparameter |
| $\phi$ | Bandwidth scaling parameter |
| | Empirical covariance matrix |
| $\bar{D}$ | Mean pairwise distance |
- Cho, Y., & Saul, L. K. (2009). Kernel methods for deep learning. Advances in Neural Information Processing Systems, 22.
- Bishop, C. M. (2006). Pattern Recognition and Machine Learning. New York: Springer.
- Eddelbuettel, D., & Balamuta, J. (2018). Extending R with C++: A Brief Introduction to Rcpp. The American Statistician, 72(1), 28-36. https://doi.org/10.1080/00031305.2017.1375990
- Genton, M. G. (2001). Classes of kernels for machine learning: A statistics perspective. Journal of Machine Learning Research, 2, 299-312.
- Jarquín, D., Crossa, J., Lacaze, X., Du Cheyron, P., Daucourt, J., Lorgeou, J., ... & de los Campos, G. (2014). A reaction norm model for genomic selection using high-dimensional genomic and environmental data. Theoretical and Applied Genetics, 127, 595-607.
- Schölkopf, B., & Smola, A. J. (2002). Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
- Sorensen, D., Fernando, R., & Gianola, D. (2001). Inferring the trajectory of genetic variance in the course of artificial selection. Genetical Research, 77(1), 83-94.