Hello, I cannot understand one sentence in your paper: 'When the training error keeps unchanged in five sequential epochs, we merge the parameters of each batch normalization into the adjacent convolution filters'. I tried to figure it out by reading your code, but I could not follow the MATLAB... Thanks!
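To make the question concrete, here is my guess at what "merging" means, written as a small pure-Python sketch. It assumes the usual batch-norm folding identity, where the fixed statistics (mean, variance) and the learned scale and shift are absorbed into the weights and bias of the preceding convolution. The function names and the numbers are hypothetical and only for illustration. This is not taken from the paper's code, so please correct me if the paper does something different.

```python
import math


def fold_batchnorm(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold fixed batch-norm parameters into the preceding conv filters.

    weights: one flattened filter (list of floats) per output channel
    biases, gamma, beta, mean, var: one float per output channel
    (All names here are illustrative assumptions, not the paper's code.)
    """
    new_weights, new_biases = [], []
    for w, b, g, bt, m, v in zip(weights, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)           # per-channel BN scale
        new_weights.append([wi * scale for wi in w])
        new_biases.append((b - m) * scale + bt)  # BN shift absorbed into the bias
    return new_weights, new_biases


def conv_at_one_position(weights, biases, patch):
    """Convolution output at a single spatial position (dot product per filter)."""
    return [sum(wi * xi for wi, xi in zip(w, patch)) + b
            for w, b in zip(weights, biases)]


def batchnorm(values, gamma, beta, mean, var, eps=1e-5):
    """Batch normalization with fixed (inference-time) statistics."""
    return [g * (y - m) / math.sqrt(v + eps) + bt
            for y, g, bt, m, v in zip(values, gamma, beta, mean, var)]


# Made-up numbers: 2 output channels, 3 weights per flattened filter.
weights = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
biases = [0.1, -0.2]
gamma, beta = [1.2, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.6]
patch = [1.0, 2.0, -1.0]

# Path 1: convolution followed by a separate batch-norm step.
separate = batchnorm(conv_at_one_position(weights, biases, patch),
                     gamma, beta, mean, var)

# Path 2: a single convolution with the folded weights and biases.
folded_w, folded_b = fold_batchnorm(weights, biases, gamma, beta, mean, var)
merged = conv_at_one_position(folded_w, folded_b, patch)

outputs_match = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
                    for a, b in zip(separate, merged))
```

If this is the right idea, the two paths should agree up to floating-point rounding, and the network would no longer need a separate batch-norm layer after the merge. Is that what the sentence describes?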