[clblas] 14/61: replacing barrier with mem_fence in the innermost loop requires an extra barrier at the beginning of the outer loop.
Ghislain Vaillant
ghisvail-guest at moszumanska.debian.org
Fri Jul 24 22:49:44 UTC 2015
This is an automated email from the git hooks/post-receive script.
ghisvail-guest pushed a commit to branch master
in repository clblas.
commit fda48a7028411d4f6db3102b9aba85fb4b5e68f5
Author: Timmy <timmy.liu at amd.com>
Date: Sun Apr 19 03:03:28 2015 -0500
replacing barrier with mem_fence in the innermost loop requires an extra barrier at the beginning of the outer loop.
---
src/library/blas/gens/clTemplates/sgemm_gcn_SmallMatrices.cl | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/library/blas/gens/clTemplates/sgemm_gcn_SmallMatrices.cl b/src/library/blas/gens/clTemplates/sgemm_gcn_SmallMatrices.cl
index 73d9dc3..9999282 100644
--- a/src/library/blas/gens/clTemplates/sgemm_gcn_SmallMatrices.cl
+++ b/src/library/blas/gens/clTemplates/sgemm_gcn_SmallMatrices.cl
@@ -284,7 +284,8 @@ __kernel void sgemm_NT_32_32_16_16x16_2x2__ALPHABETA_BRANCH( __global float cons
{
__local float* plA = lA + idy*33+idx;
__local float* plB = lB + idy*33+idx;
-
+ barrier(CLK_LOCAL_MEM_FENCE);
+
plB[0] = CurrentOffSetB>=N?0.0:B[0];
plB[16] = CurrentOffSetB+16>=N?0.0:B[16];
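
The reasoning in the commit message can be illustrated with a much smaller kernel. What follows is a minimal sketch, not the clBLAS code: the kernel name tiled_sum, the TILE size and the buffer layout are all hypothetical. It assumes the usual OpenCL semantics, in which barrier() synchronizes every work-item in the work-group while mem_fence() only orders the calling work-item's own memory accesses. Once the inner loop uses mem_fence(), nothing stops a fast work-item from starting the next outer iteration and overwriting local memory that a slower work-item is still reading, so a barrier is placed at the top of the outer loop. The stubs under #ifndef __OPENCL_VERSION__ are an assumption added only so the file also builds as plain C with a single work-item; they do not model real synchronization.

```c
/* Hypothetical sketch; NOT the clBLAS kernel. Names and sizes are assumptions. */
#ifndef __OPENCL_VERSION__
/* Host-side stubs: let the file build as plain C for ONE work-item.
 * They do not model real work-group synchronization. */
#define __kernel
#define __global
#define __local
#define CLK_LOCAL_MEM_FENCE 1
static void barrier(int flags)   { (void)flags; }
static void mem_fence(int flags) { (void)flags; }
static unsigned get_local_id(unsigned dim)   { (void)dim; return 0; }
static unsigned get_local_size(unsigned dim) { (void)dim; return 1; }
#endif

#define TILE 16

/* Each work-item sums every element of every tile staged in local memory. */
__kernel void tiled_sum(__global const float *in,
                        __global float *out,
                        unsigned ntiles)
{
    __local float tile[TILE];
    const unsigned lid = (unsigned)get_local_id(0);
    const unsigned lsz = (unsigned)get_local_size(0);
    float acc = 0.0f;

    for (unsigned t = 0; t < ntiles; ++t) {
        /* Extra barrier: every work-item must have finished reading the
         * previous tile before anyone overwrites it. */
        barrier(CLK_LOCAL_MEM_FENCE);

        /* Cooperative load of the next tile into local memory. */
        for (unsigned i = lid; i < TILE; i += lsz)
            tile[i] = in[t * TILE + i];

        /* Make the freshly written tile visible to the whole work-group. */
        barrier(CLK_LOCAL_MEM_FENCE);

        for (unsigned k = 0; k < TILE; ++k) {
            acc += tile[k];
            /* Orders only this work-item's local accesses; it does not
             * wait for other work-items the way barrier() does. */
            mem_fence(CLK_LOCAL_MEM_FENCE);
        }
    }

    out[lid] = acc;
}
```

If the inner loop ended each iteration with barrier() instead, the last inner barrier would already keep the work-group in step, and the barrier at the top of the outer loop would be redundant.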
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/clblas.git