Pashmina’s Blog
The accompanying code is in the GitHub repo optimization_examples.
Fast code with just enough effort (PDF), UK Research Software Engineering Conference 2018
Further analysis can be found at:
Additional information on hardware accelerators may be found at:
In future, we may look at:
effect of cache sizes and cache misses
ARM assembly starter examples
AVX-512 assembly starter examples
comparison with Julia