Programming Parallel Computers
Links to external resources
Hardware
Agner Fog: Instruction tables (instruction latencies and throughputs for Intel CPUs)
Intel Skylake microarchitecture in WikiChip
OpenMP
OpenMP specification
SIMD
GCC vector extensions
Clang vector extensions
posix_memalign (allocating memory with a specific alignment)
CUDA
NVIDIA: CUDA documentation
OpenCL [optional]
Khronos: OpenCL specification
Rust programming language [optional]
Matias Lindgren: Rust and C++, a performance comparison (Rust implementations of the examples from Chapter 2)
Low-level programming techniques [advanced]
Intel Intrinsics Guide
GCC: x86 built-in functions
GCC: other built-in functions
Sean Eron Anderson: Bit Twiddling Hacks
More about this course
PPC statistics