Make sure that you use a fast selection algorithm to find the median. Do not sort; quickselect and other good selection algorithms are much faster than sorting, and there is already a good implementation in the C++ standard library, so you do not need to reinvent it.
Hints for MF9a
Here are some examples of possible techniques that you can use to solve task MF9a:
- Partition the image in smaller, partially overlapping blocks. For example, if your windows size is 21 × 21, a reasonable block size might be 50 × 50. Solve the median filter problem separately for each block; place the blocks so that each output pixel comes from exactly one block. Be careful with the boundaries.
- Within each block, sort the input data and replace the original values by their ordinals, breaking ties arbitrarily — for example, if the input data was (1.0, 1.0, 0.1, 0.9, 0.5, 0.2, 1.0), replace it with (4, 5, 0, 3, 2, 1, 6). Now you have a much simpler task: instead of finding a median of floating point numbers, you only need to find the median of small, distinct integers.
- You can now represent the sliding window as a bit vector: for example, if your block size is 50 × 50, then your sliding window can be represented as a bit vector with 2500 bits. Do not recompute the bit vector from scratch for every window position — if you move the window from (x,y) to (x+1,y), you only need to reset 2k+1 bits and set 2k+1 bits.
- Exploit bit-parallelism and the special CPU instructions that help with bit manipulation. To find the median from the bit vector efficiently, you may want to have a look at e.g. compiler intrinsics __builtin_popcountll (POPCNT instruction), __builtin_ctzll (TZCNT instruction), and _pdep_u64 (PDEP instruction). Some caching of pre-computed values may be helpful, too.
- Pay attention to the locality of memory references. For example, does it make more sense to slide the window from left to right or from top to bottom?
- You can use OpenMP to process multiple blocks in parallel.