Please see the grading tool for details on benchmark instances and time limits.

### IS4: fast CPU solution

Using all resources that you have in the CPU, solve the task **as fast as possible**. You are encouraged to exploit instruction-level parallelism, multithreading, and vector instructions whenever possible, and also to optimize the memory access pattern.

You are expected to use an algorithm that tries out **all possible locations** 0 ≤ y0 < y1 ≤ ny and 0 ≤ x0 < x1 ≤ nx for the rectangle and finds the best one. However, for each candidate location you should only perform O(1) operations to evaluate how good this position is. To achieve this, some preprocessing will be needed.

Please do all arithmetic with **double-precision** floating point numbers.

Deadline week 4. Maximum points 5 + 1, after deadline 3 + 1. Participates in the contest.

### IS6a: fast CPU solution for 1-bit images

In this task, the input is always a monochromatic image: each input pixel is either entirely **white** with RGB values (1,1,1) or entirely **black** with RGB values (0,0,0). Make your solution to IS4 faster by exploiting this property. It is now enough to find a solution for only one color channel, and you will also have much less trouble with rounding errors.

In this task, you are permitted to use **single-precision** floating point numbers.

Deadline week 6. Maximum points 5 + 1. Participates in the contest.

### IS6b: fast GPU solution for 1-bit images

Port your solution to IS6a to the **GPU**; again, make it run **as fast as possible**.

Deadline week 6. Maximum points 5 + 1. Participates in the contest.

### IS9a: better algorithm [challenging]

Design a more efficient algorithm that (at least in typical cases) does not need to try out all possible locations of the rectangle. Implement the algorithm efficiently on the CPU.

Deadline week 6. Maximum points 5.