Several factors affect a specific answer, including how to interpret your question. Do you want to store the results of all 20 passes, or overwrite the results in the same memory locations each time? That choice changes how much memory each multiplication requires. For an exact figure you would need to specify the processor type and its clock speed, but in general you can do a quick estimate.
Since you want to do 8-bit integer math (1 byte = 8 bits), you will be multiplying unsigned values from 0 to 255 together. Storing the answer requires 2 bytes, since 255 x 255 = 65025, which is too large for a single byte.
So each multiplication needs 4 bytes of memory at a minimum to store the two (1 byte) multiplier values and the (2 byte) answer.
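As a quick check of that byte count, here is a minimal Python sketch. The variable names are made up for illustration, and it assumes unsigned 8-bit operands as described above.

```python
# Largest value an unsigned 8-bit operand can hold
max_operand = 2**8 - 1            # 255

# Largest possible product of two such operands
max_product = max_operand ** 2    # 255 x 255

# Whole bytes needed to hold that product (round bits up to bytes)
result_bytes = (max_product.bit_length() + 7) // 8

# Two 1-byte operands plus the result
bytes_per_multiplication = 2 * 1 + result_bytes

print(max_product, result_bytes, bytes_per_multiplication)
```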
1 MB contains 1048576 bytes (2^20).
Divide this by 4 bytes and you get 262144, the number of multiplication-and-answer sets that fit in memory.
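The same division as a short Python sketch. The constant names are illustrative only, and 1 MB is taken as 2^20 bytes, as above.

```python
MEMORY_BYTES = 2**20              # 1 MB taken as 1048576 bytes
BYTES_PER_MULTIPLICATION = 4      # two 1-byte operands + one 2-byte answer

# How many operand/answer sets fit in the memory
multiplications = MEMORY_BYTES // BYTES_PER_MULTIPLICATION

print(multiplications)
```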
The next consideration is the number of clock cycles required to execute each CPU instruction. This can vary from 1 to 4 cycles depending on the type of instruction. A branch or memory transfer generally takes more clock cycles than triggering a math operation inside the CPU.
You have to transfer 4 bytes of data to/from memory, so assume 16 clock cycles (4 per byte). Memory access also needs pointer handling, so there is at least one addition per byte, which adds another 4 clock cycles.
So far that is 20 clock cycles per multiplication sequence.
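The cycle budget so far can be tallied like this. This is only a sketch of the assumptions stated above (4 cycles per byte moved, 1 cycle per pointer addition), not measured figures for any real processor, and the names are made up for illustration.

```python
BYTES_MOVED = 4                   # two operands in, 2-byte answer out
CYCLES_PER_BYTE_TRANSFER = 4      # assumed cost of moving one byte
CYCLES_PER_POINTER_ADD = 1        # assumed cost of one pointer addition per byte

transfer_cycles = BYTES_MOVED * CYCLES_PER_BYTE_TRANSFER
pointer_cycles = BYTES_MOVED * CYCLES_PER_POINTER_ADD

cycles_per_multiplication = transfer_cycles + pointer_cycles

print(transfer_cycles, pointer_cycles, cycles_per_multiplication)
```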
There are further delays to consider in memory transfer, such as latency, but this gives you a rough best-case time. The actual time will be longer once you account for additional instruction cycles and other delays.
Total number of CPU clock cycles for one pass is 20 * 262144 = 5242880
and you want to do this 20 times:
20 * 5242880 = 104857600
Assuming a 4 GHz clock (4,000,000,000 Hz):
104857600 / 4000000000 = 0.0262144 seconds
or about 26.2 milliseconds.
Faster than the blink of an eye.
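To try other clock speeds or cycle counts, the whole estimate can be wrapped in a small function. This is a rough sketch under the same assumptions as above. The function and parameter names are hypothetical, and the result is a best-case figure that ignores latency and other delays.

```python
def estimate_seconds(memory_bytes=2**20,
                     bytes_per_multiplication=4,
                     cycles_per_multiplication=20,
                     passes=20,
                     clock_hz=4_000_000_000):
    """Best-case time estimate; ignores latency, caching, and other overhead."""
    multiplications = memory_bytes // bytes_per_multiplication
    total_cycles = multiplications * cycles_per_multiplication * passes
    return total_cycles / clock_hz


seconds = estimate_seconds()
print(f"{seconds} s, or {seconds * 1000:.1f} ms")

# Same workload on an assumed 1 GHz clock takes four times as long
slower = estimate_seconds(clock_hz=1_000_000_000)
print(f"{slower * 1000:.1f} ms")
```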