How did we hardware-parallelise 3D (depth-buffered) graphics before we could stuff that hardware full of compute units?
How we assembled the different steps from multiply-adders is fairly straightforward, so I'll leave that to your imagination. Some early chips, like the one in the PlayStation, didn't include all the steps, as a cost-cutting measure.
The image data is split into tiles (whose pixels are ordered along a space-filling curve) to simplify its processing.
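
To make the tile layout concrete, here's a rough sketch of how a pixel coordinate could map to a memory address. I'm assuming 8x8 tiles stored row by row, with a Morton (Z-order) curve inside each tile. Actual hardware may use a different tile size or a different curve, and the names `TILE_SIZE`, `interleave_bits` and `pixel_address` are made up for illustration.

```python
TILE_SIZE = 8  # assumed tile edge length in pixels (must be a power of two)
TILE_BITS = TILE_SIZE.bit_length() - 1  # bits needed for a coordinate inside a tile


def interleave_bits(x: int, y: int, bits: int) -> int:
    """Morton (Z-order) index: bit i of x lands on bit 2i, bit i of y on bit 2i+1."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i)
        index |= ((y >> i) & 1) << (2 * i + 1)
    return index


def pixel_address(x: int, y: int, width: int) -> int:
    """Offset (in pixels) of (x, y) in an image stored tile by tile."""
    tiles_per_row = (width + TILE_SIZE - 1) // TILE_SIZE
    tile_x, local_x = divmod(x, TILE_SIZE)
    tile_y, local_y = divmod(y, TILE_SIZE)
    # Tiles themselves are laid out row by row (an assumption of this sketch).
    tile_index = tile_y * tiles_per_row + tile_x
    # Within a tile, pixels follow the Z-order curve.
    return tile_index * TILE_SIZE * TILE_SIZE + interleave_bits(local_x, local_y, TILE_BITS)
```

The point of a layout like this is that pixels which are close together on screen end up close together in memory, so a small triangle touches only a few contiguous chunks.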
1/?