So it turns out that there’s a much better rasterization algorithm. After being appalled at how slow my algorithm was (okay it’s not that slow, but it’s pretty much unworkable), I researched better algorithms on the internet. Turns out, the way modern GPU’s do their rasterization is with the Scanline Rasterization Algorithm. Basically what this does is go line by line through the triangle, figure out the leftmost and rightmost points that are within the triangle, and then renders them and all the pixels in between. This is pretty great because it means that for every line that the polygon operates on, we really only have to do two calculations. However, before we can do that, we need to rearrange our triangle first.
The real genius behind this, is that rasterization is dead simple for aligned rectangles. We just figure out four points and render everything in between. It’s also pretty easy for triangle that have a horizontal axis. Luckily, we can turn every triangle into one with a horizontal axis, or rather, two triangles that both have horizontal axis. Visually what we do is take this triangle:
And turn it into these triangles:
At this point, it’s trivially easy to find out the minimum x value for each line, and the maximum x value for each line, and render everything in between. I don’t have any way of getting specific framerate measurements, but what I can do is render something in the background, to measure the difference between the two algorithms.
I put a random pattern that scrolls up the page. There is no framerate limiter, so the faster the background scrolls, the faster each frame must be getting executed. Admittedly eyballing it, but it looks like the program is running at more than an order of magnitude faster, and the code behind it is shorter and simpler to boot.
I ran Visual Studio’s performance profiler on the code. I just rendered the polygon both ways, and it turns out that the RenderPolygon function takes 926ms of CPU time (over the entire program running time), but the RenderPolygonFast takes a grand total of 28ms of CPU time. It’s not just 10x faster, it’s actually 33x faster. Actually, the function I have that just renders the squares in the background actually takes about 10x longer to execute then rendering this polygon, which is very slightly worrying for the future, as the CPU seems to be fairly slow at simply putting pixels on the screen.