I rebuilt almost the entire engine. It is much cleaner, with less code, better organization, and no copy-pasted code: the same routine now handles all chunk generation and scheduling. Multithreading works a lot better too. Rather than redundantly processing chunk edges, the threads now synchronize their chunk processing. Not coincidentally, I seem to have completely eliminated the crashes I used to encounter every so often. (Word to the wise, at least from my experience: do not try to use C++ vectors inside a thread, even if each one is exclusive to its own thread.) I also implemented better mipmap generation and handling. Overall, rendering is 2-10 times faster depending on settings, and chunk generation is (as promised!) over 100 times faster. There is still much room for improvement, particularly in choosing which visible chunks to render, but for now I am very happy with performance and probably won't optimize it much more for the time being.
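To give a rough idea of what "synchronized" chunk processing can look like, here is a minimal sketch. It is not the engine's actual code: the names `ChunkScheduler`, `generateChunk`, `workerLoop`, and `generateAll`, the chunk count, and the atomic-counter approach itself are all assumptions made for illustration. Each worker claims the next unprocessed chunk from a shared counter, so no chunk is ever handled by two threads.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

constexpr std::size_t kChunkCount = 64; // hypothetical number of chunks
constexpr std::size_t kMaxWorkers = 8;  // hypothetical upper bound on threads

struct ChunkScheduler {
    std::atomic<std::size_t> next{0};                       // next unclaimed chunk index
    std::array<std::atomic<int>, kChunkCount> timesBuilt{}; // how often each chunk was built
};

// Stand-in for real terrain generation (hypothetical): just records the visit.
void generateChunk(ChunkScheduler& s, std::size_t index) {
    s.timesBuilt[index].fetch_add(1, std::memory_order_relaxed);
}

// Each worker atomically claims one chunk index at a time until none remain.
void workerLoop(ChunkScheduler& s) {
    for (;;) {
        std::size_t i = s.next.fetch_add(1, std::memory_order_relaxed);
        if (i >= kChunkCount) return;
        generateChunk(s, i);
    }
}

// Runs the same routine on every worker thread and waits for all of them.
void generateAll(ChunkScheduler& s, std::size_t workers) {
    assert(workers >= 1 && workers <= kMaxWorkers);
    s.next.store(0);
    for (auto& count : s.timesBuilt) count.store(0);

    std::array<std::thread, kMaxWorkers> pool;
    for (std::size_t w = 0; w < workers; ++w)
        pool[w] = std::thread([&s] { workerLoop(s); });
    for (std::size_t w = 0; w < workers; ++w)
        pool[w].join(); // join makes the workers' writes visible to the caller
}
```

Calling `generateAll(scheduler, 4)` builds every chunk exactly once, regardless of how the threads interleave. A real engine would also have to coordinate access to neighbouring chunks, which this sketch leaves out.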
I wrote my own GUI and text renderer from the ground up. It renders without shader state changes, so everything can be compiled into a single display list. It is generated almost entirely on the GPU (the text glyphs are read from a bitmap, though). It intentionally uses uniform spacing and character widths (all part of the old-school look :D). It is still in the early stages, but here is a screenshot (characters have backgrounds just for the hell of it, but these can easily be disabled).
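The nice thing about uniform character widths is that finding a glyph in the font bitmap is pure arithmetic. Here is a minimal sketch of that lookup, written on the CPU for clarity even though the engine does most of this on the GPU. The 16x16 ASCII grid, the 8x8 pixel cells, and the names `GlyphQuad` and `glyphQuad` are assumptions for illustration, not the engine's actual layout.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr int kCols = 16;      // assumed: glyphs per row in the font bitmap
constexpr int kRows = 16;      // assumed: rows of glyphs in the font bitmap
constexpr float kCellW = 8.0f; // assumed: on-screen glyph width in pixels
constexpr float kCellH = 8.0f; // assumed: on-screen glyph height in pixels

struct GlyphQuad {
    float x, y;           // top-left screen position in pixels
    float w, h;           // on-screen size in pixels
    float u0, v0, u1, v1; // texture coordinates into the font bitmap
};

// Computes where character i of `text` goes on screen and which cell of the
// bitmap it samples. Every character advances by exactly one cell width.
GlyphQuad glyphQuad(const std::string& text, std::size_t i,
                    float originX, float originY) {
    assert(i < text.size());
    const unsigned char code = static_cast<unsigned char>(text[i]);
    const int col = code % kCols; // column of the glyph in the grid
    const int row = code / kCols; // row of the glyph in the grid

    GlyphQuad q;
    q.x = originX + static_cast<float>(i) * kCellW;
    q.y = originY;
    q.w = kCellW;
    q.h = kCellH;
    q.u0 = static_cast<float>(col) / kCols;
    q.v0 = static_cast<float>(row) / kRows;
    q.u1 = static_cast<float>(col + 1) / kCols;
    q.v1 = static_cast<float>(row + 1) / kRows;
    return q;
}
```

Because nothing here depends on per-glyph metrics, the same calculation fits easily into a shader, which is one reason fixed-width fonts suit this kind of renderer.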
Overall, I am very happy with the progress made. The engine feels much cleaner, faster, and more stable.