
Now slightly less dumb

One of the lower-hanging fruit in the CMS powering this site was that, for each edited, new or deleted post, it would reload the entire thing from the database. (To keep load times to a minimum, it keeps the HTMLized state of all the posts entirely in memory – essentially frying each page, but entirely from the in-memory working set rather than from the database, on every page load.)
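Roughly, and with made-up names – Post, PostIndex and the db calls are all stand-ins for illustration, not the real code – the shape of it is something like this Python sketch:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    id: int
    published_at: datetime
    published: bool
    html: str  # the HTMLized state, kept entirely in memory


class PostIndex:
    """The everything-about-all-posts object: ordered lists, lookups, feeds."""

    def __init__(self, posts):
        self.all_posts = sorted(posts, key=lambda p: p.published_at)
        self.published_posts = [p for p in self.all_posts if p.published]
        self.by_id = {p.id: p for p in self.all_posts}
        # ...the pre-baked JSON and Atom feeds would be built here too


def rebuild_everything(db):
    # The old behavior: any create, edit or delete threw the whole index away
    # and rebuilt it from the database.
    return PostIndex([db.load_post(pid) for pid in db.all_post_ids()])
```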

I wanted to reload only the changing parts, but between having two separate "tracks" of "all posts in chronological order" – depending on whether you only see the published posts, or you're me and see everything – there were enough complications to make me skip it before. But tonight I tackled it and quickly realized that, by limiting the problem to the only changes that really happen – one post being created, changed or deleted – all that was needed was to make a copy of the list of all post information, find the post's next/prev neighbors in each of the all/only-published tracks, reload those together with the post itself, splice the reloaded information into the list copy, and create a new instance of the everything-about-all-posts object (which does all the lookups and even has the pre-baked JSON and Atom feeds) from the new list.
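Continuing the sketch from above, with the same stand-in names and somewhat simplified, the incremental version looks roughly like this:

```python
def neighbor_ids(ordered_posts, post_id):
    """The next/prev neighbors of post_id within one ordered track, if any."""
    ids = [p.id for p in ordered_posts]
    if post_id not in ids:
        return set()  # e.g. a brand-new post, or an unpublished post in the published track
    i = ids.index(post_id)
    out = set()
    if i > 0:
        out.add(ids[i - 1])
    if i + 1 < len(ids):
        out.add(ids[i + 1])
    return out


def incremental_update(index, db, post_id, deleted=False):
    # The post plus its neighbors in both tracks (everything / published-only).
    affected = ({post_id}
                | neighbor_ids(index.all_posts, post_id)
                | neighbor_ids(index.published_posts, post_id))
    # Copy the list of post information and splice in freshly reloaded posts.
    posts = {p.id: p for p in index.all_posts}
    if deleted:
        posts.pop(post_id, None)
        affected.discard(post_id)
    for pid in affected:
        posts[pid] = db.load_post(pid)
    # A new everything-about-all-posts object, built from the updated list.
    return PostIndex(list(posts.values()))
```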

It now occurs to me that this way of doing it is in fact a bit more dumb because it introduces a flaw. If a post's published date is changed such that it is ordered differently, it will have a different set of neighbors before and after being saved. Reloading everything from scratch each time avoided this corner case. I suppose it's back to the drawing board to fix it – but not before documenting it, because I do find it interesting to write about these things. This site is about opinions and thinking, which often turns into criticizing or critiquing other people and the fruits of their labor. I don't talk much about my own work, so it's only fair to talk about something.

(Update: I fixed it by finding the next/prev neighbors again on the new state, after all the work is done, and comparing the ordered sequence of neighbors before and after. If there are differences, all of the new neighbors get reloaded too – so between the two passes, every post that could possibly have been affected will have been reloaded. This is skipped when a post is deleted, because there's no "after" state that can conflict. It is a bit inefficient, since it does all the work of creating the everything-about-all-posts object a second time, but it only happens on a tiny fraction of all edits and takes very little time and CPU in the first place, so avoiding it isn't worth complicating things with tricky, bug-prone code written only for this rare case.)
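In terms of the sketch above, the fix looks roughly like this (with the before/after neighbor comparison simplified to sets):

```python
def incremental_update_fixed(index, db, post_id, deleted=False):
    # Neighbors in the old state, before anything is reloaded.
    before = (neighbor_ids(index.all_posts, post_id)
              | neighbor_ids(index.published_posts, post_id))
    new_index = incremental_update(index, db, post_id, deleted)
    if deleted:
        return new_index  # no "after" position that can conflict

    # Neighbors in the new state; if the post moved in the ordering,
    # these are different posts and they need reloading too.
    after = (neighbor_ids(new_index.all_posts, post_id)
             | neighbor_ids(new_index.published_posts, post_id))
    if after != before:
        posts = {p.id: p for p in new_index.all_posts}
        for pid in after:
            posts[pid] = db.load_post(pid)
        new_index = PostIndex(list(posts.values()))  # second, rare rebuild
    return new_index
```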

Another note: why this constant frying (computing the page on demand)? Why not just bake the posts (into static HTML files)? There are two reasons – one is that it would complicate templating slightly, and the other is that I want the pages to have the editing UI for me but be "zero cost" to everyone else. That would mean either some sort of middleware trap to selectively serve something else, at which point I'm already taking a performance hit and might as well serve everything dynamically, or including a piece of JavaScript for everyone in the baked page that checks for something and makes the UI show up – and I don't want the page to incur that cost for most of its readers.
