About
DiscourseDownloader is a tool written in C++ that downloads any Discourse web forum through Discourse's built-in API. It can download all public posts, topics, and user profiles.
It starts by downloading all content in JSON format so that the archived data is as complete as possible. After all content is downloaded, this data is sent through the HTML builder, which creates a ready-to-view website containing all archived content. All topics are found in their original categories, all users are listed in the directory, and all posts are preserved. Some other miscellaneous site information is also downloaded, allowing the resulting website to be as close to complete as possible.
Additionally, the HTML archive can be rebuilt at any time from the original JSON download - allowing for future customizations and improvements to take place, even if the original forum is no longer online (or if you just don't want to re-download everything).
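To illustrate why keeping the raw download separate from the HTML output makes rebuilding cheap, here is a minimal sketch of a build step that turns already-archived data into a page. It is not DiscourseDownloader's actual code: the `Post` and `Topic` structs, their fields, and the `escapeHtml` and `renderTopic` functions are hypothetical names invented for this example, and the post body is assumed to be plain text that has already been loaded from the JSON download.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory form of archived data (assumed already parsed
// from the JSON download). Field names are illustrative only.
struct Post {
    std::string username;
    std::string body;  // assumed plain text in this sketch
};

struct Topic {
    std::string title;
    std::vector<Post> posts;
};

// Escape the characters that have special meaning in HTML.
std::string escapeHtml(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;
        }
    }
    return out;
}

// Render one topic as a standalone HTML page. This only reads the
// archived data, so it can be re-run whenever the page layout changes
// without contacting the original forum.
std::string renderTopic(const Topic& topic) {
    std::string html = "<!DOCTYPE html>\n<html><head><title>";
    html += escapeHtml(topic.title);
    html += "</title></head><body>\n";
    html += "<h1>" + escapeHtml(topic.title) + "</h1>\n";
    for (const Post& post : topic.posts) {
        html += "<article><h2>" + escapeHtml(post.username) + "</h2>";
        html += "<p>" + escapeHtml(post.body) + "</p></article>\n";
    }
    html += "</body></html>\n";
    return html;
}
```

Calling `renderTopic` on a `Topic` returns the page as a string, which a caller could then write to disk. Because the function depends only on the stored data, changing the markup it emits and running it again is enough to regenerate the archive.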
History
DiscourseDownloader was first created in March of 2023, shortly after the announcement that the Halo Waypoint forums would be closing down. I have a great love for archival and preservation, so when I heard that this forum, with nearly half a million topics, would simply go away in under a month, I felt I had to do something.
When it became clear that the usual options of WinHTTrack and wget weren't going to cut it, I looked further. I learned that the new Waypoint forums were using Discourse, a relatively new forum software. I also found out that someone had made a Python script that leveraged Discourse's API to download all forum content.
Unfortunately, that script turned out to be fairly barebones and somewhat buggy - it certainly wasn't up to the task of downloading that much content. So, I created this as a replacement.