Hey folks! It's been quite a while since my last email. A few interesting things have happened: I gave a talk on the large migration we did at work last year, attended a wedding, and took a satisfyingly long road trip to Goa.
But, since then, I've been mostly busy at work and with Dumbledore's Army, so I haven't had the chance to write as much online (which has had its own upsides and downsides). If you want to chat or reach out, feel free to reply to this email :D. Anyway, on to the main show:
So, we recently migrated a bunch of our documentation, which was spread across 2 websites built in WordPress and GitBooks. To bring the data onto our own site, we decided to store the static content as MDX, so that it stays user-editable, and to use the static-site generation capabilities of NextJS to host the website in a less resource-intensive way.
The basic conversion process followed the usual ETL pattern:
- Extract - The old data was fetched from WordPress and GitBooks.
- Transform - The data was batch-converted into MDX, and the WordPress components and Liquid tags were converted to MDX components.
- Load - A sitemap had to be generated from the data and parsed, both to load the pages into the static-site generator and to generate things like the sidebar. The data was loaded via GitHub APIs into NextJS, with a revalidation time of 1 hour, which keeps the data auto-updated.
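To make the Transform and Load steps a bit more concrete, here is a rough sketch in TypeScript. It is not the code we actually shipped: the `Hint` component, the `SitemapEntry` shape, and both function names are made up for illustration, and the only Liquid tag handled is the GitBook-style `{% hint %}` block.

```typescript
// Transform: rewrite a GitBook-style hint block into a JSX tag.
// `Hint` is a hypothetical MDX component name used only for this sketch.
function liquidHintsToMdx(source: string): string {
  const hintTag =
    /\{%\s*hint\s+style="(\w+)"\s*%\}([\s\S]*?)\{%\s*endhint\s*%\}/g;
  return source.replace(hintTag, '<Hint style="$1">$2</Hint>');
}

// Load: one sitemap entry per page. This shape is an assumption;
// the real sitemap format is not described in this post.
interface SitemapEntry {
  path: string; // e.g. "guides/install"
  title: string;
}

interface SidebarNode {
  slug: string;
  title: string;
  href?: string; // only set when the node is a real page
  children: SidebarNode[];
}

// Fold the flat list of page paths into a nested sidebar tree.
function buildSidebar(entries: SitemapEntry[]): SidebarNode[] {
  const roots: SidebarNode[] = [];
  for (const entry of entries) {
    const segments = entry.path.split("/").filter(Boolean);
    let level = roots;
    for (let i = 0; i < segments.length; i++) {
      const slug = segments[i] as string;
      let node = level.find((n) => n.slug === slug);
      if (!node) {
        // Placeholder folder node; its title is replaced if a page exists here.
        node = { slug, title: slug, children: [] };
        level.push(node);
      }
      if (i === segments.length - 1) {
        node.title = entry.title;
        node.href = "/" + segments.join("/");
      }
      level = node.children;
    }
  }
  return roots;
}
```

On the NextJS side, an hourly refresh like the one described is typically expressed by returning `revalidate: 3600` from `getStaticProps` (assuming the Pages Router is in use).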
Bonus: We also used Algolia to store the search index, and used GitHub Actions to keep that index updated whenever the MDX files change.
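As a hedged illustration of the indexing side, the sketch below turns changed MDX files into search records. The record fields other than `objectID` (which Algolia uses to identify a record), the function names, and the idea of passing in a path-to-source map are all assumptions for this example; the actual upload through the Algolia client and the GitHub Actions workflow are left out.

```typescript
// Hypothetical record shape; only `objectID` is required by Algolia.
interface SearchRecord {
  objectID: string; // here: the file path, so re-indexing overwrites the old record
  title: string;
  content: string;
}

// Reduce one MDX file to plain searchable text.
function toSearchRecord(filePath: string, mdx: string): SearchRecord {
  const titleMatch = mdx.match(/^#\s+(.+)$/m); // first level-1 heading, if any
  const content = mdx
    .replace(/<[^>]+>/g, " ") // drop JSX/HTML tags, keep their inner text
    .replace(/^#+\s+/gm, "") // drop heading markers
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
  return {
    objectID: filePath,
    title: titleMatch?.[1]?.trim() ?? filePath,
    content,
  };
}

// `changed` maps file path to file contents, e.g. as collected by a CI step.
function recordsForChangedFiles(changed: Record<string, string>): SearchRecord[] {
  return Object.entries(changed)
    .filter(([path]) => path.endsWith(".mdx"))
    .map(([path, source]) => toSearchRecord(path, source));
}
```

Using the file path as the `objectID` is just one option; it keeps updates idempotent, but a page that is renamed would need its old record deleted separately.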
For more details on any of the above steps, check out my talk at React Bangalore below.