A Static Future

The magic of compile-time workflows

As static sites enjoy an incredible resurgence in popularity, I've seen a lot of misconceptions around exactly what tools like Gatsby are capable of.

Specifically, I've heard from some friends who liked the idea of using Gatsby, but worried that their project was "too dynamic". Sometimes this meant that they expected to need a database, and other times they were concerned that their site was too interactive.

After building a handful of Gatsby projects, I feel more convinced than ever that Gatsby can do just about anything. And even more than that, I think static-driven approaches are the future for dynamic, interactive, data-driven websites and web applications.

Today I'd like to talk about it.

Defining “Static”

The term “static” can be a little overloaded, and occasionally a little misleading. Here's how I'd define it:

A static website is a website where the initial HTML is prepared ahead of time, not dynamically generated by a server on request.

When you make a request to this website, for example, Netlify serves pre-generated HTML to you. I don't have a Node server dynamically rendering HTML documents on-the-fly.

This might sound very limiting to you, but my hope today is to convince you that actually, this is a have-your-🍰-and-eat-it-too situation. The benefits of static generation don't have to come at the cost of rich, dynamic applications!

It's still React

Gatsby and Next.js are both wrappers around React. Anything you can do with React can be done with Gatsby or Next.

For example, last year I built a generative art tool called Tinkersynth:

Tinkersynth is a tool that lets users create unique generative art by tweaking whimsical controls. As you might expect, there's a lot of client-side JS involved. I'm even taking advantage of specialized browser APIs; calculations are done in a web worker and painted to an OffscreenCanvas.

I initially created this project with create-react-app, and migrated it to Gatsby because I wanted to improve its load performance without spending too much time going down that rabbit hole.

What does it mean to statically build a project like this? Well, here's what this page's initial HTML looks like:

Screenshot of a browser loading a page, with visible header + footer and a whimsical loading spinner

This is in contrast to the initial HTML when using create-react-app (or any other client-side-based approach):

Screenshot of a browser loading a page, with an all-blank screen

In both cases, the client receives and displays an initial HTML document while it fetches, parses, and executes our JS bundle with our React application. The difference is that our initial HTML document in a Gatsby app is much more lively; it has a full layout, and a cute custom loading spinner to indicate that stuff's happening. The perceived performance is much better; the site feels faster because we've given them something to look at while we do the work needed to let them create art.

You can still pull data from a database

What about for applications that depend on data that lives in a database?

As an example, let's say we're building an e-commerce store that sells novelty products, like this Hello Kitty® coffee machine:

Amazingly, this is a real thing which actually exists!

In a typical server-rendered application, our request might go on a journey like this:

  1. The client makes a request to /shop/hello-kitty-coffee-machine
  2. The server queries the CMS or database to retrieve info about the requested item.
  3. The server generates an HTML file for the coffee machine, highlighting all the relevant details (name, price, cat sounds it makes while percolating…).
  4. The server sends that HTML file to the client.

How would we replicate this in a static Gatsby site? We have two options.

The first option is probably the most common, and it involves a static web host for the initial content, and then some sort of API or app server:

  1. The client makes a request to /shop/hello-kitty-coffee-machine
  2. The web server immediately returns a generic HTML file. It doesn't contain any information about cats or coffee machines. Probably has some spinners.
  3. After parsing that HTML, the client makes a fetch call to the app server, to request information about the specific product
  4. The app server queries the CMS or database, and sends just the data back as JSON
  5. The client uses that JSON to update the view

This is a trade-off. On the one hand, we can paint a UI much more quickly, and allow the user to access general information. On the other, specific product info will take a bit longer to show up.
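Here's a rough sketch of that five-step flow in TypeScript. It's only an illustration: the `/api/products/` endpoint, the `ProductJson` shape, and the injected `fetchJson` helper are all assumptions I made up for the example, not anything Gatsby provides.

```typescript
// Hypothetical JSON shape; the real fields depend on your CMS or database.
interface ProductJson {
  name: string;
  priceInCents: number;
}

// The view is either still waiting on the app server, or it has the data.
type ViewState =
  | { status: "loading" }
  | { status: "loaded"; product: ProductJson };

// Steps 2 and 5: turn the current state into markup.
// The "loading" branch is the generic page with spinners.
function renderView(state: ViewState): string {
  if (state.status === "loading") {
    return '<main><div class="spinner">Loading</div></main>';
  }
  const price = (state.product.priceInCents / 100).toFixed(2);
  return "<main><h1>" + state.product.name + "</h1><p>$" + price + "</p></main>";
}

// Steps 3 to 5: ask the app server for JSON, then re-render.
// `fetchJson` is injected so the sketch doesn't depend on a browser;
// in a real page it might wrap `fetch(url).then((res) => res.json())`.
function loadProduct(
  slug: string,
  fetchJson: (url: string) => Promise<ProductJson>,
  onRender: (html: string) => void
): Promise<void> {
  onRender(renderView({ status: "loading" }));
  return fetchJson("/api/products/" + slug).then((product) => {
    onRender(renderView({ status: "loaded", product: product }));
  });
}
```

Notice that the user sees the spinner markup first, and the product details only after the round trip to the app server completes.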

What if we didn't have to make this trade-off, though? What if we could get the fully-formed HTML document served from the web server?

Mindset shift required

When developing applications, we tend to think of everything in runtime terms. We have a "template" that can be populated with data from a database, and this process happens in a real-time, made-to-order way.

What if we could build everything at compile-time, though? What if we already had all the HTML files we needed, for every shop item?

Here's what that would look like:

  1. The client makes a request to /shop/hello-kitty-coffee-machine
  2. The web server immediately returns a complete HTML file, with all of the cat coffee machine info.

2 steps! No database lookups, no RPCs at all. We could serve rich, data-driven e-commerce platforms at the same speed as a static "hello world" site. And because this is pure HTML, it can be cached at the edge. Forget blazing fast, we're nearing blistering fast territory.

Compile-time data-fetching

To be clear: the data is still coming from a database/CMS. We still need to do the work of fetching data and generating HTML. But we can do that at compile-time, when we build our site.

When you run gatsby build (or whatever your command is to build your site for production), the build system will make all the API calls needed to fetch the data necessary to generate every possible HTML page. It'll use React server-rendering APIs to turn a tree of React components into a big HTML document.
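As a mental model (and definitely not Gatsby's actual internals), the build step boils down to something like the sketch below. The `ShopItem` shape, the `fetchAllShopItems` stand-in, the sample data, and the `/shop/` path scheme are all assumptions for illustration. A real build would render React components instead of concatenating strings.

```typescript
interface ShopItem {
  slug: string;
  name: string;
  priceInCents: number;
}

// Stand-in for the API calls made at build time. The data is made up;
// in a real project it would come from your CMS or database.
function fetchAllShopItems(): ShopItem[] {
  return [
    { slug: "hello-kitty-coffee-machine", name: "Hello Kitty Coffee Machine", priceInCents: 4999 },
    { slug: "banana-phone", name: "Banana Phone", priceInCents: 1999 },
  ];
}

// The "template". A real build would server-render React components here.
function renderShopItemPage(item: ShopItem): string {
  const price = (item.priceInCents / 100).toFixed(2);
  return "<html><body><h1>" + item.name + "</h1><p>$" + price + "</p></body></html>";
}

// Runs once per build: produce one complete HTML document per URL path.
function buildSite(items: ShopItem[]): Record<string, string> {
  const pages: Record<string, string> = {};
  for (const item of items) {
    pages["/shop/" + item.slug] = renderShopItemPage(item);
  }
  return pages;
}

// At request time the host only does a lookup. No database is involved.
function serveStatic(pages: Record<string, string>, path: string): string | undefined {
  return pages[path];
}
```

All of the expensive work happens inside `buildSite`. By the time a request arrives, `serveStatic` has nothing left to compute.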

With Gatsby, there is a rich ecosystem of plugins dedicated to this strategy, called source plugins. Each plugin can ingest data from a different source, and make it available to your components during the build process.

There are many, many source plugins; if you consume data from somewhere, there's probably a source plugin for it!
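To give a flavor of what wiring one up looks like, here's a minimal sketch of a Gatsby config that registers a single source plugin. I've picked `gatsby-source-contentful` purely as an example, and the credential values are placeholders. Check the plugin's README for the options your version expects.

```typescript
// gatsby-config.ts (a sketch; Contentful is just an example source)
const config = {
  plugins: [
    {
      resolve: "gatsby-source-contentful",
      options: {
        // Placeholder values. In a real project these would typically
        // come from environment variables rather than being hard-coded.
        spaceId: "your-space-id",
        accessToken: "your-access-token",
      },
    },
  ],
};

export default config;
```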

When you use a CI/CD service like Gatsby Cloud or Netlify, all of this work will happen in the cloud, whenever you push code to GitHub. Your changes will ship automatically once the build completes.

This can't possibly work… can it?

At first blush, this idea sounds absolutely ridiculous. What if you have 50,000 store items? What if you have a million? Are you proposing making a million API requests every time you build?

As things stand right now, there are indeed practical limits on how far we can push this idea. But the future is coming at us fast.

The Gatsby team has been hard at work on incremental builds. The idea with incremental builds is that you shouldn't need to do a "full" rebuild when your content or your code changes.

Let's say that our Hello Kitty coffee maker isn't selling so well. We want to offer a 20% discount for it. So we log into our CMS and enable a sale price. In an "incremental builds" world, the rebuild would take a few seconds, because we only need to rebuild a single page. We would stream these updates to our static web host, and they'd start serving the new HTML files near-instantaneously.

Code changes work the same way. If we tweak the code in a React component, we should only have to rebuild the pages that require that component to render.

Full, uncached rebuilds will take a while. If you change the header component, and that header component is shown on every page in the site, you will have to give it some time. But such updates should be rare, and I expect tooling and systems will only get better and faster. The problems in this space are a lot more solvable than we might think.
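To make the idea concrete, here's a toy sketch of the bookkeeping an incremental build needs. This is not how Gatsby implements it. The `PageRecord` and `Change` types, the IDs, and the component names are all hypothetical. The point is just that if each page remembers what it was built from, a change maps to a small set of pages.

```typescript
// Each page remembers what it was built from.
interface PageRecord {
  path: string;
  dataIds: string[]; // CMS or database records the page reads
  components: string[]; // React components the page renders
}

type Change =
  | { kind: "data"; id: string }
  | { kind: "component"; name: string };

// Return only the paths of pages whose inputs include the thing that changed.
function pagesToRebuild(pages: PageRecord[], change: Change): string[] {
  return pages
    .filter((page) =>
      change.kind === "data"
        ? page.dataIds.indexOf(change.id) !== -1
        : page.components.indexOf(change.name) !== -1
    )
    .map((page) => page.path);
}

const sitePages: PageRecord[] = [
  { path: "/shop/hello-kitty-coffee-machine", dataIds: ["product-1"], components: ["Header", "ProductDetails"] },
  { path: "/shop/banana-phone", dataIds: ["product-2"], components: ["Header", "ProductDetails"] },
  { path: "/about", dataIds: [], components: ["Header"] },
];

// A sale price on one product touches a single page.
const afterPriceChange = pagesToRebuild(sitePages, { kind: "data", id: "product-1" });

// Tweaking the shared header touches every page.
const afterHeaderChange = pagesToRebuild(sitePages, { kind: "component", name: "Header" });
```

With this sample data, `afterPriceChange` contains only the coffee machine's path, while `afterHeaderChange` contains all three.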

Qualitatively different

When we think about build performance—the amount of time it takes to build a new copy of the site—we often think in terms of incremental improvement. If my build can go from 10 minutes to 5 minutes, that's a nice quality-of-life boost, but ultimately it's not a qualitatively different experience.

With incremental builds, it's a different ball game. If we can push changes to large sites in a few seconds, it unlocks new doors in terms of possible architectures and workflows.

Fellow Gatsby team member Max Stoiber used to work on a tool called Spectrum, which is a community platform that mixes features from Twitter, Reddit, and Slack. It's exactly the kind of intensely-dynamic application that seems inconceivable to build statically… and yet, we're approaching the point at which such a thing is feasible. As Max puts it:

If build + deployment of new content is fast and stable enough (< 1s?) new content is available super quickly everywhere around the world, no matter where the db is—it’s just a bunch of static HTML files after all. In theory, we could’ve built Spectrum with that architecture and made it much more scalable, performant and stable! 🤯🤯🤯

A treasure trove of benefits

When we talk about static sites, the focus is usually on performance. This is for good reason, since a site that loads near-instantaneously with dynamic data is very compelling! But there's a lot more treasure in this chest.

Resilience

Let's imagine a scenario. You've been building a SaaS product for a year, and you've just launched on Product Hunt. Your product skyrockets to the #1 spot! The team is celebrating with a round of sparkling soda, congratulating each other on the launch, when someone's cellphone starts buzzing ominously. Then another. And then another.

Turns out, your servers couldn't handle the load, and PagerDuty is firing on all cylinders. At this critical moment, when the attention of the internet is on your product, the website is serving 500s to everyone. Inquisitive potential customers aren't able to reach your landing page. This is a disaster!

Scaling is a freaking hard problem, but it gets a heck of a lot easier if you're serving precompiled HTML files. They can be distributed across a thousand mirrors with services like Fastly or Cloudflare. If one of them goes down, it'll gracefully fall back to another one.

The idea that users can access dynamic data without a made-to-order database request is wild. Think about the weights that it takes off our shoulders:

  • If the database falls over, and the team can't figure out why, it's not an emergency because the site is still up and chugging along.
  • If a developer deploys a change that messes up the GraphQL API, you won't be able to build a new version of the site, but the current version has already made all the requests it needs to, so it isn't affected.

Your database servers could be teleported to an alternate dimension by an evil wizard, and your users wouldn't even know that there's a problem.

Sam covers this well in his post, Why teams using Gatsby sleep better at night.

SEO

Search engines like websites that load quickly. Fast content leads to better Google ranking on mobile devices.

You also conveniently side-step another inconvenient problem: if the page initially loads with a bunch of spinners, what if Google's robots crawl the page too soon? Google might miss all the juicy keywords that your API call injects, depriving these pages of even being eligible to rank!

For sites that depend on organic traffic, static builds can make an enormous difference.

Costs and Scaling

If a page gets 100 visits a day, with each visit triggering a database query, and you instead rebuild that page 10 times a day, you've effectively cut your database queries by 90%. This can lead to significant cost savings!
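The arithmetic is simple enough to write down. This little helper is just my own illustration, and it assumes exactly one database query per visit (in the runtime model) or per build (in the static model):

```typescript
// Percentage of database queries avoided by querying once per build
// instead of once per visit. Assumes one query per visit or per build.
function querySavingsPercent(visitsPerDay: number, buildsPerDay: number): number {
  if (visitsPerDay <= 0) {
    return 0; // no traffic means there is nothing to save
  }
  return ((visitsPerDay - buildsPerDay) * 100) / visitsPerDay;
}

const savings = querySavingsPercent(100, 10); // 90
```

The more traffic a page gets relative to how often its data changes, the closer this number gets to 100.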

The generated HTML pages can be cached, and serving cached content tends to be very cheap.

Limitations

There are some things that simply cannot be pre-generated. For example, content on the page specific to the currently-logged-in user, like their avatar or user name.

For pages consisting mostly of user-specific data, like an analytics dashboard, we probably don't want to do this (especially if that data is private/sensitive). But for pages with a bit of user-specific data—say, like a shopping cart on an e-commerce site—we can still take advantage of pre-rendering!

Specifically, we would render the bulk of the page during our compile, including filling in dynamic information about the product. Then, we would do a client-side fetch to request information from an API to fill in pockets of missing data, like the user's shopping cart contents. We could also rely on localStorage to populate this data more quickly.
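Here's a small sketch of the localStorage half of that idea. The storage key, the `CartItem` shape, and the helper names are all made up for the example. `KeyValueStorage` is a minimal slice of the browser's Web Storage interface, so in a browser you could pass `window.localStorage` directly.

```typescript
// Minimal slice of the Web Storage API, so the sketch runs outside a browser.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface CartItem {
  slug: string;
  quantity: number;
}

const CART_KEY = "cart"; // hypothetical storage key

// Read the last-known cart so the pre-rendered page can fill its cart
// "pocket" right away, before any network response arrives.
function readCachedCart(storage: KeyValueStorage): CartItem[] {
  const raw = storage.getItem(CART_KEY);
  if (raw === null) {
    return [];
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as CartItem[]) : [];
  } catch {
    return []; // corrupted value: treat it as an empty cart
  }
}

// Once the API responds with the real cart, store it for the next page load.
function saveCart(storage: KeyValueStorage, items: CartItem[]): void {
  storage.setItem(CART_KEY, JSON.stringify(items));
}

// The number shown on the cart badge.
function cartItemCount(items: CartItem[]): number {
  return items.reduce((sum, item) => sum + item.quantity, 0);
}
```

On page load you'd render the badge from `readCachedCart`, kick off the client-side fetch, and then call `saveCart` with whatever the API returns. The cached value is only a fast first guess, so the server's response should always win.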

A beanstalk

The idea of statically rendering dynamic, database-driven web applications is equal parts mind-blowing and head-scratching to me. It's a funky idea with a lot of potential, but it's still very early.

We're starting to see some rather large sites adopt this strategy, and as technology continues to unlock new potential use cases, it will be interesting to see how far this goes! I suspect this radical idea will be somewhat mainstream soon enough.
