Will Murphy's personal home page

Adding Comments Part 2: Post IDs

The first step of my plan to roll my own comment engine, is to make sure that each post has a UUID accessible to client-side scripts.

I could run something like this python snippet:

import uuid
print(str(uuid.uuid4()))

once for each post I have, and then paste the output into the front matter of each of my posts, but I want to see whether this can be easily automated using tools that Hugo gives me.

We can get part way there using an Archetype Template for Hugo, since that gives us some code that runs every time I run hugo new posts/some-weird-idea.md, but I doubt it supports UUIDs right out of the box. My doubt is strengthened when I find someone who is asking for this exactly but has not been answered since April 2018.

The .File.UniqueID built-in variable is a lot like what I’m looking for, but there are a couple things I don’t like about it. It’s defined as “the MD5-checksum of the content file’s path.” This seems to have the goal of always being the same for content at the same path, but that’s almost the opposite of what I want; I want something that will stay the same even if I rename a post, and if I change a post’s file path, the MD5 of the file path will also change. It would be cool if I could get .File.UniqueID once, and then persist.

It looks like the way to do that is to add it to archetypes/post.md:

diff --git a/archetypes/posts.md b/archetypes/posts.md
index 5ccf1c2..0bd8955 100644
--- a/archetypes/posts.md
+++ b/archetypes/posts.md
@@ -3,5 +3,7 @@ title: "{{ replace .Name "-" " " | title }}"
 date: {{ .Date }}
 draft: true
 layout: single
+post_id: {{ .File.UniqueID }}
+
 ---

Now what will happen is:

  1. I run hugo new posts/some-cool-idea.md
  2. Hugo creates a file at content/posts/some-cool-idea.md
  3. Hugo takes an MD5 of the string "content/posts/some-cool-idea.md" and puts it in that file
  4. When I commit the new post to source control, the MD5 is there, and won’t change unless I change it

Let’s check whether this meets my requirements:

  1. Is it automatic? Yes, every time I run hugo new it will put a new ID in the resulting file
  2. Is it unique? Probably; hash functions can collide, but we’re talking dozens or (optimistically) hundres of posts, so it probably won’t
  3. Is it persistent? Yes, we’re committing the MD5 output to source control, so even if I rename the file later the post ID will be the same.

Probably the biggest drawback is that I can cause a collision accidentally: I could run hugo new posts/cool-post.md then decide to rename the file at content/posts/cool-post.md, then run hugo new posts/cool-post.md again, and I would have a collision. I’m pretty sure that just won’t happen. If I was super worried about it, I could write a script that fails the build if two posts have the same ID. I will probably do that at some point, because I’m a big proponent of having automated steps prove that a change is correct to deploy as far as possible, but I don’t think I need to do that right away.

So that solves the problem for new posts! By my count, I have 2 ID-related problems left: getting an ID onto old posts, and sending the ID out in the HTML so that it’s visible on the client side. But we’ll tackle that next time.

Till then, happy learning!
– Will

Discussion

Love this post? Hate it? Is someone wrong on the internet? Is it me? Please feel free to @me on mastodon

I used to host comments on the blog, but have recently stopped. Previously posted comments will still appear, but for new ones please use the mastodon link above.

Join the conversation!