gumx: meta #02: pandoc templates & shell scripts
2023-11-11

Using Pandoc Templates & Shell Scripts

I know I said I wouldn’t use any preprocessor. That is still my endgame, but to make things a bit easier on the way there, I will be using Pandoc.

About Pandoc

Pandoc is a great tool that is basically a universal document converter. It is available for all major operating systems and doesn’t require additional dependencies, patches or virtualization. My needs are pretty basic: convert markdown to HTML using my own templates, and have that work both on my Windows laptop and on whatever online build system (currently builds.sr.ht) I use to generate the site. Pandoc has a lot more to offer than that, but these are the reasons I went with it.

The template

The template is basically the same as the old one, but it uses Pandoc template variables to set the title and the content.

<title>gumx${if(section)}: ${section}${if(order)} #${order}${endif}${endif}${if(title)}: ${title}${endif}</title>

I used a couple of custom variables, section and order, in case the post I’m writing is part of a series, plus the built-in variable title. Let me demonstrate:

This post has the following frontmatter block:

---
title: pandoc templates & shell scripts
section: meta
order: '02'
date: 2023-11-11
---

Pandoc treats the HTML in the template as, well, HTML: it prints it as is until it finds a template tag, in my case ${if(section)}. This is an if statement like the one you find in any ordinary programming language. If the variable section exists, Pandoc prints what follows until it reaches the matching ${endif} closing tag. This line has two nested if statements, one for the section name and one for the order of the article, followed by a separate one for the title. So after rendering I get this:

<title>gumx: meta #02: pandoc templates &amp; shell scripts</title>

The &amp; is just Pandoc making the character & HTML-safe by converting it to its entity code.
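To make the nesting concrete, here is a rough shell equivalent of the title line. It is only a sketch: render_title is a hypothetical helper, not part of Pandoc or of the site's scripts, and it treats an empty string as "variable not set", which only approximates how Pandoc evaluates ${if(...)}. The escaping step is done with sed and covers just &, < and >.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mirrors the template's nested ${if(...)} logic in plain shell.
# Arguments: section, order, title (pass '' for a value that is not set).
render_title() {
    local section="$1" order="$2" title="$3"
    local out="gumx"

    if [ -n "$section" ]; then
        out="${out}: ${section}"
        # The order is only checked inside the section block, as in the template
        if [ -n "$order" ]; then
            out="${out} #${order}"
        fi
    fi

    if [ -n "$title" ]; then
        out="${out}: ${title}"
    fi

    # Make the text HTML-safe; & has to be replaced first
    printf '%s\n' "$out" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

render_title 'meta' '02' 'pandoc templates & shell scripts'
# prints: gumx: meta #02: pandoc templates &amp; shell scripts

render_title '' '02' 'a standalone post'
# prints: gumx: a standalone post
```

The second call shows the effect of the nesting: without a section, the order is ignored even though it was given.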

The other tag I use is ${body}, which inserts the post content after Pandoc has converted it to HTML.

The build scripts

Pandoc only handles the conversion from markdown to HTML, so I still have to feed it the markdown files and tell it where to put the output. According to the Pandoc documentation, the command I need is this:

pandoc --standalone --template template.html --output post-title.html post-title.md

Here template.html is the template file and post-title.md is the markdown file to convert to post-title.html. I wrote a loop that goes through the content directory, works out the matching output directory for each file, and passes these parameters to the pandoc command.

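# ./build.sh
# Template path, matching the one used in build.ps1
defaultTemplate="./templates/default.html"

# -print0 with read -d '' keeps file names containing spaces intact
find ./content -type f -print0 | while IFS= read -r -d '' inputFile; do
    inputDir=$( dirname "${inputFile}" )
    outputDir="./build${inputDir#./content}"
    inputExtension="${inputFile##*.}"
    inputFilename=$( basename "${inputFile}" )

    mkdir -p "${outputDir}"

    if [ 'md' = "${inputExtension}" ]; then
        # Swap the trailing .md for .html
        outputFilename="${inputFilename%.md}.html"
        pandoc --standalone --template "${defaultTemplate}" --output "${outputDir}/${outputFilename}" "${inputFile}"
    else
        # Copy everything else under its original name
        cp "${inputFile}" "${outputDir}/${inputFilename}"
    fi
done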

Then I had to do the same shell-fu in PowerShell to be able to preview my writing on my Windows laptop.

# .\build.ps1
# Resolve absolute paths up front so the string replacement below is reliable
$contentDir = $(Get-Item -Path .\content).FullName
# New-Item with -Force also returns the directory when it already exists
$buildDir = $(New-Item -Path .\build -ItemType Directory -Force).FullName
$defaultTemplate = $(Get-Item -Path .\templates\default.html).FullName

Get-ChildItem -Path .\content -Recurse -File | ForEach-Object {
    # Mirror the content subdirectory under the build directory
    $outputDir = $(New-Item -Path $_.DirectoryName.Replace($contentDir, $buildDir) -ItemType Directory -Force).FullName
    if ($_.Extension -eq ".md") {
        pandoc.exe --standalone --template $defaultTemplate --output "$($outputDir)\$($_.BaseName).html" "$($_.FullName)"
    } else {
        Copy-Item -Path "$($_.FullName)" -Destination "$($outputDir)\$($_.Name)"
    }
}

Both loops do the same thing:

  1. Make a list of the files in the content directory.
  2. Do some text manipulation to substitute content with build in each file’s path.
  3. Create the parent directories for the output file if needed.
  4. If the input file has the extension .md, feed it to Pandoc, providing the new path and changing the extension to .html.
  5. If the input file does not have the extension .md, copy it to the new location as is.
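The path substitution and the extension change from the steps above can be tried on their own, without running Pandoc. This is a minimal sketch under a few assumptions: output_path is a hypothetical helper that is not in the build scripts, the example file names are made up, and sources are assumed to live under ./content with output going to ./build.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints where a file from ./content would end up in ./build.
output_path() {
    local inputFile="$1"
    # Strip the leading ./content/ to get the path relative to the content root
    local relative="${inputFile#./content/}"

    case "$relative" in
        # Markdown files get their trailing .md swapped for .html
        *.md) printf './build/%s.html\n' "${relative%.md}" ;;
        # Everything else keeps its name
        *)    printf './build/%s\n' "$relative" ;;
    esac
}

output_path './content/meta/02.md'
# prints: ./build/meta/02.html

output_path './content/css/style.css'
# prints: ./build/css/style.css
```

Using ${relative%.md} removes the suffix only at the end of the name, so a file such as notes.md.bak would be copied unchanged instead of being treated as markdown.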

Current versions of the files can be found here and here.

The build manifest

The last thing to do is to run it on builds.sr.ht. I only needed to change the line where I simply copied the HTML files into this: . build.sh, which calls the build script and arranges everything in the pretty build directory. The current version of the build manifest is here as well.

What’s next

I still write the post lists myself, so the next thing is trying to automate that. I’ll also try to make templates for other types of content, like the presentations, bookmarks list, and my projects list. And I will definitely try to write about other projects I’m working on. Till next time :)