gumx: meta #03: introducing nushell
2023-12-19

Using Nushell

Last time I had to write two build scripts, one for my current Windows machine and another for builds.sr.ht, and the need to merge them into one has been eating at me since. True, I have enjoyed learning my way around PowerShell, but it is not for me. I could just put everything in WSL and write only boring shell scripts, but, it’s me. So, since I haven’t run out of stuff to learn and suck at, I searched for a cross-platform shell and found one. Nushell is “A new type of shell”, and it really is. I won’t go into much detail about why I find it awesome, but you should definitely check it out if you don’t know what it is.

The Nu script

The current version of the newnu build script is here. Besides collecting the markdown files and passing them to Pandoc, this script has two new features: it can run Nushell scripts and optionally save their output as html, and it only builds the modified files.

The first thing I did this time was define some constants, kinda like a build configuration, so I don’t have to rewrite paths everywhere. There are also a few closures, which I’ll explain:

let make_list = {
    try {
        ls $"($in)/**/*" |
        where type == file |
        select name modified |
        update modified { format date '%s' | into int }
    } catch { [[name modified]; ['none' 0]] }
}

This one lists the files in a directory recursively. Since we’re only interested in each file’s name and last modification date, we select those. Then we change the date format to Unix time for easier comparison later. If the listing fails, for example because the directory doesn’t exist yet, the catch block returns a single placeholder row instead. In the case of my content directory the output looks like this:

╭────┬───────────────────────────────────────────────┬────────────╮
│  # │                     name                      │  modified  │
├────┼───────────────────────────────────────────────┼────────────┤
│  0 │ content\bookmarks.md                          │ 1698834802 │
│  1 │ content\following.md                          │ 1702052445 │
│  2 │ content\index.md                              │ 1702984831 │
│  3 │ content\presentations\index.md                │ 1698835736 │
│  4 │ content\snippets.md                           │ 1698835231 │
│  5 │ content\writings\helloworld.md                │ 1699720439 │
│  6 │ content\writings\index.md                     │ 1702984812 │
│  7 │ content\writings\meta\index.md                │ 1702984793 │
│  8 │ content\writings\meta\introducing-nushell.md  │ 1702987124 │
│  9 │ content\writings\meta\pandoc-shell-scripts.md │ 1699788961 │
│ 10 │ content\writings\meta\style-writing-flow.md   │ 1698835130 │
│ 11 │ content\writings\news\genocide.md             │ 1698835005 │
│ 12 │ content\writings\news\index.md                │ 1698834959 │
╰────┴───────────────────────────────────────────────┴────────────╯

let make_output = { $in | path split | update 0 $dir.build | path join }
let make_html = { $in | path parse | update extension 'html' | path join }

The make_output closure replaces the first component of a file’s path, which typically is the content directory, with the output directory. And if we need to change the extension to html, the make_html closure does just that.
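To see the two closures work on a single path, here is a minimal sketch. The `$dir` record is a hypothetical stand-in for the real build configuration, and the sample path is built with `path join` so it uses whatever separator the platform prefers.

```nushell
# Hypothetical stand-in for the build configuration; only `build` is needed here.
let dir = { build: 'build' }

# The same closures as above, written with an explicit empty parameter list.
let make_output = {|| $in | path split | update 0 $dir.build | path join }
let make_html = {|| $in | path parse | update extension 'html' | path join }

# Build the sample path from components so the separator matches the platform.
let source = (['content' 'writings' 'helloworld.md'] | path join)

# Swap the first component, then swap the extension.
let output = ($source | do $make_output | do $make_html)

print $"($source) -> ($output)"
```

Chaining the closures with `do` keeps each one tiny: one only knows about directories, the other only about extensions.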

The next chunk just builds different lists for different types of files. The first three lists hold the input files, each with its type and expected output file.

let static_list = $dir.static | do $make_list |
    insert type 'static' |
    each { insert output ($in.name | do $make_output) }

The static_list is the contents of the static directory, which should be copied as is into the output directory. It looks like this:

╭───┬───────────────────┬────────────┬────────┬──────────────────╮
│ # │       name        │  modified  │  type  │      output      │
├───┼───────────────────┼────────────┼────────┼──────────────────┤
│ 0 │ static\pubkey.asc │ 1693469865 │ static │ build\pubkey.asc │
│ 1 │ static\style.css  │ 1699788918 │ static │ build\style.css  │
╰───┴───────────────────┴────────────┴────────┴──────────────────╯

Next is the content_list:

let content_list = $dir.content | do $make_list |
    each {
        if ($in.name | path parse).extension in $default.ext.content {
            insert type 'content' |
            insert output ($in.name | do $make_output | do $make_html)
        } else {
            insert type 'asset' |
            insert output ($in.name | do $make_output)
        }
    }

This checks the contents of the content directory, duh!, gives any markdown file a content type and changes its output extension to html. Otherwise it’s an asset file and is treated the same as static files. The output looks like this:

╭────┬───────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────────────────────────╮
│  # │                     name                      │  modified  │  type   │                    output                     │
├────┼───────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────────────────────────┤
│  0 │ content\bookmarks.md                          │ 1698834802 │ content │ build\bookmarks.html                          │
│  1 │ content\following.md                          │ 1702052445 │ content │ build\following.html                          │
│  2 │ content\index.md                              │ 1702984831 │ content │ build\index.html                              │
│  3 │ content\presentations\index.md                │ 1698835736 │ content │ build\presentations\index.html                │
│  4 │ content\snippets.md                           │ 1698835231 │ content │ build\snippets.html                           │
│  5 │ content\writings\helloworld.md                │ 1699720439 │ content │ build\writings\helloworld.html                │
│  6 │ content\writings\index.md                     │ 1702984812 │ content │ build\writings\index.html                     │
│  7 │ content\writings\meta\index.md                │ 1702984793 │ content │ build\writings\meta\index.html                │
│  8 │ content\writings\meta\introducing-nushell.md  │ 1702987663 │ content │ build\writings\meta\introducing-nushell.html  │
│  9 │ content\writings\meta\pandoc-shell-scripts.md │ 1699788961 │ content │ build\writings\meta\pandoc-shell-scripts.html │
│ 10 │ content\writings\meta\style-writing-flow.md   │ 1698835130 │ content │ build\writings\meta\style-writing-flow.html   │
│ 11 │ content\writings\news\genocide.md             │ 1698835005 │ content │ build\writings\news\genocide.html             │
│ 12 │ content\writings\news\index.md                │ 1698834959 │ content │ build\writings\news\index.html                │
╰────┴───────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────────────────────────╯
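The extension check is the whole trick here, so a stripped-down sketch may help. The `content_ext` list is a hypothetical stand-in for `$default.ext.content` (I’m assuming it holds `md`), and the two file names are made up for illustration.

```nushell
# Hypothetical stand-in for $default.ext.content.
let content_ext = ['md']

# Made-up file names: one markdown page and one image.
let names = ['content/index.md' 'content/images/logo.png']

# Tag each file as 'content' or 'asset' based only on its extension.
let typed = ($names | each {|name|
    let kind = (if ($name | path parse).extension in $content_ext { 'content' } else { 'asset' })
    { name: $name, type: $kind }
})

print $typed
```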

The last input list is for scripts, and this one does a little bit more work:

let scripts_list = $dir.scripts | do $make_list |
    each {
        if ($in.name | path split).1 == 'content' {
            if ($in.name | path parse).extension in $default.ext.script {
                insert type 'script' |
                insert output ($in.name | path relative-to $dir.scripts | do $make_output | do $make_html)
            } else {
                insert type 'asset' |
                insert output ($in.name | path relative-to $dir.scripts | do $make_output)
            }
        } else if ($in.name | path split).1 in ['before' 'after'] {
            if ($in.name | path parse).extension in $default.ext.script {
                insert type ($in.name | path split).1
            } else {
                insert type 'ignore'
            } |
            insert output 'none'
        } else {
            insert type 'ignore' |
            insert output 'none'
        }
    }

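It checks for files in three subdirectories inside scripts: content, before and after. In content, files with a script extension get the script type and an html output path, and anything else is treated as an asset. In before and after, script files take the directory name as their type and have no output file. Everything else is ignored.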

Currently I have only one script, so the list looks like this:

╭───┬──────────────────────┬────────────┬───────┬────────╮
│ # │         name         │  modified  │ type  │ output │
├───┼──────────────────────┼────────────┼───────┼────────┤
│ 0 │ scripts\after\log.nu │ 1702983671 │ after │ none   │
╰───┴──────────────────────┴────────────┴───────┴────────╯
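The branching above boils down to a small decision on the second path component and the extension. Here is a sketch of just that decision as a custom command. `classify` and `script_ext` are hypothetical names (I’m assuming `$default.ext.script` holds `nu`), and apart from log.nu the sample paths are invented.

```nushell
# Hypothetical stand-in for $default.ext.script.
let script_ext = ['nu']

# Return the type a file under the scripts directory would get.
def classify [name: string, exts: list<string>] {
    let area = ($name | path split | get 1)
    let is_script = (($name | path parse).extension in $exts)
    if $area == 'content' {
        if $is_script { 'script' } else { 'asset' }
    } else if $area in ['before' 'after'] {
        if $is_script { $area } else { 'ignore' }
    } else {
        'ignore'
    }
}

let samples = [
    'scripts/after/log.nu'
    'scripts/content/feed.nu'
    'scripts/content/data.csv'
    'scripts/notes/todo.txt'
]
let kinds = ($samples | each {|name| classify $name $script_ext })

print $kinds
```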

The three lists are combined into one source_list like so:

let source_list = [$static_list $content_list $scripts_list] |
    flatten |
    rename --column {name: source}

Then build_list is created, listing the existing files in the build directory and their last modification timestamp.

let build_list = $dir.build |
    do $make_list |
    rename output timestamp

The output for my current workspace looks like this:

╭────┬───────────────────────────────────────────────┬────────────╮
│  # │                    output                     │ timestamp  │
├────┼───────────────────────────────────────────────┼────────────┤
│  0 │ build\bookmarks.html                          │ 1702983234 │
│  1 │ build\following.html                          │ 1702983234 │
│  2 │ build\index.html                              │ 1702983235 │
│  3 │ build\log                                     │ 1702983598 │
│  4 │ build\presentations\index.html                │ 1702983235 │
│  5 │ build\pubkey.asc                              │ 1693469865 │
│  6 │ build\snippets.html                           │ 1702983236 │
│  7 │ build\style.css                               │ 1699788918 │
│  8 │ build\writings\helloworld.html                │ 1702983236 │
│  9 │ build\writings\index.html                     │ 1702983237 │
│ 10 │ build\writings\meta\index.html                │ 1702983237 │
│ 11 │ build\writings\meta\pandoc-shell-scripts.html │ 1702983238 │
│ 12 │ build\writings\meta\style-writing-flow.html   │ 1702983238 │
│ 13 │ build\writings\news\genocide.html             │ 1702983238 │
│ 14 │ build\writings\news\index.html                │ 1702983238 │
╰────┴───────────────────────────────────────────────┴────────────╯

After that we join source_list and build_list to link source files with their respective outputs.

let joined_list = $source_list | join -o $build_list output |
    default 'none' source |
    default 'none' type |
    default 0 modified |
    default 0 timestamp |
    each {
        insert url (
            match $in.output {
                'none' => 'none'
                _ => (
                    $in.output |
                    path split | update 0 $site.url |
                    take until {|this| $this == 'index.html' } |
                    str join '/' |
                    url encode
                )
            }
        )
    }

The resulting table will help us later to know what should be updated and what shouldn’t.

╭────┬─────────────────────────────────┬────────────┬─────────┬──────────────────────────────────┬────────────┬──────────────────────────────────╮
│  # │             source              │  modified  │  type   │              output              │ timestamp  │               url                │
├────┼─────────────────────────────────┼────────────┼─────────┼──────────────────────────────────┼────────────┼──────────────────────────────────┤
│  0 │ static\pubkey.asc               │ 1693469865 │ static  │ build\pubkey.asc                 │ 1693469865 │ localhost/pubkey.asc             │
│  1 │ static\style.css                │ 1699788918 │ static  │ build\style.css                  │ 1699788918 │ localhost/style.css              │
│  2 │ content\bookmarks.md            │ 1698834802 │ content │ build\bookmarks.html             │ 1702983234 │ localhost/bookmarks.html         │
│  3 │ content\following.md            │ 1702052445 │ content │ build\following.html             │ 1702983234 │ localhost/following.html         │
│  4 │ content\index.md                │ 1702984831 │ content │ build\index.html                 │ 1702983235 │ localhost                        │
│  5 │ content\presentations\index.md  │ 1698835736 │ content │ build\presentations\index.html   │ 1702983235 │ localhost/presentations          │
│  6 │ content\snippets.md             │ 1698835231 │ content │ build\snippets.html              │ 1702983236 │ localhost/snippets.html          │
│  7 │ content\writings\helloworld.md  │ 1699720439 │ content │ build\writings\helloworld.html   │ 1702983236 │ localhost/writings/helloworld.ht │
│    │                                 │            │         │                                  │            │ ml                               │
│  8 │ content\writings\index.md       │ 1702984812 │ content │ build\writings\index.html        │ 1702983237 │ localhost/writings               │
│  9 │ content\writings\meta\index.md  │ 1702984793 │ content │ build\writings\meta\index.html   │ 1702983237 │ localhost/writings/meta          │
│ 10 │ content\writings\meta\introduci │ 1702988713 │ content │ build\writings\meta\introducing- │          0 │ localhost/writings/meta/introduc │
│    │ ng-nushell.md                   │            │         │ nushell.html                     │            │ ing%2Dnushell.html               │
│ 11 │ content\writings\meta\pandoc-sh │ 1699788961 │ content │ build\writings\meta\pandoc-shell │ 1702983238 │ localhost/writings/meta/pandoc%2 │
│    │ ell-scripts.md                  │            │         │ -scripts.html                    │            │ Dshell%2Dscripts.html            │
│ 12 │ content\writings\meta\style-wri │ 1698835130 │ content │ build\writings\meta\style-writin │ 1702983238 │ localhost/writings/meta/style%2D │
│    │ ting-flow.md                    │            │         │ g-flow.html                      │            │ writing%2Dflow.html              │
│ 13 │ content\writings\news\genocide. │ 1698835005 │ content │ build\writings\news\genocide.htm │ 1702983238 │ localhost/writings/news/genocide │
│    │ md                              │            │         │ l                                │            │ .html                            │
│ 14 │ content\writings\news\index.md  │ 1698834959 │ content │ build\writings\news\index.html   │ 1702983238 │ localhost/writings/news          │
│ 15 │ scripts\after\log.nu            │ 1702983671 │ after   │ none                             │          0 │ none                             │
│ 16 │ none                            │          0 │ none    │ build\log                        │ 1702983598 │ localhost/log                    │
╰────┴─────────────────────────────────┴────────────┴─────────┴──────────────────────────────────┴────────────┴──────────────────────────────────╯
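If the outer join plus the defaults feels dense, this tiny sketch shows the same idea on made-up data: one source that was already built, one that is new, and one leftover file in build. All names and numbers here are hypothetical.

```nushell
# Made-up sources: a.md was built before, b.md is new.
let source_list = [
    { source: 'content/a.md', modified: 200, type: 'content', output: 'build/a.html' }
    { source: 'content/b.md', modified: 300, type: 'content', output: 'build/b.html' }
]

# Made-up build directory: a.html exists, stale.html has no source anymore.
let build_list = [
    { output: 'build/a.html', timestamp: 100 }
    { output: 'build/stale.html', timestamp: 50 }
]

# Outer join keeps rows from both sides; the defaults fill the gaps.
let joined = ($source_list | join --outer $build_list output |
    default 'none' source |
    default 'none' type |
    default 0 modified |
    default 0 timestamp)

print $joined
```

A row with timestamp 0 is a source that has never been built, and a row with source none is an output that lost its source.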

Notice entry 10 in the joined table above: content\writings\meta\introducing-nushell.md has an expected output path, but its timestamp is 0 since the file hasn’t been created yet. The next list is a bit interesting.

let changed_urls = $joined_list |
    where { $in.url != 'none' and $in.modified > $in.timestamp } |
    get url |
    each {
        split row '/' |
        reduce { |it, acc|
            $acc | append ( [(try { $acc | last } catch { $acc }) $it] | str join '/' )
        }
    } |
    flatten | uniq

I actually changed some other files, but if I hadn’t, the only modified/new file would be this entry, content\writings\meta\introducing-nushell.md. Later I’d like section indexes in the site to automatically list posts and pages, and this list helps identify where changes should be made. Basically, for each updated page, update its parent pages up to the root. So the output of this list is:

╭───┬────────────────────────────────────────────────────╮
│ 0 │ localhost                                          │
│ 1 │ localhost/writings                                 │
│ 2 │ localhost/writings/meta                            │
│ 3 │ localhost/writings/meta/introducing%2Dnushell.html │
╰───┴────────────────────────────────────────────────────╯
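The reduce step is the part worth staring at, so here it is in isolation. This is a variation rather than the exact code above: it starts from an explicit empty list with `--fold` instead of relying on try/catch for the first element. `ancestors` is a hypothetical name and the URLs are made up.

```nushell
# Expand a url into itself plus every parent prefix.
def ancestors [url: string] {
    $url | split row '/' | reduce --fold [] {|it, acc|
        if ($acc | is-empty) {
            # The first segment starts the list.
            [$it]
        } else {
            # Every later segment extends the previous prefix.
            $acc | append ([($acc | last) $it] | str join '/')
        }
    }
}

# Two changed pages that share parents; uniq drops the overlap.
let urls = ['localhost/writings/meta/post.html' 'localhost/writings']
let changed = ($urls | each {|url| ancestors $url } | flatten | uniq)

print $changed
```

The two inputs produce six prefixes in total, and only four distinct ones survive `uniq`.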

Then we get to the action tables:

let action_before = $joined_list |
    where type == 'before' |
    insert action 'run' |
    select action source output |
    sort-by --ignore-case source

let action_after = $joined_list |
    where type == 'after' |
    insert action 'run' |
    select action source output |
    sort-by --ignore-case source

These pick the before and after scripts and mark their action as run, which later will just run the script in a subshell.

let action_copy = $joined_list |
    where { $in.type in ['asset' 'static'] and $in.modified > $in.timestamp } |
    insert action 'copy' |
    select action source output |
    sort-by --ignore-case source

The copy action is for assets and static files that have a modification date after their respective output’s timestamp.
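The staleness test is just a comparison of two integers, as this sketch on made-up rows shows. The file names and numbers are hypothetical.

```nushell
# Made-up rows: one changed static file, one unchanged, and one content file.
let rows = [
    { source: 'static/style.css', type: 'static', modified: 200, timestamp: 100 }
    { source: 'static/pubkey.asc', type: 'static', modified: 100, timestamp: 100 }
    { source: 'content/index.md', type: 'content', modified: 300, timestamp: 100 }
]

# Keep only assets and static files that are newer than their output.
let to_copy = ($rows | where {|row|
    $row.type in ['asset' 'static'] and $row.modified > $row.timestamp
})

print ($to_copy | get source)
```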

let action_update = $joined_list |
    where { $in.type in ['content' 'script'] and $in.url in $changed_urls } |
    each { match $in.type { 'content' => (insert action 'update'), 'script' => (insert action 'save') } } |
    select action source output |
    sort-by --ignore-case source

action_update looks for content files and scripts that have a changed url. We’ll use the update action later to know which files should be passed to Pandoc, and the save action for scripts whose output is redirected to a specific html file.

let action_delete = $joined_list |
    where type == 'none' |
    rename --column {output: 'source', source: 'output'} |
    insert action 'delete' |
    select action source output |
    sort-by --ignore-case source

The delete action is for files in the build directory that no longer have a respective input file, so they should be deleted. Notice we swapped the source and output columns for this one.

let action_ignore = $joined_list |
    where { $in.type not-in ['none' 'before' 'after'] and $in.url not-in $changed_urls } |
    insert action 'ignore' |
    select action source output |
    sort-by --ignore-case source

The last action to be mapped, ignore, is for anything that won’t be processed. I’m keeping it for now for debugging. So finally the combined actions look like this:

╭────┬────────┬───────────────────────────────────────────────┬───────────────────────────────────────────────╮
│  # │ action │                    source                     │                    output                     │
├────┼────────┼───────────────────────────────────────────────┼───────────────────────────────────────────────┤
│  0 │ update │ content\index.md                              │ build\index.html                              │
│  1 │ update │ content\writings\index.md                     │ build\writings\index.html                     │
│  2 │ update │ content\writings\meta\index.md                │ build\writings\meta\index.html                │
│  3 │ update │ content\writings\meta\introducing-nushell.md  │ build\writings\meta\introducing-nushell.html  │
│  4 │ delete │ build\log                                     │ none                                          │
│  5 │ ignore │ content\bookmarks.md                          │ build\bookmarks.html                          │
│  6 │ ignore │ content\following.md                          │ build\following.html                          │
│  7 │ ignore │ content\presentations\index.md                │ build\presentations\index.html                │
│  8 │ ignore │ content\snippets.md                           │ build\snippets.html                           │
│  9 │ ignore │ content\writings\helloworld.md                │ build\writings\helloworld.html                │
│ 10 │ ignore │ content\writings\meta\pandoc-shell-scripts.md │ build\writings\meta\pandoc-shell-scripts.html │
│ 11 │ ignore │ content\writings\meta\style-writing-flow.md   │ build\writings\meta\style-writing-flow.html   │
│ 12 │ ignore │ content\writings\news\genocide.md             │ build\writings\news\genocide.html             │
│ 13 │ ignore │ content\writings\news\index.md                │ build\writings\news\index.html                │
│ 14 │ ignore │ static\pubkey.asc                             │ build\pubkey.asc                              │
│ 15 │ ignore │ static\style.css                              │ build\style.css                               │
│ 16 │ run    │ scripts\after\log.nu                          │ none                                          │
╰────┴────────┴───────────────────────────────────────────────┴───────────────────────────────────────────────╯
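The snippets above don’t show how the separate tables become one `$actions` table. One way to do it, as a sketch only: chain them with `append`, in an order guessed from the combined table (before scripts first, after scripts last). The rows here are trimmed-down stand-ins, not the full tables.

```nushell
# Trimmed-down stand-ins for the action tables; before is empty on purpose.
let action_before = []
let action_update = [{ action: 'update', source: 'content/index.md', output: 'build/index.html' }]
let action_delete = [{ action: 'delete', source: 'build/log', output: 'none' }]
let action_after = [{ action: 'run', source: 'scripts/after/log.nu', output: 'none' }]

# Assumed order: before, then the file actions, then after.
let actions = ($action_before |
    append $action_update |
    append $action_delete |
    append $action_after)

print $actions
```

Since `append` adds the rows of each table rather than the table itself, an empty table simply contributes nothing.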

Now, it’s time to execute these actions:

for entry in $actions {
    print $">> ($entry.action): ($entry.source) -> ($entry.output)"
    if $entry.output != 'none' {
        mkdir ($entry.output | path dirname)
    }
    match $entry.action {
        'run' => (nu $entry.source)
        'copy' => (cp $entry.source $entry.output)
        'update' => (pandoc --standalone --template ([$dir.templates $default.template] | path join) --output $entry.output $entry.source)
        'save' => (nu $entry.source | save $entry.output)
        'delete' => (rm -f $entry.source)
        'ignore' => (print $"($entry.source) ignored")
        _ => (print $"Unrecognized action <($entry.action)>")
    }
}

It’s just a loop that checks the action in each entry and runs the needed command in a subshell. And if you’re reading this, it means that it was successful.

Till next time :)