As I’ve written about a number of times recently, I’ve been working on a couple of word lists this past year. To do this, I created a tool called Tidy.

At this point, Tidy accepts a number of flags and options. I generally add one whenever I think of some function I might want to perform on a word list. For example, --skip-rows-end <SKIP_ROWS_END>, which “Skip[s] last number of lines from inputted files” or --ignore-after <IGNORE_AFTER_DELIMITER>, which “Ignore[s] characters after the first instance of the specified delimiter until the end of line.”

As I learned more about different qualities a word list could have, I kept generating new ones. I now “maintain” 10 distinct lists in my repo. I use the word “maintain” because I’ve found that I want to remove the same kinds of words from almost all of the lists: profane words, abbreviations, Roman numerals, and British spellings of common English words (assuming an American audience). Every time I find a new British spelling, or a new way to spell a profane word, I feel compelled to remove it from all of my word lists.

To do this, I would edit my local list of profane words or British spellings, then create all of the lists anew by re-running a long Tidy command.

As mentioned, these commands could get pretty long. For example, here’s the command to re-build the basic.txt list:

tidy -AAAA --whittle-to 18250 -lL -m 3 -M 12 -a /usr/share/dict/words -r ../reject-words/profane-words.txt -r ../reject-words/roman-numerals-lower.txt -r ../reject-words/uncommon-words.txt -r ../reject-words/britishisms.txt -r ../reject-words/repeated-letters.txt -r ../reject-words/common_words.txt -r ../reject-words/mostly-abbreviations.txt --samples -o lists/basic.txt --force ../common_word_list_maker/word_list_raw.txt

Since it was difficult to remember 10 of these commands, I pasted them into a new text file. However, copying and pasting these commands out of the text file and into the command line became cumbersome, and I figured there must be a better way. That’s when I remembered Just.


Just (GitHub, website) is a “command runner” written in Rust. Its README describes it as “a handy way to save and run project-specific commands,” in other words, exactly what I needed.

Now, my command for re-building basic.txt lives in a “justfile” and looks like this:

# re-build basic.txt
basic:
  tidy -AA --whittle-to 18250 -lL -z nfkd -m 3 -M 12 -a /usr/share/dict/words {{reject_commands}} --samples -o lists/basic.txt --force {{path_to_ngram_list}}

Earlier in the file, I define the variables that I use in many of my commands, including reject_commands and path_to_ngram_list.
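As a rough sketch of what those definitions might look like (the values here are assumptions, borrowed from the paths in the long Tidy command above, and the show-vars recipe is a hypothetical helper that is not in my actual justfile), Just assigns variables with := and interpolates them into recipes with double curly braces:

```just
# Hypothetical variable definitions near the top of the justfile.
# The values are assumptions, taken from the paths in the long command above.
reject_commands := "-r ../reject-words/profane-words.txt -r ../reject-words/britishisms.txt"
path_to_ngram_list := "../common_word_list_maker/word_list_raw.txt"

# Any recipe can interpolate a variable with double curly braces.
# This throwaway recipe only prints the expanded values.
show-vars:
  @echo "{{reject_commands}}"
  @echo "{{path_to_ngram_list}}"
```

With something like this in place, adding a new reject list means editing one variable rather than every recipe that uses it.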

With this in place, I can simply run just basic to force a re-build of the basic.txt word list.

Now, when I want to make a new list, I can drop the command right in the justfile and I’m all set. I can even run just --list to list all of my available just commands. Nifty!

(Note that, for now, I haven’t pushed the justfile to the word list repo, mostly to avoid confusion.)

A Just plugin for Vim

For improved Just syntax highlighting in Vim, I installed the vim-just plugin, though Just was perfectly usable before I installed it.

Why not use BASH/Shell?

Honestly, given my needs described above, I don’t have a good answer to this, other than the convenience of not having to deal with file permissions or write my own list function. I have also struggled with BASH variable syntax in the past.

Epilogue: Just publishing with Jekyll

I was happy enough with Just that I added a justfile to this blog, since I kept forgetting the build and publish commands.

build:
  bundle exec jekyll build

publish:
  git add .
  git commit -m "update"
  git push origin master

Now I can run just build publish and I’m all set!