Matthew's Weblog

Using an LLM to help sanitise word lists for a game

I've been working on and off for a while now on a word game for the iPhone. One method of play is against a daily word list consisting of twenty words that you have to solve anagrams for. I'm generating the word lists using a Python script and bundling it with the app to prevent the need for a web service. The script was doing a good job at generating words of increasing difficulty; however, some of the words it was choosing could cause offence. This afternoon I had a little time away from client work to see if an LLM could help with this.

I decided to use Ollama so everything could be run locally. Setting up Ollama on my MacBook was simple, and I chose the model Llama 3 as a base for my experiment. My first task was to come up with a suitable prompt to help sanitise the word lists. After more experimentation than I thought would have been necessary, I came up with:

'I am going to ask you whether or not a word is suitable to inclusion in a word list. The word list is to be used in a word game. There should be no words in the word list which may cause offence. Examples of words which may cause offence are XXX, XXX, XXX, XXX, XXX. The words should be known by most English speaking adults. For the next question respond with only a yes or no answer.'

(I've blanked out the potentially offensive words above, but gave a few that weren't necessarily really bad but just ones I didn't want appearing in a word game suitable for all ages.)

If a word was identified as offensive, or just one that wouldn't be known by most adults, then the script generates another word and tries again.

So far, this seems to have produced some pretty good word lists. Whether or not I ever get around to finishing this game is a question for another day, but I feel like this has brought me one step closer to release by resolving the issue of requiring too much manual input in supporting the daily play mode of a game from which I'm unlikely to make much money.


Recent posts