
Notes on GPT-2 (345M) use with custom text

I recently followed the instructions in this article to generate custom text responses and found it to be quite helpful. There are just a few side notes that I would like to share for anyone interested in building just a little more on this great foundation.

Before running these instructions, there are a couple of other Python dependencies that must be installed. On my system, I am using Python 3, so all of my commands reflect that...

  • pip3 install numpy

  • pip3 install tensorflow==1.13.2

The 1.13.2 above is particularly important (note the double equals sign, which pins pip to that exact version), because the examples don't work out of the box with TensorFlow 2.
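
If you want to sanity-check which version actually got installed before going any further, a quick one-liner like this (my own habit, not part of the original article) will print it:

  • python3 -c "import tensorflow as tf; print(tf.__version__)"

You should see 1.13.2 come back.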

I got much better results (although slower) generating text with the 345M model than with the smaller default model. The commands I run to actually get my text, then, are these...

  • python3 download_model.py 345M

  • python3 src/encode.py src/training.txt training.npz --model_name 345M

  • python3 src/train.py --dataset training.npz --model_name 345M

  • python3 src/interactive_conditional_samples.py --temperature 0.8 --top_k 40 --model_name my_trained_model

As you will see in Ng Wai Foong's original article, in order to use these commands, you will have to move some folders around after training the model above. In my example, I have called it "my_trained_model". You can call it whatever you like, but your folder name must match whatever you pass to --model_name in the last command.
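
For reference, the moves look roughly like this on my machine. This is just a sketch that assumes the repo layout from Ng Wai Foong's article, where train.py writes its checkpoints to checkpoint/run1 and the sampling script looks for everything under models/<model_name>; if your run name or paths differ, adjust accordingly...

  • mkdir models/my_trained_model

  • cp checkpoint/run1/* models/my_trained_model/

  • cp models/345M/encoder.json models/345M/hparams.json models/345M/vocab.bpe models/my_trained_model/

The last command matters because the sampling script expects the encoder and hyperparameter files to sit right next to the trained checkpoint.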

If the results I got with GPT-2 are any indication - GPT-3 has got to be just crazy. I'm looking forward to seeing it!

 

One other point worth knowing, beyond the scope of the original article, is that you can use this code to summarize text in addition to generating new text. Doing this is profoundly simple: just run this command...

  • python3 src/interactive_conditional_samples.py --nsamples=3 --length=100 --temperature=1

And add the characters TL;DR: to the end of your input text. Of course, you can mess around with any of the parameters above on the command line; I'm just sharing a few settings that have given me pretty decent results.
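
To make that concrete, here is roughly what a summarization session looks like. The news-blurb text is just a made-up example, and the "Model prompt >>>" part is what the script prints when it is waiting for your input...

  • Model prompt >>> The city council voted 7-2 on Tuesday to approve the new transit plan, following months of public hearings and three separate rounds of budget revisions. TL;DR:

The model then continues from the TL;DR: marker, and what it generates is its attempt at a summary of everything that came before it.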
