Playing AI Dungeon
How are AI responses generated?
The process of generating an AI output is complicated but can be broken down into a few distinct steps:
Step 1: The Interface Sends the Message
First, the Interface (the web or mobile app you’re using) sends the latest message you wrote to our server and waits for the server to confirm that it has received the message. This is why it can sometimes take a few moments for your message to leave the input field and appear as an action in the story above.
Step 2: Our Server Compiles the Context
The server then checks for World Info keys and fills in the matching World Info Entries, the Memory, and the last part of the story (also known as the “Current Context”), with the Author’s Note inserted. It sends this block of text, whose length is determined by your Advanced Settings, along with your generation settings to the model to generate possible responses.
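A rough sketch of what Step 2 might look like. This is an illustration, not Latitude’s actual code: the function name, the token-per-word heuristic, and the ordering of the pieces are all assumptions.

```python
# Hypothetical sketch of context assembly (illustrative only, not Latitude's
# real implementation). Fixed pieces (Memory, World Info, Author's Note) are
# always included; the end of the story fills whatever token budget remains.

def build_context(memory, world_info_entries, story_text, authors_note,
                  max_tokens, tokens_per_word=1.3):
    """Assemble the block of text sent to the model, newest story text last."""
    def estimate_tokens(text):
        # Rough heuristic: ~1.3 tokens per word (an assumption, not exact).
        return int(len(text.split()) * tokens_per_word)

    fixed = memory + world_info_entries + [authors_note]
    budget = max_tokens - sum(estimate_tokens(p) for p in fixed)

    # Keep as much of the *end* of the story as fits (the "Current Context").
    recent = []
    for paragraph in reversed(story_text):
        cost = estimate_tokens(paragraph)
        if cost > budget:
            break
        recent.insert(0, paragraph)
        budget -= cost

    return "\n".join(memory + world_info_entries + recent + [authors_note])

context = build_context(
    memory=["You are a knight of the realm."],
    world_info_entries=["Eldoria: a city of silver towers."],
    story_text=["You set out at dawn.", "The gates of Eldoria rise ahead."],
    authors_note="[Style: heroic fantasy]",
    max_tokens=2048,
)
```

Note how the story is walked backwards: when the budget runs out, it is the oldest story text that gets dropped, which is why the AI can “forget” early events in a long adventure.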
Step 3: The AI server Tokenizes the Text
Before the AI can process your input, it needs to be converted into a numerical format the AI understands. A program called a “Tokenizer” converts the text into numerical values, known as “Tokens”, which represent words, parts of words, or phrases. Once the Tokenizer has finished converting your message, the tokens are passed into the AI model.
NOTE: A token works out to roughly 6 characters of text on average. When you change the input and output lengths in Advanced Settings, the numbers you are setting are effectively totals of tokens, not characters, which is why descriptions of how much text the AI will see can sometimes seem vague.
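To make the idea concrete, here is a toy tokenizer. Real tokenizers (such as BPE-style ones) are far more sophisticated; the vocabulary and the greedy longest-match scheme below are invented purely for illustration.

```python
# A toy tokenizer sketch. The vocabulary and matching scheme are made up for
# illustration; production tokenizers learn their vocabularies from data.

TOY_VOCAB = {"you": 0, "open": 1, "the": 2, "door": 3, "drag": 4, "on": 5}

def tokenize(text):
    """Greedily match the longest known piece at each position of each word."""
    tokens = []
    for word in text.lower().split():
        while word:
            for end in range(len(word), 0, -1):
                piece = word[:end]
                if piece in TOY_VOCAB:
                    tokens.append(TOY_VOCAB[piece])
                    word = word[end:]
                    break
            else:
                word = word[1:]  # no match: skip a character (toy fallback)
    return tokens

ids = tokenize("You open the dragon door")
# "dragon" is not in the vocabulary, so it is split into "drag" + "on":
# two tokens for one word, which is why token counts only loosely track words.
```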
Step 4: The AI Does Its Thing
The AI takes the tokens it has been given, up to 2048 of them (or whatever limit you have set in Advanced Settings), and runs them through a massive mathematical process called a Neural Net: a structure of tiny pieces of code with literally BILLIONS of values called “weights”. These weights store the “knowledge” the AI uses when generating text. The Neural Net steps through every token it has been given and builds a mathematical representation of those tokens in context, geared toward predicting what sort of tokens might come next.
The Neural Net has data on the likelihood of the next token because it was created by analyzing a massive dataset containing multiple TERABYTES of publicly available text from novels, games, and other sources around the web. Our models are based on re-training existing models on stories from text-based adventure fiction and games.
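A drastically simplified sketch of where “likelihood of the next token” comes from: counting which token follows which in training text. Real models learn these statistics inside neural-network weights rather than in a count table, but the counting version shows the principle.

```python
# Toy next-token statistics by counting (a stand-in for what a Neural Net
# learns in its weights; real models do NOT keep literal count tables).
from collections import Counter, defaultdict

training_text = (
    "the dragon roared . the knight drew his sword . "
    "the dragon breathed fire . the knight raised his shield ."
).split()

follow_counts = defaultdict(Counter)
for current, nxt in zip(training_text, training_text[1:]):
    follow_counts[current][nxt] += 1

def next_token_probs(token):
    """Probability of each token that followed `token` in the training text."""
    counts = follow_counts[token]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

probs = next_token_probs("the")
# In this tiny corpus, "the" is followed by "dragon" half the time and
# "knight" the other half, so both get probability 0.5.
```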
Once the Neural Net has finished processing the tokens given to it, it produces a list of likely next tokens, along with the likelihood of each being the next one in the story. A statistical equation then semi-randomly selects (or “samples”) one token from this list, based on values like “Temperature”, which you can change in Advanced Settings. This process repeats until the AI has produced enough text for an output, as determined by the Output Length in Advanced Settings.
NOTE: A simpler way of understanding it: the AI is like a giant, supercharged version of the predictive-text keyboard on your phone, albeit one that can look back through more than 16 text messages’ worth of text and requires a computer with many times the resources of even a high-end gaming PC.
Step 5: De-Tokenization, Output, Done!
The AI server then converts the AI output back from numerical tokens into human-readable text and sends it to our server. There, we do some more processing to make sure you get full sentences (unless Raw Model Output is turned on in Advanced Settings) and rank the responses to determine which is the best one to return to you. The others are often stored as potential Retry options.
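The “full sentences” step might look something like the sketch below. The trimming rule is an assumption for illustration; Latitude’s actual server-side rules aren’t public.

```python
# Hypothetical full-sentence trimming (illustrative; not Latitude's real
# rules): cut the raw output back to the last sentence-ending punctuation.
import re

def trim_to_full_sentence(raw_output):
    """Return the output up to and including its last '.', '!', or '?'."""
    matches = list(re.finditer(r'[.!?]["\']?', raw_output))
    if not matches:
        return raw_output  # no sentence boundary found; return text as-is
    return raw_output[:matches[-1].end()]

# The model often stops mid-sentence when it hits the Output Length limit:
trimmed = trim_to_full_sentence(
    'The dragon lowers its head. "Speak," it rumbles. You open your mou'
)
# The dangling "You open your mou" fragment is cut away.
```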
Our server puts the text in a new action, adds it to the end of your adventure’s story, and sends it to the app on your device, where the interface shows it as the next action. Then you hit the send button, and the process begins anew.
© Latitude 2023