Tasks
Large language models are trained on large amounts of text, with the purpose of learning the underlying patterns (i.e. archetypes) of language. Subsequent phases of task-specific training give rise to the human-like language capabilities observed in today's state-of-the-art models. There are two broad categories of prompts: completion and instruction. Completion is the most basic form of operation, whereas an instruction is also a completion, but one that directs the model to perform the completion within a task-related subspace of operations.
Word Prediction
The basic task that all LLMs are trained on is word prediction. GPT-style models predict the next word, whereas BERT-style models predict a masked word at some position in the sentence. GPT-style models arguably better resemble how humans produce language, i.e. what is the next word?
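A minimal sketch of the two prediction styles, assuming the Hugging Face transformers library and the publicly available gpt2 and bert-base-uncased checkpoints (illustrative choices, not models prescribed above):

```python
from transformers import pipeline

# GPT-style (causal) prediction: continue the text with the next token.
generator = pipeline("text-generation", model="gpt2")
print(generator("The capital of France is", max_new_tokens=1))

# BERT-style (masked) prediction: fill in a blanked-out position in the sentence.
filler = pipeline("fill-mask", model="bert-base-uncased")
print(filler("The capital of [MASK] is Paris."))
```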
Completion
Generative LLMs complete the input text based on the text they were fed during training. They produce the next token according to how frequently it appeared in the training set in the same relative position with respect to the previous words (or, more precisely, tokens).
Input:
Output:
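To make the conditioning concrete, here is a hedged sketch of next-token probabilities from a causal model, assuming the Hugging Face transformers library, PyTorch, and the gpt2 checkpoint; the prompt string is only an illustrative placeholder:

```python
from torch.nn.functional import softmax
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The quick brown fox jumps over the lazy"  # illustrative placeholder
inputs = tokenizer(prompt, return_tensors="pt")

# Logits for the position right after the prompt, turned into a probability
# distribution over the whole vocabulary, conditioned on the previous tokens.
logits = model(**inputs).logits[0, -1]
probs = softmax(logits, dim=-1)

# The most probable continuations.
top = probs.topk(5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(idx))!r}: {p.item():.3f}")
```

Sampling from this distribution (or taking its most probable token) and appending the result to the prompt, token after token, is what yields a full completion.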
Instruction
Instruction-based training induces in LLMs a command-like vector subspace, allowing them to perform tasks on their input instead of just performing a single task such as completion.
So a question-answering prompt could be:
Input:
Output:
The instruction "answer the question" drives the model to the task-specific subspace of question answering which conditions the probabilities of the next tokens for the give task.