Text Classification
LLM architectures are not specifically tailored to classification: they generate probabilities over every token in a fixed vocabulary that is far larger than the number of categories in a typical classification problem. To circumvent this and still use an LLM as a classifier, we can apply top-k sampling with k set to the number of categories, obtaining probabilities only for the tokens corresponding to the classes we'd like to predict. We can then convert the token log-probabilities to class probabilities using a simple helper function.
Getting Started
Initialize PredibaseClient
First, import PredibaseClient from predibase and initialize it with your API token:
from predibase import PredibaseClient
pb = PredibaseClient(token='api token here')
Replace 'api token here' with your actual API token.
Helper Function
Before diving into examples, let's define a handy helper function to convert log probabilities to probabilities. This will make interpreting model outputs easier:
import numpy as np
from typing import List
from lorax.types import AlternativeToken
def logprobs_to_probs(tokens: List[AlternativeToken]):
    logprobs = np.array([t.logprob for t in tokens])
    probs = np.exp(logprobs)
    probs = probs / np.sum(probs)  # Normalize probabilities
    return probs
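To see the helper in action before calling the API, here is a quick self-contained check using the log-probabilities from the binary example below. A simple namedtuple stands in for lorax's AlternativeToken so the snippet runs without the client installed:

```python
import numpy as np
from collections import namedtuple

# Stand-in for lorax.types.AlternativeToken, just for this check.
Tok = namedtuple("Tok", ["id", "text", "logprob"])

def logprobs_to_probs(tokens):
    logprobs = np.array([t.logprob for t in tokens])
    probs = np.exp(logprobs)
    return probs / np.sum(probs)  # normalize so probabilities sum to 1

tokens = [Tok(6110, " True", -0.015991211), Tok(8250, " False", -4.15625)]
probs = logprobs_to_probs(tokens)
print(probs)        # ~[0.984, 0.016]
print(probs.sum())  # 1.0
```

Because only the top-k log-probabilities are returned, exponentiating and renormalizing yields a proper distribution over just the candidate classes.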
Example 1: Binary Classification
Suppose you have a binary classification task where you need to determine whether a given text conveys a positive sentiment or not. Let's see how LLMs can help:
num_categories = 2
prompt = "Answer with True or False, exclusively. Is the following text positive? ### Text: The weather is great ### Response:"
resp = pb.LLM("pb://deployments/mistral-7b-instruct").generate(
    prompt,
    options={
        'max_new_tokens': 1,
        'details': True,
        'do_sample': True,
        'top_k': num_categories,
        'return_k_alternatives': num_categories,
    },
)
The LLM returns log-probabilities for the top-k tokens:
alt_tokens = resp.tokens[0].alternative_tokens
# [AlternativeToken(id=6110, text=' True', logprob=-0.015991211),
# AlternativeToken(id=8250, text=' False', logprob=-4.15625)]
Using the helper function, we can convert the log-probabilities to probabilities for each class, like we'd expect for classification:
logprobs_to_probs(alt_tokens)
# array([0.9843307, 0.0156693])
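To turn these probabilities into a final prediction, take the argmax over the candidate tokens. A minimal sketch using the values returned above (the token texts and log-probabilities are copied from the example, so no API call is needed):

```python
import numpy as np

# Token texts and log-probabilities as returned in the example above.
token_texts = [" True", " False"]
logprobs = np.array([-0.015991211, -4.15625])

probs = np.exp(logprobs)
probs /= probs.sum()

predicted = token_texts[int(np.argmax(probs))].strip()
confidence = float(probs.max())
print(predicted, round(confidence, 4))  # True 0.9843
```

The per-class probabilities also give you a natural confidence score, which is useful for thresholding low-confidence predictions.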
Example 2: Multi-class Classification
Now, let's explore a multi-class classification scenario. Suppose you want to classify text into positive, negative, or neutral sentiments:
num_categories = 3
prompt = "Assign a number to the following text corresponding to the sentiment: 0 if positive, 1 if negative, 2 if neutral. Do not use any other categories. ### Text: The weather is great ### Number: "
resp = pb.LLM("pb://deployments/mistral-7b-instruct").generate(
    prompt,
    options={
        'max_new_tokens': 1,
        'details': True,
        'do_sample': True,
        'top_k': num_categories,
        'return_k_alternatives': num_categories,
    },
)
The LLM returns log-probabilities for the top-k tokens:
alt_tokens = resp.tokens[0].alternative_tokens
# [AlternativeToken(id=28734, text='0', logprob=-0.02319336),
# AlternativeToken(id=28750, text='2', logprob=-4.28125),
# AlternativeToken(id=28740, text='1', logprob=-4.71875)]
Again, we can use the helper function to convert the log-probabilities to probabilities for each class:
logprobs_to_probs(alt_tokens)
# array([0.97724432, 0.01382779, 0.00892789])
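Note that the alternative tokens are ordered by probability, not by class label: here the order is '0', '2', '1'. If downstream code expects probabilities indexed by class (position i holding the probability of class i), reorder them first. A sketch using the values above:

```python
import numpy as np

# (token_text, logprob) pairs as returned in the example above,
# ordered by probability: '0', '2', '1'.
alt = [("0", -0.02319336), ("2", -4.28125), ("1", -4.71875)]

probs = np.exp([lp for _, lp in alt])
probs /= probs.sum()

# Reindex so position i holds the probability of class i.
by_class = np.zeros(3)
for (text, _), p in zip(alt, probs):
    by_class[int(text)] = p

print(by_class)  # ~[0.977, 0.009, 0.014]
```

This mapping step is easy to forget and will silently shuffle class probabilities if skipped.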
Conclusion
Using Predibase for classification tasks is simple yet powerful. By leveraging large language models, you can achieve accurate results with minimal effort. Experiment with fine-tuning on your classification dataset for tailored results.