Text Classification

The architecture of LLMs is not specifically tailored to classification: the model generates probabilities over every token in a fixed vocabulary that is far larger than the number of categories in a typical classification problem. To work around this and still use an LLM as a classifier, we can use top-k sampling to obtain probabilities only for the categories we'd like to predict by setting k equal to the number of categories. We can then convert the token log-probabilities to class probabilities with a simple helper function.

Getting Started

Initialize PredibaseClient

First, import PredibaseClient from predibase and initialize it with your API token:

from predibase import PredibaseClient

pb = PredibaseClient(token='api token here')

Replace 'api token here' with your actual API token.

Helper Function

Before diving into examples, let's define a handy helper function to convert log probabilities to probabilities. This will make interpreting model outputs easier:

import numpy as np
from typing import List
from lorax.types import AlternativeToken

def logprobs_to_probs(tokens: List[AlternativeToken]) -> np.ndarray:
    logprobs = np.array([t.logprob for t in tokens])
    probs = np.exp(logprobs)
    probs = probs / np.sum(probs)  # Normalize so the probabilities sum to 1
    return probs
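To sanity-check the helper before calling a live deployment, we can run it on synthetic log-probabilities. The `AlternativeToken` dataclass below is a minimal stand-in for `lorax.types.AlternativeToken`, used here only so the sketch runs offline:

```python
import numpy as np
from dataclasses import dataclass
from typing import List

# Minimal stand-in for lorax.types.AlternativeToken (hypothetical substitute,
# so this example runs without a live deployment).
@dataclass
class AlternativeToken:
    id: int
    text: str
    logprob: float

def logprobs_to_probs(tokens: List[AlternativeToken]) -> np.ndarray:
    logprobs = np.array([t.logprob for t in tokens])
    probs = np.exp(logprobs)
    probs = probs / np.sum(probs)  # Normalize so the probabilities sum to 1
    return probs

# Two synthetic log-probabilities: exp(-0.1) ~ 0.905, exp(-2.3) ~ 0.100
tokens = [
    AlternativeToken(id=0, text=' True', logprob=-0.1),
    AlternativeToken(id=1, text=' False', logprob=-2.3),
]
probs = logprobs_to_probs(tokens)
print(probs)        # ~[0.900, 0.100]
print(probs.sum())  # 1.0
```

Note that exponentiating the log-probabilities and renormalizing yields a proper distribution over just the returned tokens, which is exactly what we want for classification.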

Example 1: Binary Classification

Suppose you have a binary classification task where you need to determine whether a given text conveys a positive sentiment or not. Let's see how LLMs can help:

num_categories = 2

prompt = "Answer with True or False, exclusively. Is the following text positive? ### Text: The weather is great ### Response:"

resp = pb.LLM("pb://deployments/mistral-7b-instruct").generate(
    prompt,
    options={
        'max_new_tokens': 1,
        'details': True,
        'do_sample': True,
        'top_k': num_categories,
        'return_k_alternatives': num_categories,
    },
)

The LLM returns log-probabilities for the top-k tokens:

alt_tokens = resp.tokens[0].alternative_tokens
# [AlternativeToken(id=6110, text=' True', logprob=-0.015991211),
# AlternativeToken(id=8250, text=' False', logprob=-4.15625)]

Using the helper function, we can convert the log-probabilities to probabilities for each class, like we'd expect for classification:

logprobs_to_probs(alt_tokens)
# array([0.9843307, 0.0156693])
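To turn these probabilities into a final prediction, we can pair each alternative token's text with its probability and take the argmax. The snippet below is a runnable sketch that reuses the log-probabilities from the response above, with a hypothetical stand-in dataclass replacing `lorax.types.AlternativeToken`:

```python
import numpy as np
from dataclasses import dataclass

# Stand-in for lorax.types.AlternativeToken (hypothetical, for a runnable sketch).
@dataclass
class AlternativeToken:
    id: int
    text: str
    logprob: float

# The alternative tokens returned in the example above.
alt_tokens = [
    AlternativeToken(id=6110, text=' True', logprob=-0.015991211),
    AlternativeToken(id=8250, text=' False', logprob=-4.15625),
]

logprobs = np.array([t.logprob for t in alt_tokens])
probs = np.exp(logprobs)
probs = probs / probs.sum()

# Pair each class label with its probability, then take the argmax prediction.
class_probs = {t.text.strip(): p for t, p in zip(alt_tokens, probs)}
prediction = max(class_probs, key=class_probs.get)
print(class_probs)  # {'True': 0.984..., 'False': 0.015...}
print(prediction)   # True
```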

Example 2: Multi-class Classification

Now, let's explore a multi-class classification scenario. Suppose you want to classify text into positive, negative, or neutral sentiments:

num_categories = 3

prompt = "Assign a number to the following text corresponding to the sentiment: 0 if positive, 1 if negative, 2 if neutral. Do not use any other categories. ### Text: The weather is great ### Number: "

resp = pb.LLM("pb://deployments/mistral-7b-instruct").generate(
    prompt,
    options={
        'max_new_tokens': 1,
        'details': True,
        'do_sample': True,
        'top_k': num_categories,
        'return_k_alternatives': num_categories,
    },
)

The LLM returns log-probabilities for the top-k tokens:

alt_tokens = resp.tokens[0].alternative_tokens
# [AlternativeToken(id=28734, text='0', logprob=-0.02319336),
# AlternativeToken(id=28750, text='2', logprob=-4.28125),
# AlternativeToken(id=28740, text='1', logprob=-4.71875)]

Again, we can use the helper function to convert the log-probabilities to probabilities for each class:

logprobs_to_probs(alt_tokens)
# array([0.97724432, 0.01382779, 0.00892789])
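Since the prompt asked for numeric labels, a final step is to map those tokens back to human-readable sentiment names. The sketch below reuses the log-probabilities from the response above; the stand-in dataclass and `label_names` mapping are assumptions for illustration. Note that the model returns the alternatives sorted by log-probability, not by label:

```python
import numpy as np
from dataclasses import dataclass

# Stand-in for lorax.types.AlternativeToken (hypothetical, for a runnable sketch).
@dataclass
class AlternativeToken:
    id: int
    text: str
    logprob: float

# Alternative tokens from the multi-class example above,
# sorted by log-probability rather than by label.
alt_tokens = [
    AlternativeToken(id=28734, text='0', logprob=-0.02319336),
    AlternativeToken(id=28750, text='2', logprob=-4.28125),
    AlternativeToken(id=28740, text='1', logprob=-4.71875),
]

# Mapping from the numeric labels in the prompt to sentiment names.
label_names = {'0': 'positive', '1': 'negative', '2': 'neutral'}

logprobs = np.array([t.logprob for t in alt_tokens])
probs = np.exp(logprobs)
probs = probs / probs.sum()

sentiment_probs = {label_names[t.text]: p for t, p in zip(alt_tokens, probs)}
prediction = max(sentiment_probs, key=sentiment_probs.get)
print(sentiment_probs)  # {'positive': 0.977..., 'neutral': 0.013..., 'negative': 0.008...}
print(prediction)       # positive
```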

Conclusion

Using Predibase for classification tasks is simple yet powerful. By leveraging large language models, you can achieve accurate results with minimal effort. Experiment with fine-tuning on your classification dataset for tailored results.