The AugmentationConfig class defines the parameters for augmenting a dataset with synthetic examples. It specifies the model that generates the examples, the augmentation strategy, the number of examples to generate, and the number of seed samples to use.

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| base_model | string | Yes | | The OpenAI model to use for generating synthetic examples. Must be one of: gpt-4-turbo, gpt-4-0125-preview, gpt-4-1106-preview, gpt-4o, gpt-4o-2024-08-06, gpt-4o-mini. |
| num_samples_to_generate | integer | No | 1000 | The number of synthetic examples to generate. Must be greater than or equal to 1. |
| num_seed_samples | integer/string | No | all | The number of seed samples to use. Can be an integer >= 1 or all to use all available samples. |
| augmentation_strategy | string | No | mixture_of_agents | The strategy to use for augmentation. Must be either single_pass or mixture_of_agents. |
| task_context | string | No | "" | User-provided task context for generating candidates. Helps guide the augmentation process. |
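
The constraints listed above can be checked locally before a configuration is built. The following is a minimal sketch only: `check_augmentation_params`, `ALLOWED_BASE_MODELS`, and `ALLOWED_STRATEGIES` are hypothetical names introduced for illustration and are not part of the predibase SDK. The rules simply mirror the parameter descriptions on this page.

```python
# Hypothetical helper, NOT part of the predibase SDK. It mirrors the
# constraints documented for AugmentationConfig so values can be checked locally.
ALLOWED_BASE_MODELS = {
    "gpt-4-turbo",
    "gpt-4-0125-preview",
    "gpt-4-1106-preview",
    "gpt-4o",
    "gpt-4o-2024-08-06",
    "gpt-4o-mini",
}
ALLOWED_STRATEGIES = {"single_pass", "mixture_of_agents"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_augmentation_params(
    base_model,
    num_samples_to_generate=1000,
    num_seed_samples="all",
    augmentation_strategy="mixture_of_agents",
    task_context="",
):
    """Return a list of problems; an empty list means the values fit the documented rules."""
    problems = []
    if not isinstance(base_model, str) or base_model not in ALLOWED_BASE_MODELS:
        problems.append(f"base_model {base_model!r} is not one of the listed models")
    if not _is_int(num_samples_to_generate) or num_samples_to_generate < 1:
        problems.append("num_samples_to_generate must be an integer >= 1")
    if num_seed_samples != "all" and (
        not _is_int(num_seed_samples) or num_seed_samples < 1
    ):
        problems.append('num_seed_samples must be an integer >= 1 or "all"')
    if augmentation_strategy not in ALLOWED_STRATEGIES:
        problems.append("augmentation_strategy must be single_pass or mixture_of_agents")
    if not isinstance(task_context, str):
        problems.append("task_context must be a string")
    return problems


if __name__ == "__main__":
    # An out-of-range sample count is reported; the other values pass.
    for problem in check_augmentation_params("gpt-4o", num_samples_to_generate=0):
        print(problem)
```

The helper returns a list of messages rather than raising, so every problem can be reported at once. How the SDK itself reacts to out-of-range values is not described on this page, so treat this as a pre-flight check rather than a description of SDK behavior.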

Example Usage

from predibase import AugmentationConfig

# Basic configuration: only base_model is required; other parameters use their defaults
config = AugmentationConfig(
    base_model="gpt-4-turbo"
)

# Advanced configuration: set every parameter explicitly
config = AugmentationConfig(
    base_model="gpt-4-turbo",
    num_samples_to_generate=500,
    num_seed_samples=10,
    augmentation_strategy="mixture_of_agents",
    task_context="Generate diverse examples for sentiment analysis"
)