Regex

regex(model, regex_str, sampler=multinomial())

Generate structured text in the language of a regular expression.

Parameters

model: An instance of Transformer that represents a model from the transformers library.
regex_str: The regular expression that the output must follow.
sampler: The sampling algorithm to use to generate token ids from the logits distribution.

Returns

A SequenceGeneratorAdapter instance that generates text constrained by the regular expression.
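
A minimal usage sketch, assuming the Outlines release documented on this page, where `outlines.models.transformers` loads a model by name and the returned generator is called with a prompt. The pattern, the helper names `follows_pattern` and `generate_ip`, and the `max_tokens` value are illustrative choices, not part of the library. The `outlines` import is kept inside the function so the pattern check can be run without a model installed.

```python
import re

# Illustrative pattern for an IPv4 address (an assumption for this example).
OCTET = r"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
IP_PATTERN = rf"({OCTET}\.){{3}}{OCTET}"


def follows_pattern(text: str, pattern: str = IP_PATTERN) -> bool:
    """Return True if the whole string is in the language of the pattern."""
    return re.fullmatch(pattern, text) is not None


def generate_ip(prompt: str, model_name: str) -> str:
    """Hypothetical helper: generate text constrained to IP_PATTERN."""
    # Imported lazily; requires outlines and a transformers model.
    import outlines

    model = outlines.models.transformers(model_name)
    generator = outlines.generate.regex(model, IP_PATTERN)
    # The adapter is callable on a prompt; max_tokens bounds the output.
    return generator(prompt, max_tokens=30)
```

Because generation is constrained by `regex_str`, a completed output should satisfy `follows_pattern`. If `max_tokens` cuts generation short, the result may be only a prefix of a matching string.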

Source code in outlines/generate/regex.py
@singledispatch
def regex(model, regex_str: str, sampler: Sampler = multinomial()):
    """Generate structured text in the language of a regular expression.

    Parameters
    ----------
    model:
        An instance of `Transformer` that represents a model from the
        `transformers` library.
    regex_str:
        The regular expression that the output must follow.
    sampler:
        The sampling algorithm to use to generate token ids from the logits
        distribution.

    Returns
    -------
    A `SequenceGeneratorAdapter` instance that generates text constrained by the
    regular expression.

    """
    from outlines.processors import RegexLogitsProcessor

    logits_processor = RegexLogitsProcessor(regex_str, tokenizer=model.tokenizer)
    return SequenceGeneratorAdapter(model, logits_processor, sampler)
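
The function above delegates the constraint to a logits processor. The sketch below is a toy illustration of that general idea, not the `RegexLogitsProcessor` implementation: it hard-codes a tiny automaton for the pattern `[0-9]{3}-[0-9]{4}` and sets the logit of every token that would leave the pattern to negative infinity, so a sampler can never pick it. The names `advance` and `mask_logits`, the vocabulary, and the logit values are all assumptions made for the example.

```python
import math

# Toy automaton for [0-9]{3}-[0-9]{4}: the state is the number of
# characters consumed so far; index 3 must be "-", every other a digit.
PATTERN_LENGTH = 8
DIGITS = "0123456789"


def advance(state, token):
    """Return the state after consuming token, or None if it is rejected."""
    for ch in token:
        if state >= PATTERN_LENGTH:
            return None  # pattern already complete, nothing more allowed
        ok = (ch == "-") if state == 3 else (ch in DIGITS)
        if not ok:
            return None
        state += 1
    return state


def mask_logits(state, vocab, logits):
    """Keep logits of tokens the automaton accepts; mask the rest to -inf."""
    return [
        logit if advance(state, token) is not None else -math.inf
        for token, logit in zip(vocab, logits)
    ]


vocab = ["12", "5", "-", "ab", "3-4"]
logits = [0.1, 0.2, 0.3, 0.4, 0.5]
masked_at_start = mask_logits(0, vocab, logits)  # only "12" and "5" survive
masked_after_two = mask_logits(2, vocab, logits)  # only "5" and "3-4" survive
```

A real processor would build the automaton from `regex_str` and map it onto the tokenizer's vocabulary rather than hard-coding either.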