plexus.utils.quote_normalization module

Utilities for normalizing quote characters in text.

LLMs often return curly quotes (“, ”, ‘, ’) instead of straight quotes (", '). This module provides functions to normalize them for consistent text matching.

plexus.utils.quote_normalization.normalize_quotes(text: str) -> str

Normalize curly quotes to straight quotes for transcript matching.

Replaces Unicode curly quotes with their ASCII equivalents:

- U+201C (“) LEFT DOUBLE QUOTATION MARK -> U+0022 (") QUOTATION MARK
- U+201D (”) RIGHT DOUBLE QUOTATION MARK -> U+0022 (") QUOTATION MARK
- U+2018 (‘) LEFT SINGLE QUOTATION MARK -> U+0027 (') APOSTROPHE
- U+2019 (’) RIGHT SINGLE QUOTATION MARK -> U+0027 (') APOSTROPHE

Parameters

text : str

Text that may contain curly quotes

Returns

str

Text with all curly quotes replaced with straight quotes

Examples

>>> normalize_quotes('The agent said “hello” and the customer replied ‘yes’')
'The agent said "hello" and the customer replied \'yes\''
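
The four-character mapping listed above can be expressed as a single translation table. The following is a minimal sketch of that idea, not the module's actual source, which is not shown here. The names `_CURLY_TO_STRAIGHT` and `normalize_quotes_sketch` are hypothetical and used only for illustration.

```python
# Illustrative sketch only; the real normalize_quotes may be implemented differently.
# Translation table covering exactly the four code points documented above.
_CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # LEFT DOUBLE QUOTATION MARK  -> QUOTATION MARK
    "\u201d": '"',  # RIGHT DOUBLE QUOTATION MARK -> QUOTATION MARK
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK  -> APOSTROPHE
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
})


def normalize_quotes_sketch(text: str) -> str:
    """Return text with the four curly quote characters replaced by ASCII quotes."""
    return text.translate(_CURLY_TO_STRAIGHT)


# Normalize both sides before comparing an LLM-returned quote to a transcript.
llm_quote = "\u201chello\u201d"
transcript = 'The agent said "hello" and the customer replied'
found = normalize_quotes_sketch(llm_quote) in normalize_quotes_sketch(transcript)
```

Using `str.translate` handles all four replacements in one pass over the string and leaves every other character, including other non-ASCII punctuation, untouched.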