plexus.utils.quote_normalization module
Utilities for normalizing quote characters in text.
LLMs often return curly quotes (“, ”, ‘, ’) instead of straight quotes (", '). This module provides functions to normalize these for consistent text matching.
- plexus.utils.quote_normalization.normalize_quotes(text: str) → str
Normalize curly quotes to straight quotes for transcript matching.
Replaces Unicode curly quotes with their ASCII equivalents:
- U+201C (“) LEFT DOUBLE QUOTATION MARK -> U+0022 (") QUOTATION MARK
- U+201D (”) RIGHT DOUBLE QUOTATION MARK -> U+0022 (") QUOTATION MARK
- U+2018 (‘) LEFT SINGLE QUOTATION MARK -> U+0027 (') APOSTROPHE
- U+2019 (’) RIGHT SINGLE QUOTATION MARK -> U+0027 (') APOSTROPHE
Parameters
- text : str
Text that may contain curly quotes
Returns
- str
Text with all curly quotes replaced with straight quotes
Examples
>>> normalize_quotes('The agent said “hello” and the customer replied ‘yes’')
'The agent said "hello" and the customer replied \'yes\''
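The mapping described above can be sketched with a translation table. This is a hypothetical re-implementation for illustration only, not the plexus source: the name `normalize_quotes_sketch` and the sample strings are assumptions, and the real function may be written differently.

```python
# Illustrative sketch, NOT the actual plexus implementation.
# Maps the four curly quote code points to their ASCII equivalents.
_CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',  # RIGHT DOUBLE QUOTATION MARK
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
})


def normalize_quotes_sketch(text: str) -> str:
    """Return text with curly quotes replaced by straight quotes."""
    return text.translate(_CURLY_TO_STRAIGHT)


# Hypothetical matching scenario: an LLM quotes the transcript with curly
# quotes, so a raw substring check fails until the quote is normalized.
llm_quote = "\u201cI\u2019d like to cancel\u201d"
transcript = 'Customer: "I\'d like to cancel" my plan.'

print(llm_quote in transcript)                           # False
print(normalize_quotes_sketch(llm_quote) in transcript)  # True
```

A translation table handles all four replacements in a single pass and leaves every other character, including other non-ASCII text, unchanged.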