plexus.processors.FilterCustomerOnlyProcessor module
- class plexus.processors.FilterCustomerOnlyProcessor.FilterCustomerOnlyProcessor(**parameters)
Bases: DataframeProcessor
Processor that filters transcript text to include only customer utterances.
This processor extracts only the portions of a transcript where the customer is speaking, removing all agent/representative utterances. It handles various speaker label formats (Customer:, Contact:, etc.).
Note: This processor does NOT remove the speaker identifiers themselves. To remove speaker labels like “Customer:”, chain this with RemoveSpeakerIdentifiersTranscriptFilter.
- Example usage in YAML:
    data:
      processors:
        - class: FilterCustomerOnlyProcessor
        - class: RemoveSpeakerIdentifiersTranscriptFilter
- process(dataframe: DataFrame) → DataFrame
Process the dataframe by filtering text to customer utterances only.
- Args:
dataframe: DataFrame with ‘text’ column containing transcripts
- Returns:
DataFrame with ‘text’ column filtered to customer speech only
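The filtering behaviour described above can be illustrated with a minimal sketch. This is not the processor's actual implementation: the function name filter_customer_only, the label lists, and the assumption that every utterance starts with a "Label:" prefix are all hypothetical, chosen only to show the idea of keeping customer segments while leaving their speaker labels in place.

```python
import re

# Hypothetical label sets; the labels the real processor recognises
# are not listed on this page.
CUSTOMER_LABELS = ("Customer", "Contact")
AGENT_LABELS = ("Agent", "Representative")

# Matches any known speaker label followed by a colon.
_LABEL_RE = re.compile(
    r"\b(" + "|".join(CUSTOMER_LABELS + AGENT_LABELS) + r"):",
    flags=re.IGNORECASE,
)
_CUSTOMER_SET = {label.lower() for label in CUSTOMER_LABELS}


def filter_customer_only(text: str) -> str:
    """Keep only segments introduced by a customer label (labels retained)."""
    matches = list(_LABEL_RE.finditer(text))
    kept = []
    for i, match in enumerate(matches):
        # A segment runs from its label to the next label, or to the end.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        if match.group(1).lower() in _CUSTOMER_SET:
            kept.append(text[match.start():end].strip())
    # Text with no recognised labels yields an empty string in this sketch;
    # the real processor may behave differently.
    return " ".join(kept)


transcript = (
    "Agent: Thanks for calling. Customer: My bill is wrong. "
    "Agent: Let me check. Contact: Thank you."
)
print(filter_customer_only(transcript))
# → Customer: My bill is wrong. Contact: Thank you.
```

With a pandas DataFrame, a function like this could be applied to the 'text' column, for example via dataframe['text'].apply(filter_customer_only). The speaker labels remain in the output, which is why the note above suggests chaining with RemoveSpeakerIdentifiersTranscriptFilter.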