plexus.processors.FilterCustomerOnlyProcessor module

class plexus.processors.FilterCustomerOnlyProcessor.FilterCustomerOnlyProcessor(**parameters)

Bases: DataframeProcessor

Processor that filters transcript text to include only customer utterances.

This processor extracts only the portions of a transcript where the customer is speaking, removing all agent/representative utterances. It handles various speaker label formats (Customer:, Contact:, etc.).

Note: This processor does NOT remove the speaker identifiers themselves. To remove speaker labels like “Customer:”, chain this with RemoveSpeakerIdentifiersTranscriptFilter.
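The filtering described above can be illustrated with a minimal sketch. It is not the processor's actual implementation: the label lists, the regular expression, and the helper name keep_customer_utterances are all assumptions made for this example, and the real class may recognize other label formats. The sketch splits a transcript at each speaker label and keeps only the turns that start with a customer label, leaving the labels in place as the note above describes.

```python
import re

# Assumed label sets for illustration; the real processor may differ.
CUSTOMER_LABELS = ("Customer", "Contact")
AGENT_LABELS = ("Agent", "Representative")

_CUSTOMER_SET = {label.lower() for label in CUSTOMER_LABELS}

# Matches any known speaker label followed by a colon, e.g. "Customer:".
_SPEAKER = re.compile(
    r"\b(" + "|".join(CUSTOMER_LABELS + AGENT_LABELS) + r"):",
    re.IGNORECASE,
)


def keep_customer_utterances(text: str) -> str:
    """Return only the customer turns, with their speaker labels intact."""
    matches = list(_SPEAKER.finditer(text))
    kept = []
    for i, match in enumerate(matches):
        # A turn runs from its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        if match.group(1).lower() in _CUSTOMER_SET:
            kept.append(text[match.start():end].strip())
    return " ".join(kept)


if __name__ == "__main__":
    transcript = (
        "Agent: Hello, how can I help? Customer: My bill is wrong. "
        "Agent: Let me check. Contact: Thanks."
    )
    print(keep_customer_utterances(transcript))
```

Under the same assumptions, a helper like this could be applied to a pandas DataFrame with dataframe["text"].apply(keep_customer_utterances), which mirrors the 'text' column contract documented for process() below.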

Example usage in YAML:

    data:
      processors:
        - class: FilterCustomerOnlyProcessor
        - class: RemoveSpeakerIdentifiersTranscriptFilter

process(dataframe: DataFrame) → DataFrame

Process the dataframe by filtering text to customer utterances only.

Args:
    dataframe: DataFrame with a ‘text’ column containing transcripts

Returns:
    DataFrame with the ‘text’ column filtered to customer speech only