Describe the bug
The deny list capability only detects terms that are delimited by whitespace (or by the start or end of the text). A term followed or preceded by punctuation is missed.
To Reproduce
from presidio_analyzer import PatternRecognizer
deny_list = ["Mr", "Mrs", "Ms", "Dr", "Prof"]
deny_list_recognizer = PatternRecognizer(deny_list=deny_list, supported_entity="TITLES")
deny_list_recognizer.analyze(text="Mr Smith", entities=["TITLES"])
This results in:
[type: TITLES, start: 0, end: 2, score: 1.0]
Running:
deny_list_recognizer.analyze(text="Mr. Smith", entities=["TITLES"])
results in nothing being detected.
Expected behavior
Punctuation such as . , : ; ! ? immediately before or after a term should also count as a term boundary, so that "Mr" is detected in "Mr. Smith" just as it is in "Mr Smith".
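To make the expected matching concrete, here is a minimal sketch using only Python's `re` module. It is not Presidio's implementation: `build_deny_list_pattern` and `find_terms` are hypothetical helpers, and the choice of "not adjacent to a word character" as the boundary rule is an assumption about what a fix could look like.

```python
import re


def build_deny_list_pattern(deny_list):
    """Hypothetical helper (not part of presidio_analyzer)."""
    # Escape each term so characters such as "." are matched literally,
    # and put longer terms first so "Mrs" is tried before "Mr".
    terms = sorted(deny_list, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    # (?<!\w) and (?!\w): the term must not touch a letter, digit or
    # underscore on either side. Whitespace, punctuation and the
    # start/end of the text are all accepted as boundaries.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)")


def find_terms(pattern, text):
    """Return (term, start, end) for every deny list hit in text."""
    return [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]


pattern = build_deny_list_pattern(["Mr", "Mrs", "Ms", "Dr", "Prof"])
print(find_terms(pattern, "Mr Smith"))   # whitespace boundary
print(find_terms(pattern, "Mr. Smith"))  # punctuation boundary
print(find_terms(pattern, "Drive"))      # "Dr" inside a word: no match
```

With this rule, "Mr" is reported at offsets 0 to 2 for both "Mr Smith" and "Mr. Smith", while "Dr" inside "Drive" is not matched.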