Describe the bug
ISO 8601 timestamps with a positive or +00:00 offset are not fully recognized as DATE_TIME: the offset suffix is left outside the detected entity.
To Reproduce
1️⃣ Start the Analyzer Docker image
2️⃣ Send this request:
{
"text": "'Time with timezone offset (+2 hours)': 2024-03-15T14:30:00+02:00\r\n'End of year with explicit UTC offset': 2024-12-31T23:59:59+00:00",
"language": "en"
}
3️⃣ Text (when processed by the Anonymizer) is:
"'Time with timezone offset (+2 <DATE_TIME>)': <DATE_TIME>+02:00\r\n'End of year with explicit UTC offset': <DATE_TIME>+00:00"
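For anyone who wants to script the reproduction, here is a minimal sketch of step 2 using only the Python standard library. The URL is an assumption (it presumes the analyzer container is published on localhost port 5002), and `build_analyze_request` and `send` are hypothetical helper names, not part of Presidio.

```python
import json
import urllib.request

# Assumption: the analyzer container is reachable at this address.
# Adjust host and port to match your own `docker run -p` mapping.
ANALYZER_URL = "http://localhost:5002/analyze"

# Same payload as in step 2 of the report.
PAYLOAD = {
    "text": (
        "'Time with timezone offset (+2 hours)': 2024-03-15T14:30:00+02:00\r\n"
        "'End of year with explicit UTC offset': 2024-12-31T23:59:59+00:00"
    ),
    "language": "en",
}


def build_analyze_request(url=ANALYZER_URL, payload=PAYLOAD):
    """Build (but do not send) the POST request for the analyzer."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_analyze_request())` against a running container should return the analyzer results, whose start and end positions show whether the offset is included in each DATE_TIME span.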
Expected behavior
I would expect the response text to be:
"'Time with timezone offset (+2 <DATE_TIME>)': <DATE_TIME>\r\n'End of year with explicit UTC offset': <DATE_TIME>"
Question
Currently, with the default Docker image, these DATE_TIME entities are detected by the SpacyRecognizer.
Wouldn't it be better to extend the existing DateRecognizer with a regex that also covers ISO 8601?
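To make the suggestion concrete, here is a rough sketch of what such a regex could look like. The pattern and the names `ISO_8601_WITH_OFFSET` and `find_iso_datetimes` are my own illustration, not taken from the existing DateRecognizer, and the pattern only covers the extended `YYYY-MM-DDThh:mm:ss` form with an optional fractional second and a mandatory `Z` or `±hh:mm` offset.

```python
import re

# Hypothetical pattern for illustration only; not Presidio's own regex.
ISO_8601_WITH_OFFSET = re.compile(
    r"\b\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])"  # date: YYYY-MM-DD
    r"T(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d"                 # time: hh:mm:ss
    r"(?:\.\d+)?"                                         # optional fraction
    r"(?:Z|[+-](?:[01]\d|2[0-3]):[0-5]\d)"                # Z or +hh:mm / -hh:mm
    r"(?!\d)"                                             # no trailing digit
)


def find_iso_datetimes(text):
    """Return every ISO 8601 timestamp (offset included) found in text."""
    return [match.group(0) for match in ISO_8601_WITH_OFFSET.finditer(text)]


sample = (
    "'Time with timezone offset (+2 hours)': 2024-03-15T14:30:00+02:00\r\n"
    "'End of year with explicit UTC offset': 2024-12-31T23:59:59+00:00"
)
print(find_iso_datetimes(sample))
```

On the text from the report, this matches both timestamps with their `+02:00` and `+00:00` suffixes included. I assume a pattern like this could be registered through Presidio's `Pattern` and `PatternRecognizer` classes, but I have not verified how it would interact with the spaCy-based results.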