Conversation
Do not add "T" separator if it's not in the authored HTML. Normalize timezone offsets to remove colons Imply timezone offset if included in another dt- in the same microformat
Mf2/Parser.php
Outdated
| foreach ($dateParts as $part) { | ||
| // Is this part a full ISO8601 datetime? | ||
| if (preg_match('/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(?::\d{2})?(?:Z?[+|-]\d{2}:?\d{2})?$/', $part)) { | ||
| if (preg_match('/^\d{4}-\d{2}-\d{2}T?\d{2}:\d{2}(?::\d{2})?(?:Z?[+|-]\d{2}:?\d{2})?$/', $part)) { |
There was a problem hiding this comment.
Why the added ? here? That makes the T optional. Should this instead be checking for a as well as T?
Mf2/Parser.php
Outdated
| } elseif (preg_match('/\d{4}-\d{2}-\d{2}/', $part) and empty($datePart)) { | ||
| // Is the current part a valid date AND no other date representation has been found? | ||
| $datePart = $part; | ||
| } elseif (preg_match('/(Z?[+|-]\d{2}:?\d{2})/', $part) and empty($timezonePart)) { |
There was a problem hiding this comment.
This should match Z OR +00:00, so probably a better regex would be
/(Z|[+-]\d{2}:?\d{2})/
There was a problem hiding this comment.
Since the timezone may also only be hours, updated this to /^(Z|[+-]\d{2}:?(\d{2})?)$/
Mf2/Parser.php
Outdated
| * form the full date-time using the most recently parsed dt- value | ||
| */ | ||
| if ((preg_match('/^\d{1,2}:\d{1,2}(Z?[+|-]\d{2}:?\d{2})?/', $dtValue) or preg_match('/^\d{1,2}[a|p]m/', $dtValue)) && !empty($dates)) { | ||
| if ((preg_match('/^\d{1,2}:\d{1,2}(Z?[+|-]\d{2}:?\d{2}?)?/', $dtValue) or preg_match('/^\d{1,2}[a|p]m/', $dtValue)) && !empty($dates)) { |
There was a problem hiding this comment.
What's the reason for making the minutes part optional here? It looks like the change is adding the ? to the minutes part of the timezone offset. Like the above, this should probably be changed to (Z|[+-]\d{2}:?\d{2})
I know this wasn't your change, but [a|p] and [+|-] is incorrect, since the regex is using [] you only need [ap] and [+-].
There was a problem hiding this comment.
Hour-only timezone offsets (+XX, -XX) are valid according to the VCP spec. I updated the am/pm regex in both locations to handle optional periods as well [ap]\.?m\.?
Edit: Re-reading, hour-only timezone offsets appear to only be valid if they appear in their own VCP element, separate from the time. Not sure if that's an oversight in the time+timezone section or not.
Mf2/Parser.php
Outdated
| foreach ($temp_dates as $propName => $data) { | ||
| foreach ( $data as $dtValue ) { | ||
|
|
||
| if ( $impliedTimezone && preg_match('/[+|-]\d{2}\d{2}/i', $dtValue, $matches) == 0 ) { |
There was a problem hiding this comment.
Do you need to match the colon here as well?
/[+-]\d{2}:?\d{2}/
There was a problem hiding this comment.
This is after the new method normalizeTimezoneOffset() is called, which removes the colon. This isn't explicitly stated in the spec, but I inferred it from: (bold is my emphasis)
However the colons ":" separating the hours and minutes of any timezone offset are optional and discouraged in order to make it less likely that a timezone offset will be confused for a time.
|
@aaronpk Updated.the regexes and TZ normalization. Can you review? |
Fixes #115 and #126