Commit 4427b2e
Mark UTF-8 strings emitted by mbstring functions as valid UTF-8
We now have a couple of mbstring functions which have fast paths for
strings marked as 'valid UTF-8', and more will likely follow. So that
these fast paths can be used more often, mark UTF-8 strings emitted by
mbstring as 'valid UTF-8'. This is always correct, because mbstring
never returns invalid UTF-8 as the result of a conversion (or similar)
operation.
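To illustrate the idea, here is a minimal standalone sketch in C. It does not use the Zend API: `demo_str`, `DEMO_STR_VALID_UTF8`, and all `demo_*` functions are hypothetical stand-ins for a flagged string type, the `IS_STR_VALID_UTF8` flag, a conversion routine, and a consumer with a fast path. The producer sets the flag because its output is valid by construction; the consumer skips validation when the flag is present.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flag and string type; stand-ins for IS_STR_VALID_UTF8
 * and a flagged string, not the real Zend definitions. */
#define DEMO_STR_VALID_UTF8 (1u << 0)

typedef struct {
    uint32_t flags;
    size_t len;
    unsigned char *val;
} demo_str;

/* Slow path: full validation (rejects truncated sequences, overlong
 * forms, surrogates, and code points above U+10FFFF). */
static bool demo_utf8_validate(const unsigned char *p, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = p[i];
        size_t need;
        uint32_t cp, min;
        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return false;
        if (len - i <= need) return false; /* truncated sequence */
        for (size_t k = 1; k <= need; k++) {
            if ((p[i + k] & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += need + 1;
    }
    return true;
}

/* Producer: Latin-1 to UTF-8. Every input byte maps to a code point
 * below U+0100, so the output is valid by construction and can be
 * marked as such. The caller frees out.val. */
static demo_str demo_latin1_to_utf8(const unsigned char *in, size_t len)
{
    demo_str out = { 0, 0, NULL };
    out.val = malloc(len * 2 + 1);
    if (!out.val) return out;
    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            out.val[out.len++] = in[i];
        } else {
            out.val[out.len++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out.val[out.len++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out.val[out.len] = '\0';
    out.flags |= DEMO_STR_VALID_UTF8;
    return out;
}

/* Consumer: the fast path trusts the flag and skips the scan. */
static bool demo_check_utf8(const demo_str *s)
{
    if (s->flags & DEMO_STR_VALID_UTF8) return true;
    return demo_utf8_validate(s->val, s->len);
}
```

The more producers set the flag, the more often consumers take the constant-time branch instead of scanning the whole string.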
Internally, we do have a conversion mode which deliberately emits
invalid UTF-8 in some cases. (This is done to prevent unwanted matches
when we are converting strings to UTF-8 before performing matching
operations on them.) For such strings, don't set the 'valid UTF-8' flag.
It probably wouldn't hurt anything to set it, because strings generated
using that special conversion mode should *never* be returned to
userland, and I don't think we do anything with them which cares about
the IS_STR_VALID_UTF8 flag... but still, it would likely cause
confusion for developers.

1 parent e7c0f4e commit 4427b2e
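The rule for the special conversion mode described in the commit message can be sketched as follows. This is an illustration under assumptions, not mbstring's actual code: the mode names, the choice of ASCII as the source encoding, and the use of byte 0xFF as the deliberately invalid marker are all hypothetical. The point is only that the flag depends on the conversion mode, so output from the matching mode is never marked valid.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for IS_STR_VALID_UTF8. */
#define DEMO_STR_VALID_UTF8 (1u << 0)

/* Hypothetical mode names; the commit message does not name them. */
typedef enum { DEMO_MODE_NORMAL, DEMO_MODE_MATCHING } demo_mode;

/* Convert 7-bit ASCII to UTF-8 into out (same length as the input) and
 * return the flags the result string should carry.
 * - Normal mode: bad input bytes become '?', so the output is valid.
 * - Matching mode: bad input bytes become 0xFF, a byte that never
 *   occurs in valid UTF-8 and so cannot match real text. The output is
 *   left unmarked, even if no 0xFF happened to be emitted. */
static uint32_t demo_ascii_to_utf8(const unsigned char *in, size_t len,
                                   unsigned char *out, demo_mode mode)
{
    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80)
            out[i] = in[i];
        else if (mode == DEMO_MODE_MATCHING)
            out[i] = 0xFF; /* assumed marker byte, for illustration only */
        else
            out[i] = '?';
    }
    return (mode == DEMO_MODE_NORMAL) ? DEMO_STR_VALID_UTF8 : 0;
}
```

Keying the flag on the mode rather than on whether an invalid byte was actually written keeps the rule simple: a developer never has to wonder whether a matching-mode string might carry the flag.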
File tree
5 files changed, +32 -12 lines changed
- ext/mbstring
  - libmbfl/mbfl