Commit 3e7acf9
committed
Remove mbstring identify filters
mbstring had an 'identify filter' for almost every supported text encoding
which was used when auto-detecting the most likely encoding for a string.
It would run over the string and set a 'flag' if it saw anything which
did not appear likely to be the encoding in question.
One problem with this scheme was that encodings which merely appeared
less likely to be the correct one were completely rejected, even if there
was no better candidate. Another problem was that the 'identify filters'
had a huge amount of code duplication with the 'conversion filters'.
Eliminate the identify filters. Instead, when auto-detecting text
encoding, use conversion filters to see whether the input string is valid
in candidate encodings or not. At the same type, watch the type of
codepoints which the string decodes to and mark it as less likely if
non-printable characters (ESC, form feed, bell, etc.) or 'private use
area' codepoints are seen.
Interestingly, one old test case in which JIS text was misidentified
as UTF-8 (and this wrong behavior was enshrined in the test) was 'fixed'
and the JIS string is now auto-detected as JIS.1 parent a416f93 commit 3e7acf9
File tree
121 files changed
+120
-2819
lines changed- ext/mbstring
- libmbfl
- filters
- mbfl
- tests
Some content is hidden
Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
121 files changed
+120
-2819
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
161 | 161 | | |
162 | 162 | | |
163 | 163 | | |
164 | | - | |
165 | 164 | | |
166 | 165 | | |
167 | 166 | | |
| |||
177 | 176 | | |
178 | 177 | | |
179 | 178 | | |
180 | | - | |
| 179 | + | |
181 | 180 | | |
182 | 181 | | |
183 | 182 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
43 | | - | |
| 43 | + | |
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
49 | 49 | | |
50 | | - | |
| 50 | + | |
51 | 51 | | |
52 | 52 | | |
53 | 53 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
31 | 31 | | |
32 | 32 | | |
33 | 33 | | |
34 | | - | |
35 | | - | |
36 | 34 | | |
37 | 35 | | |
38 | 36 | | |
| |||
44 | 42 | | |
45 | 43 | | |
46 | 44 | | |
47 | | - | |
48 | | - | |
49 | | - | |
50 | | - | |
51 | | - | |
52 | | - | |
53 | 45 | | |
54 | 46 | | |
55 | 47 | | |
| |||
88 | 80 | | |
89 | 81 | | |
90 | 82 | | |
91 | | - | |
92 | | - | |
93 | | - | |
94 | | - | |
95 | | - | |
96 | | - | |
97 | | - | |
98 | | - | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
34 | 34 | | |
35 | 35 | | |
36 | 36 | | |
37 | | - | |
38 | 37 | | |
39 | 38 | | |
40 | 39 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
33 | | - | |
34 | | - | |
35 | 33 | | |
36 | 34 | | |
37 | 35 | | |
| |||
45 | 43 | | |
46 | 44 | | |
47 | 45 | | |
48 | | - | |
49 | | - | |
50 | | - | |
51 | | - | |
52 | | - | |
53 | | - | |
54 | 46 | | |
55 | 47 | | |
56 | 48 | | |
| |||
108 | 100 | | |
109 | 101 | | |
110 | 102 | | |
111 | | - | |
112 | | - | |
113 | | - | |
114 | | - | |
115 | | - | |
116 | | - | |
117 | | - | |
118 | | - | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
28 | 28 | | |
29 | 29 | | |
30 | 30 | | |
31 | | - | |
32 | 31 | | |
33 | 32 | | |
34 | 33 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
31 | 31 | | |
32 | 32 | | |
33 | 33 | | |
34 | | - | |
35 | | - | |
36 | 34 | | |
37 | 35 | | |
38 | 36 | | |
| |||
46 | 44 | | |
47 | 45 | | |
48 | 46 | | |
49 | | - | |
50 | | - | |
51 | | - | |
52 | | - | |
53 | | - | |
54 | | - | |
55 | 47 | | |
56 | 48 | | |
57 | 49 | | |
| |||
101 | 93 | | |
102 | 94 | | |
103 | 95 | | |
104 | | - | |
105 | | - | |
106 | | - | |
107 | | - | |
108 | | - | |
109 | | - | |
110 | | - | |
111 | | - | |
112 | | - | |
113 | | - | |
114 | | - | |
115 | | - | |
116 | | - | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
34 | 34 | | |
35 | 35 | | |
36 | 36 | | |
37 | | - | |
38 | 37 | | |
39 | 38 | | |
40 | 39 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
35 | | - | |
36 | | - | |
37 | 35 | | |
38 | 36 | | |
39 | 37 | | |
| |||
77 | 75 | | |
78 | 76 | | |
79 | 77 | | |
80 | | - | |
81 | | - | |
82 | | - | |
83 | | - | |
84 | | - | |
85 | | - | |
86 | | - | |
87 | | - | |
88 | | - | |
89 | | - | |
90 | | - | |
91 | | - | |
92 | 78 | | |
93 | 79 | | |
94 | 80 | | |
| |||
322 | 308 | | |
323 | 309 | | |
324 | 310 | | |
325 | | - | |
326 | | - | |
327 | | - | |
328 | | - | |
329 | | - | |
330 | | - | |
331 | | - | |
332 | | - | |
333 | | - | |
334 | | - | |
335 | | - | |
336 | | - | |
337 | | - | |
338 | | - | |
339 | | - | |
340 | | - | |
341 | | - | |
342 | | - | |
343 | | - | |
344 | | - | |
345 | | - | |
346 | | - | |
347 | | - | |
348 | | - | |
349 | | - | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
36 | | - | |
37 | 36 | | |
38 | 37 | | |
39 | 38 | | |
40 | 39 | | |
41 | | - | |
42 | 40 | | |
43 | 41 | | |
44 | 42 | | |
| |||
0 commit comments