Conversation
|
Removed utf-8-validate, as not really needed. |
|
Wow, this is already 10% faster than the previous implementation. The main difference here is that we depend on bl. Anybody sees a problem with that? |
|
@heroboy please test. |
There was a problem hiding this comment.
won't this always fire? [] is still thruthy.
There was a problem hiding this comment.
you are right, good spot. Removed.
|
Thanks. It works.
the strings will not be converted to |
Why don't you send a PR, and we compare the perf? |
|
The second line is broken. |
|
|
Please try this one, it's based on StringDecoder, it should be good, and performance is equivalent as the current version. |
|
I think this one is right. for large file(about 3MB) So I think large file bench is better, it make big difference between |
|
what does |
|
Just replace |
|
awesome, are you ok to merge then? |
|
Very ok. |
|
@mafintosh any other opinion on this one? |
There was a problem hiding this comment.
toString() is not needed when we use a string decoder
|
@mcollina other than my minor nitpick comment i'm 👍 |
|
I think |
For future reference: $ git clone --depth 1 git@github.com:mcollina/split2.git |
|
@heroboy I've tried to have a test case where that Also doing |
|
I'm not very clear about the problem. If the input buffer is not a valid utf8 string, the |
|
@mafintosh any opinions on this one? |
|
|
|
@heroboy I've just pushed a test that reproduce the problem I'm speaking about: https://github.com/mcollina/split2/blob/utf-8-support/test.js#L270-L285 This writes a truncated utf-8 char. If that char is not completed, it does not push anything down regarding that char. |
|
But in other node module, if the input utf8 buffer is not completed,it will output the last "incorrect" chars. |
|
Done, let me know @heroboy. |
|
Great. I think there are no problems. And thank you very much make effort to fix this issue. |
Fixes #6.
However, I am not happy with this solution, because it depends on https://www.npmjs.com/package/utf-8-validate.
This is a highly popular module, so possibly we should look for a different solution that is JS-only, possibly the same approach that is used by https://github.com/mafintosh/csv-parser, which is very fast.
@mafintosh @maxogden I might need some of your help here