@Lea, I thought about whether or not it’d be achievable using regex lookaheads/behinds but I figured, even if JavaScript supported lookbehinds, it would be quite slow.
@Rick, you’re right; it would be tripped up by that, but it’s not valid JavaScript so I’m not bothered. "\\"" would throw a syntax error (since the first backslash escapes the second, there is nothing left to escape the second ").
@Vasco, it seems so; I read more about this technique over here: http://www.codeproject.com/KB/cs/jscompress.aspx. They seem to be using a similar method to remove comments. It’s funny how something that seems so simple can end up being quite complicated…
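For anyone following along, here is a rough sketch of the character-by-character idea discussed above. It is not the code from the post or from the linked article; `stripComments` is a hypothetical name, and the sketch assumes the input contains only single- and double-quoted strings (regex literals and template literals are ignored to keep it short).

```javascript
// Hypothetical helper, not the original post's code: removes // and /* */
// comments while copying string literals through untouched.
// Assumption: no regex literals or template literals in the input.
function stripComments(src) {
  let out = "";
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    const next = src[i + 1];
    if (ch === '"' || ch === "'") {
      // Copy the whole string literal, honouring backslash escapes.
      const quote = ch;
      out += ch;
      i++;
      while (i < src.length) {
        const c = src[i];
        if (c === "\\") {
          // Copy the backslash and the character it escapes as one unit,
          // so "\\" never leaves a stray backslash to escape the quote.
          out += src.slice(i, i + 2);
          i += 2;
          continue;
        }
        out += c;
        i++;
        if (c === quote) break; // unescaped closing quote ends the string
      }
    } else if (ch === "/" && next === "/") {
      // Line comment: skip up to, but not including, the newline.
      while (i < src.length && src[i] !== "\n") i++;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing */ (or to the end of input).
      const end = src.indexOf("*/", i + 2);
      i = end === -1 ? src.length : end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

// Usage: the comment is removed, the string is left alone.
console.log(stripComments('var s = "a\\\\"; /*Boo!*/'));
```

Because a backslash always consumes the character after it, a string that ends in an escaped backslash closes at the next quote, and comment markers inside a string are never treated as comments.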
I may be wrong, but it looks like it could still be tripped up by double backslashes… E.g. "a\\"" /*Boo!*/
I’ve stumbled across it several times in the past, most recently twice: first while writing a syntax highlighter, and then while writing a small parser for Google-style search queries. In the first case (the syntax highlighter), I decided it wasn’t worth the extra resources for such edge cases, since it was mainly for my personal use and I wanted something fast, even if I had to sacrifice 100% correctness.
In the second, more recent case, I decided to take an approach similar to yours, since such cases were going to be really common and it’s also a commercial project, so I can’t have search queries failing due to lazy parsing.
I’m also really interested to know whether there’s a better solution. I think perhaps it would be somehow possible by combining regex lookahead and lookbehind (in languages that support them; JS doesn’t support lookbehind), but I’m not very experienced with those two.