Me again - Weird PHP/XML problem
It's me again. I removed my previous post about my PHP/XML problem to further dive into it and try some more things. I've completely run out of ideas.
I have a PHP page reading an XML file. The XML file has some data with HTML entities, that is, characters written in the form &uuml; for a ü. When it's being read, inside my character handler function I assign the data to a variable. It was originally assigned to a complex structure, but as I simplified the code to track down the problem I noticed it happens no matter what I assign it to.
The character handler looks like this:
    function char_data($parser, $data) {
        global $stack, $id, $elements;
        ...
        if ($stack[count($stack) - 1] == 'TITLE') {
            $elements[$id] = $data;
            echo $elements[$id]; // this works fine
        }
        echo $elements[$id]; // this does NOT
    }
Basically, I assign the data to an associative array (the same thing happens if I assign it to a plain variable, so it's not the array's fault) and echo it. That displays fine. For instance, if the data is:
<TITLE>Hello G&uuml;nter!</TITLE>
It will display:
Hello Günter!
Then, once I exit the if-block, the text is cut off: everything up to and including the last HTML entity is gone, so the second echo statement prints only:
nter!
The problem doesn't occur if I assign the strings to variables by hand, only when reading from a plaintext XML file. Why does it work inside the if-block and not right afterwards? Isn't it the same data?! The variable is a global, so it has nothing to do with scope (it shouldn't, at least), and the data does get into the variable, since the echo after the if-block prints at least the tail end of it. The same thing happens if I don't put the assignment in an if-block and just echo the data inside the function: once I leave the function, it no longer displays correctly.
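In case it helps anyone reproduce this, here is a stripped-down sketch of the setup, not the real page. Only char_data, $stack, $id and $elements come from my code. The start_element/end_element handlers, the one-line sample XML and the $calls log are made up for illustration, and the sample uses the numeric reference &#252; instead of &uuml; so that it parses without a DTD. It records every string the parser passes to char_data, so it should show whether the title text arrives in one piece or in several.

```php
<?php
// Sketch only: everything except char_data, $stack, $id and $elements is hypothetical.
$stack = [];
$id = 0;
$elements = [];
$calls = [];   // hypothetical log of every string handed to char_data

function start_element($parser, $name, $attrs) {
    global $stack;
    $stack[] = $name;   // names arrive upper-cased, case folding is on by default
}

function end_element($parser, $name) {
    global $stack;
    array_pop($stack);
}

function char_data($parser, $data) {
    global $stack, $id, $elements, $calls;
    if (end($stack) === 'TITLE') {
        $elements[$id] = $data;   // same plain assignment as in the real handler
        $calls[] = $data;         // remember what this particular call received
    }
}

// Assumed sample input; the real file is far larger.
$xml = '<DOC><TITLE>Hello G&#252;nter!</TITLE></DOC>';

$parser = xml_parser_create();
xml_set_element_handler($parser, 'start_element', 'end_element');
xml_set_character_data_handler($parser, 'char_data');
if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}

var_dump($calls);           // one entry per call to char_data
var_dump($elements[$id]);   // whatever the last call assigned
```

If $calls ends up with more than one entry for the single TITLE element, then $elements[$id] only holds what the last call assigned, which would be worth comparing against the cut-off text.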
I've been banging my head about this problem for the past few days as it makes absolutely no sense at all. I really don't want to convert the entire file into a tree-like structure each time the page is called...it's a huge (>6MB) file and it takes long enough to parse as it is.
Any ideas? Suggestions? I've tried replacing the characters using htmlentities, html_entity_decode, and regular-expression search-and-replace, to no avail. Once I exit the if-block, the text always starts just after the last HTML entity and everything before it is lost.
