drchase wrote in php_dev

Me again - Weird PHP/XML problem

It's me again. I removed my previous post about my PHP/XML problem to further dive into it and try some more things. I've completely run out of ideas.


I have a PHP page reading an XML file. The XML file has some data with HTML entities, that is, characters represented in the form &#252; for a ü. When it's being read, inside my character handler function I assign the data to a variable. It was originally assigned to a complex structure, but as I simplified the code to track down the problem I noticed it happened no matter what I assign it to.

The character handler looks like this:
    function char_data($parser, $data) {
       global $stack, $id, $elements;
       ...
       if($stack[count($stack) - 1] == 'TITLE') {
          ...
          $elements[$id] = $data;
          echo $elements[$id]; // this works fine
          ...
       }
       echo $elements[$id]; // this does NOT
    }


Basically, I assign the data to an associative array (does the same thing if I just assign it to a variable, so it's not the assoc. array's fault) and echo it. Displays fine...for instance, if the data is:

<TITLE>Hello G&#252;nter!</TITLE>

It will display:
Hello Günter!

Then, once I exit the if-block, it cuts the text off at the last html entity, so the second echo statement would only return:
nter!

The problem doesn't occur if I assign the strings to variables by hand, only when reading from a plaintext XML file. Why does it work inside the if-block and not right afterwards? Isn't it the same data?! The variable is a global, so it has nothing to do with scope (it shouldn't, at least), and the data does get into the variable, since it almost successfully echoes it after the if-block terminates. The same thing happens if I don't put the assignment in an if-block and just echo the data inside the function: when I leave the function it no longer displays correctly.
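
One thing that might be worth ruling out is whether the parser hands the text of a single element to the character handler in several pieces rather than in one call. If it does, each echo inside the handler would print its own piece (so the page looks right), while a plain assignment would keep only the last piece. The following is just a minimal sketch to test that guess, not the real script: $calls, $title, start_element and end_element are made-up names, and the XML string is the example from above.

```php
<?php
// Sketch only: $calls, $title, start_element and end_element are hypothetical
// names, not taken from the original script.
$stack = [];   // names of the currently open elements, innermost last
$calls = [];   // every chunk char_data() receives inside <TITLE>
$title = '';   // chunks joined back together

function start_element($parser, $name, $attrs) {
    global $stack, $title;
    $stack[] = $name;            // names arrive upper-cased (case folding is on by default)
    if ($name === 'TITLE') {
        $title = '';             // start a fresh buffer for each title
    }
}

function end_element($parser, $name) {
    global $stack;
    array_pop($stack);
}

function char_data($parser, $data) {
    global $stack, $calls, $title;
    if (end($stack) === 'TITLE') {
        $calls[] = $data;        // log the chunk to see how the text is split
        $title .= $data;         // append rather than overwrite
    }
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'start_element', 'end_element');
xml_set_character_data_handler($parser, 'char_data');
xml_parse($parser, '<TITLE>Hello G&#252;nter!</TITLE>', true);

echo count($calls), " handler call(s)\n";
echo $title, "\n";
```

If the first line reports more than one call, the assignment in the handler is being overwritten on each call, and appending with .= (resetting the buffer when the element opens) would be the thing to try.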

I've been banging my head about this problem for the past few days as it makes absolutely no sense at all. I really don't want to convert the entire file into a tree-like structure each time the page is called...it's a huge (>6MB) file and it takes long enough to parse as it is.

Any ideas? Suggestions? I've tried doing a replacement of the characters with htmlentities, html_entity_decode, and regular expression search-and-replacements, to no avail. It always cuts it off and starts the text at the last html entity once I exit the if-block.