[cdac] Implement ISOSDacInterface::GetPEFileName in cDAC#106358
[cdac] Implement ISOSDacInterface::GetPEFileName in cDAC#106358elinor-fung merged 4 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @tommcdon |
| while (true) | ||
| { | ||
| // Read characters until we find the null terminator | ||
| char nameChar = _target.Read<char>(addr); | ||
| if (nameChar == 0) | ||
| break; | ||
|
|
||
| name.Add(nameChar); | ||
| addr += sizeof(char); | ||
| } |
There was a problem hiding this comment.
I would consider doing this slightly different. I would instead determine the entire length, find the null terminator and then reread the entire block of bytes in a single operation. I think it would be slightly easier to debug. I also don't know if endianness is important here. Metadata is always encoded as UTF-16LE, but I'm not sure about non-metadata allocated strings.
There was a problem hiding this comment.
non-metadata allocated strings would be big endian on a big endian machine
There was a problem hiding this comment.
I think this implementation is ... fine... but it would be better if we added an api to Target to read null terminated strings of both 16bit and 8 bit char types in target endianness. This is a thing which keeps coming up, and we shouldn't be re-implementing the wheel again and again.
There was a problem hiding this comment.
non-metadata allocated strings would be big endian on a big endian machine
Would they even be utf-16 at all?
(To be clear, I think this is kind of academic until someone sits down and does a coreclr port to POWER64 and actually makes some decision about what an LPCWSTR even is in that platform version of the CoreCLR PAL. I don't know what a reasonable answer might be. /cc @uweigand @directhex )
Do we want to stash an encoding somewhere in the runtime descriptor?
There was a problem hiding this comment.
but it would be better if we added an api to Target to read null terminated strings of both 16bit and 8 bit char types in target endianness.
It is more complicated than that. We need to know if the target pointer is into metadata, which is always LE. I was going to suggest the same thing, but the metadata angle makes it less obvious the correct answer.
There was a problem hiding this comment.
Even with the metadata angle, I think having APIs on Target reading strings in target endianness still makes sense.
We need to know if the target pointer is into metadata, which is always LE.
And then it is up to the caller / consumer of the target pointer to know that it shouldn't be read in target endianness and to read it their own way?
There was a problem hiding this comment.
I think having APIs on Target reading strings in target endianness still makes sense.
Sure, I agree with this. My point is, it isn't as simple as "read the string". In fact it can become very complicated if the API doesn't know the encoding. The API needs to know the encoding apriori or else it won't be able to detect null properly. Is it the "high order" byte of a UTF-16 or an actual null in UTF-8? The string reading API needs to know. The endianness for UTF-16 is easier, but still can be confusing to the caller. The API said it read UTF-16, but now the caller must consider the source and determine if it is in machine endian form or metadta endian form. Creating three APIs would be needed.
It would need to be something like:
string ReadUTF8String();
string ReadUTF16TargetEndianness();
string ReadUTF16MetadataEndianness();There was a problem hiding this comment.
I think @AaronRobinsonMSFT's apis here are good enough for now. As @lambdageek points out, dealing with actual big endian machines is quite unlikely to happen in the near term, and we can get away with having a "reasonable" model now, and tweak it to match reality if/when such a thing happens. I also don't think this refactor to using a Target level api should happen in this PR, but should be deferred to a follow-up PR which is focused on making the string handling sensible.
There was a problem hiding this comment.
Yeah, I was expecting multiple APIs - like what @AaronRobinsonMSFT wrote. I put in a TODO for adding/using Target APIs and will a follow-up change for that.
GetPathinLoadercontractContributes to #99302