[getdns-users] bindata string encoding?
Robert Edmonds
edmonds at debian.org
Thu Jul 9 20:15:32 UTC 2015
Melinda Shore wrote:
> On 7/9/15 10:42 AM, Robert Edmonds wrote:
> > I think this is backwards: if you have a byte sequence and an explicit
> > length, this allows for embedded NUL bytes, and you should use the
> > explicit length rather than assuming the byte sequence is a C-style
> > string and truncating it at the first NUL byte (or, worse, performing an
> > out-of-bounds read if it turned out this assumption was incorrect and
> > the sequence didn't contain a NUL byte).
>
> I'm kind of "meh" on that - I'm not sure that it's reasonable to
> assume the possible presence of a 0 byte in the middle of something
> that's agreed to be a C-format string.
Hi, Melinda:
I agree: if a field is defined to be a C-style NUL-terminated string,
then by definition it ends at the first \0 byte and the string cannot
contain embedded NULs. But the version of the spec I'm looking at
(https://getdnsapi.net/spec.html) only says that the 'version_string'
bindata field represents a "string", without specifying how the string
is encoded. My confusion comes about because the string is passed
through an interface (the getdns_bindata type) that also passes an
explicit length.
> There are definitely places we're punting the bounds-checking to the
> Python libraries and that may not be reasonable.
Yeah, AFAICT, the Python binding is just passing the bindata 'data'
field to PyString_FromString(), which then calls strlen() on it. So
there's no bounds-checking at all; it's relying on the 'data' field to
be NUL-terminated. That's why I recommend explicitly relying on the
bounds information that the bindata type provides :-)
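
To make the difference concrete, here is a minimal sketch in plain C. It
is not the binding's actual code: 'struct bindata' is a hypothetical
stand-in that only mirrors the size-plus-pointer shape of getdns_bindata,
and both helper names are made up for illustration. The first helper
copies exactly 'size' bytes, so embedded NULs survive and nothing past
the buffer is read. The second shows what a C-string reader would see,
but bounded by 'size' instead of an unbounded strlen().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for getdns_bindata: an explicit length plus a
 * pointer to raw bytes. Nothing promises that data is NUL-terminated. */
struct bindata {
    size_t size;
    uint8_t *data;
};

/* Copy exactly b->size bytes, then append a NUL for convenience.
 * Embedded NUL bytes are preserved and no byte past b->size is read.
 * Returns a malloc'd buffer the caller must free, or NULL on failure;
 * on success *out_len is set to the number of payload bytes. */
char *bindata_dup(const struct bindata *b, size_t *out_len)
{
    assert(b != NULL && out_len != NULL);
    char *copy = malloc(b->size + 1);
    if (copy == NULL)
        return NULL;
    if (b->size > 0)
        memcpy(copy, b->data, b->size);
    copy[b->size] = '\0';
    *out_len = b->size;
    return copy;
}

/* Length a C-string reader would report, bounded by b->size.
 * memchr() stops at the end of the buffer, unlike strlen(), so a
 * missing terminator yields b->size rather than an out-of-bounds read. */
size_t bindata_cstring_len(const struct bindata *b)
{
    assert(b != NULL);
    const uint8_t *nul = NULL;
    if (b->size > 0)
        nul = memchr(b->data, '\0', b->size);
    return nul != NULL ? (size_t)(nul - b->data) : b->size;
}
```

On the Python side, the Python 2 C API has PyString_FromStringAndSize(),
which takes the length explicitly; passing it the bindata size would
avoid the strlen() call altogether.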
--
Robert Edmonds
edmonds at debian.org