Comment 21 for bug 3918

John A Meinel (jameinel) wrote : Re: [Bug 3918] Re: bzr can be caused to error with filenames containing newlines


Andrew Bennetts wrote:
> I agree, it would be good to be able to accept these paths. I was a bit
> surprised to realise that \n (at least) in a filename still does not
> work in a 2a format repository. Obviously 2a will be the default format
> for quite some time now, but it would be interesting to see how hard it
> is to allow all bytes in filenames in the next development format. My
> guess is that it's not actually very hard now that we no longer use XML
> for inventories.
>

CHK pages are still split on '\n', IIRC. So in the intermediate term it
is quite difficult.

I think an individual row is split on '\0', and some of the file-ids, etc.
are split on other arbitrary chars like '\r'.

If we changed the byte storage to be length-prefixed instead of
delimited, then all of this would go away. Alternatively we could require
strictly '\0'-delimited fields, though that gets a bit hairy with nesting
and arbitrary-width stuff.
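
To make the length-prefixed idea concrete, here is a minimal sketch. It is not bzr's CHK serialiser; the function names and the 4-byte big-endian length prefix are assumptions chosen purely for illustration. Because each field announces its own length, no byte value needs to be reserved as a delimiter.

```python
import struct


def encode_fields(fields):
    """Serialise byte strings as <4-byte big-endian length><raw bytes>.

    Hypothetical layout for illustration, not the real CHK page format.
    """
    out = []
    for field in fields:
        out.append(struct.pack(">I", len(field)))
        out.append(field)
    return b"".join(out)


def decode_fields(data):
    """Inverse of encode_fields; never searches for a delimiter byte."""
    fields = []
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">I", data, pos)
        pos += 4
        if pos + length > len(data):
            raise ValueError("truncated field")
        fields.append(data[pos:pos + length])
        pos += length
    return fields


# Filenames containing '\n', '\0' and '\r' all round-trip unchanged.
names = [b"plain.txt", b"with\nnewline", b"with\x00nul\rand-cr"]
assert decode_fields(encode_fields(names)) == names
```

The cost is that the page is no longer line-oriented, so it cannot be inspected or split with ordinary text tools.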

If I think hard about it, we probably could unambiguously parse a CHK
page that had '\n' in the filenames. But it would require rewriting the
parser to not start with:

 lines = bytes.split('\n')

:)
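
As a rough illustration of why that first split is the problem, the sketch below uses a made-up page layout with one filename per line. The helper names are hypothetical and the real CHK page has more structure than this.

```python
def serialise_naive(filenames):
    # Hypothetical layout: one filename per line, nothing else.
    return b"\n".join(filenames)


def parse_naive(page):
    # Same first step as the parser quoted above: split on newline.
    return page.split(b"\n")


names = [b"README", b"odd\nname"]
parsed = parse_naive(serialise_naive(names))

# Two filenames went in, three "lines" come out: the embedded newline
# is indistinguishable from a record separator once the split has run.
assert parsed == [b"README", b"odd", b"name"]
```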

John
=:->