Comment 5 for bug 273189

Colin Watson (cjwatson) wrote : Re: non-ascii layout/encoding problems at "login" line

OK, I've tracked this down to a combination of a bug in getty, which is shipped by util-linux, and a bug in the way we start getty. Here's the relevant code in get_logname:

            if (op->eightbits) {
                ascval = c;
            } else if (c != (ascval = (c & 0177))) { /* "parity" bit on */
                for (bits = 1, mask = 1; mask & 0177; mask <<= 1)
                    if (mask & ascval)
                        bits++; /* count "1" bits */
                cp->parity |= ((bits & 1) ? 1 : 2);
            }
            /* Do erase, kill and end-of-line processing. */

            switch (ascval) {
            case CR:
            case NL:
                *bp = 0; /* terminate logname */
                cp->eol = ascval; /* set end-of-line char */
                break;
            case BS:
            case DEL:
            case '#':
                cp->erase = ascval; /* set erase character */
                if (bp > logname) {
                    (void) write(1, erase[cp->parity], 3);
                    bp--;
                }
                break;
            case CTL('U'):
            case '@':
                cp->kill = ascval; /* set kill character */
                while (bp > logname) {
                    (void) write(1, erase[cp->parity], 3);
                    bp--;
                }
                break;
            case CTL('D'):
                exit(0);
            default:
                if (!isascii(ascval) || !isprint(ascval)) {
                     /* ignore garbage characters */ ;
                } else if (bp - logname >= sizeof(logname) - 1) {
                    error(_("%s: input overrun"), op->tty);
                } else {
                    (void) write(1, &c, 1); /* echo the character */
                    *bp++ = ascval; /* and store it */
                }
                break;
            }

There are two main problems here. Firstly, we aren't running getty in eight-bit-clean mode on the Linux console, and I think we should be. This is the responsibility of system-services; finish-install will then have to be careful to undo this for serial consoles. Outside eight-bit-clean mode, getty masks each input byte with 0177, so the second byte of my test character (the £ sign, which is 0xC2 0xA3 in UTF-8) is interpreted as # (0x23) with the parity bit set, and getty emits an erase sequence (which gets broken because it's using the wrong parity, I think ...).
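
To make the masking concrete, here is a minimal sketch of what happens to the two UTF-8 bytes of £ when the high bit is stripped. The helper names strip_parity and masks_to_hash are hypothetical and exist only for illustration; only the `c & 0177` mask and the '#' erase case come from the code above.

```c
#include <stdbool.h>

/* Hypothetical helper: the same mask get_logname applies to each input
 * byte when op->eightbits is not set. */
static unsigned char strip_parity(unsigned char c)
{
    return c & 0177;
}

/* Hypothetical helper: true if the masked byte is '#', which the switch
 * in get_logname treats as an erase character. */
static bool masks_to_hash(unsigned char c)
{
    return strip_parity(c) == '#';
}
```

With these, strip_parity(0xC2) is 0x42 ('B') and strip_parity(0xA3) is 0x23 ('#'), so masks_to_hash(0xA3) is true: the second byte of £ lands in the erase case.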

Secondly, even in eight-bit-clean mode, getty rejects any character that fails either isascii() or isprint(); it's doing this in the C locale, so characters such as ğ and ş aren't counted as printable. I'm not sure what the correct resolution for this is, but it would be nice to permit non-ASCII usernames. However, this will likely require fixes elsewhere (for example, the installer rejects usernames that don't match /^[a-z][-a-z0-9]*$/, and I imagine that most software that processes usernames makes no attempt to convert them between encodings), so I'm marking this part of the bug as wishlist. If the first part of this bug is fixed, then at least typing a non-ASCII character won't corrupt the console.
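
As a rough sketch of both checks mentioned here: the first function mirrors the test in the default case of get_logname, and the second is a hand-written equivalent of the installer pattern /^[a-z][-a-z0-9]*$/. Both function names are hypothetical, and the second is not the installer's actual code, just an illustration of what that pattern accepts. The high-bit test stands in for isascii(), and the program is assumed to run in the default C locale.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the default-case test in get_logname.
 * (c & ~0x7F) == 0 is the check isascii() performs. */
static bool getty_accepts_byte(unsigned char c)
{
    return (c & ~0x7F) == 0 && isprint(c);
}

/* Hypothetical helper: hand-written match for /^[a-z][-a-z0-9]*$/. */
static bool installer_accepts_name(const char *name)
{
    /* First character must be a lowercase ASCII letter. */
    if (name[0] < 'a' || name[0] > 'z')
        return false;

    /* Remaining characters: lowercase letters, digits, or '-'. */
    for (const char *p = name + 1; *p != '\0'; p++) {
        bool ok = (*p >= 'a' && *p <= 'z') ||
                  (*p >= '0' && *p <= '9') ||
                  *p == '-';
        if (!ok)
            return false;
    }
    return true;
}
```

The UTF-8 bytes of ğ (0xC4 0x9F) and ş (0xC5 0x9F) all have the high bit set, so getty_accepts_byte rejects each of them, and a name beginning with either character fails installer_accepts_name at the first byte.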