Comment 19 for bug 507416

Dave Martin (dave-martin-arm) wrote : RE: [Bug 507416] Re: CONFIG_NEON=y causes platform lockups with certain application/platform combinations

> If I understand correctly, I think our goal is to make one
> kernel for all the imx51 silicon. On the good silicon, we

I think this is the right approach: we want to avoid fragmentation.

> enable all the NEON and VFP. On the bad silicon, we disable
> the NEON and add some hack in unaligned access code.
>
> So we are going to do the following things in the kernel:
> 0. turn on CONFIG_NEON=y
> 1. dynamically detect the silicon rev
> 2. if rev < 3, we disable NEON by setting ASEDIS
> 3. if rev < 3, we add some hack in alignment_init() of
> arch/arm/mm/alignment.c.
> 4. my patch -- 'remove NEON flag of HWCAP dynamically' -- is still helpful
> for user space to run non-NEON version code.
>
> But, I just think if we add some hack in alignment_init to
> handle all the unaligned accesses for both NEON and non-NEON code,
> we don't need to disable NEON at all, since this is an
> unaligned access issue not a specific NEON one.

I _think_ you are correct on this point, but I haven't tried it; also, the
ASEDIS fix is much easier than enabling full alignment faulting (which will
require new code which may more than double the size of
do_alignment_t32_to_handler() to translate the additional instructions).

Full alignment faulting will have a performance impact, but I don't know how
much without trying it, and Ubuntu is unlikely even to boot properly if the
necessary extensions to do_alignment_t32_to_handler() are missing. The
impact could be severe if there are a lot of memcpy() of unaligned buffers
etc., though this may not be the case (or memcpy() might be clever enough to
avoid it).

However, if we do have full alignment faulting and leave ASEDIS clear, we
may have the nice side-effect that aligned NEON code will be able to
execute, whilst being able to nuke unaligned use with SIGILL.
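
To make the intended behaviour concrete, here is a rough userspace sketch of
the policy described above. It is not kernel code: the enum, the function
names and the "non-NEON unaligned accesses get fixed up" outcome are my
assumptions for illustration, and real NEON alignment requirements depend on
the instruction, not just on a single access size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes for one memory access; names are illustrative only. */
enum access_outcome {
    ACCESS_RUNS_NATIVELY,   /* aligned: no alignment fault is taken */
    ACCESS_FIXED_UP,        /* unaligned, non-NEON: emulated by the fault handler */
    ACCESS_SIGILL           /* unaligned, NEON: the task gets SIGILL */
};

/* True if addr is a multiple of size; size is assumed to be a power of two. */
static bool is_aligned(uintptr_t addr, uint32_t size)
{
    return (addr & (uintptr_t)(size - 1)) == 0;
}

/*
 * Assumed policy with full alignment faulting enabled and ASEDIS left clear:
 * aligned accesses (NEON or not) never reach the handler, unaligned non-NEON
 * accesses are fixed up, and unaligned NEON accesses are refused.
 */
static enum access_outcome classify_access(uintptr_t addr, uint32_t size,
                                           bool is_neon)
{
    if (is_aligned(addr, size))
        return ACCESS_RUNS_NATIVELY;
    return is_neon ? ACCESS_SIGILL : ACCESS_FIXED_UP;
}
```

For example, classify_access(0x1004, 8, true) lands in the SIGILL case,
whereas the same address with size 4 runs natively.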

I'll try to report back on how much extra code might be needed, and how
complex it would be.
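
For reference, the simpler plan from the quoted list (steps 1, 2 and 4) boils
down to something like the sketch below. This is a standalone model, not a
patch: apply_rev_policy() and struct neon_policy are made-up names, and the
caller is assumed to have read the silicon rev and the current CPACR/HWCAP
values by whatever means the platform code provides. ASEDIS is bit 31 of
CPACR on ARMv7, and the HWCAP_NEON value is the one the ARM Linux headers use.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPACR_ASEDIS    (UINT32_C(1) << 31) /* CPACR bit 31: disable Advanced SIMD */
#define HWCAP_NEON      (UINT32_C(1) << 12) /* same bit value as ARM Linux HWCAP_NEON */
#define FIRST_GOOD_REV  3                   /* from this thread: rev < 3 is affected */

/* Hypothetical result type: the values the kernel would end up using. */
struct neon_policy {
    uint32_t cpacr;          /* value to write back to CPACR */
    uint32_t hwcap;          /* HWCAP bits advertised to user space */
    bool     alignment_hack; /* whether the alignment_init() workaround is wanted */
};

/*
 * On affected silicon: set ASEDIS so NEON instructions are undefined, drop
 * the NEON bit from HWCAP so user space picks non-NEON code paths, and ask
 * for the alignment workaround. On good silicon nothing is changed.
 */
static struct neon_policy apply_rev_policy(unsigned int rev, uint32_t cpacr,
                                           uint32_t hwcap)
{
    struct neon_policy p = { cpacr, hwcap, false };

    if (rev < FIRST_GOOD_REV) {
        p.cpacr |= CPACR_ASEDIS;
        p.hwcap &= ~HWCAP_NEON;
        p.alignment_hack = true;
    }
    return p;
}
```

Only the NEON bit is cleared, so VFP and the other HWCAP bits stay advertised
on the affected revisions.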