improve 2x-3x sha256sum performance on ppc64le due to current gcc optimization bug

Bug #1667407 reported by Gustavo Serra Scalet
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
coreutils (Debian)
Fix Released
Unknown
coreutils (Ubuntu)
Fix Released
Undecided
Matthias Klose
Xenial
Fix Released
Undecided
Unassigned
Yakkety
Fix Released
Undecided
Unassigned

Bug Description

[Impact]

 * Performance drop of 2x-3x on ppc64le when using sha256sum

 * Please backport this bug to last LTS as this application is widely used.

 * This fix is due to a optimization issue found on gcc v4.9 to gcc v7.0.

[Test Case]

 * Run sha256sum with a big file and measure the time it takes. The patch improves this timing up to 3 times faster.

[Regression Potential]

 * This patch is specifically for the sha256.o object, affecting only this binary.

[Other Info]

Michael Stone's improved patch:

Index: coreutils-8.26/Makefile.in
===================================================================
--- coreutils-8.26.orig/Makefile.in 2016-11-30 13:34:55.000000000 -0500
+++ coreutils-8.26/Makefile.in 2017-02-22 07:18:55.352394058 -0500
@@ -14661,6 +14661,10 @@

$(TEST_LOGS): $(PROGRAMS)

+ifeq ($(DEB_TARGET_ARCH), ppc64el)
+lib/sha256.o: CFLAGS+=-fno-schedule-insns
+endif
+
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

Original bug description:

The sha256sum provided by coreutils (without openssl) is performing
poorly with gcc versions >= 4.9 until 7.0 (currently under development).
The reason for that is the -fschedule-insns optimization that is used
with -O2. By simply deactivating it, there is a performance improvement
of 2 to 3 times.

I'm using Ubuntu 16.10 and the coreutils package version 8.25-2ubuntu2.

Please check the following closed debian bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854053

Be aware of the following conditions that are required:
* If ppc64le
* If gcc being used is >= 4.9 and < 7.0

Notes:
1) gcc-7 is not affected by this bug (verified on 20170129 snapshot).
2) clang is not affected by this bug (verified on v3.8 and v3.9).
3) strangely the sha512 is not affected by this.

Below a demonstration of how it performs on my POWER8 machine:

===================================================
$ (./configure && make -j9) > /dev/null && time src/sha256sum ~/ubuntu-16.10-server-ppc64el.iso
configure: WARNING: libacl development library was not found or not usable.
configure: WARNING: GNU coreutils will be built without ACL support.
configure: WARNING: libattr development library was not found or not usable.
configure: WARNING: GNU coreutils will be built without xattr support.
configure: WARNING: libcap library was not found or not usable.
configure: WARNING: GNU coreutils will be built without capability support.
configure: WARNING: libgmp development library was not found or not usable.
configure: WARNING: GNU coreutils will be built without GMP support.
src/who.c: In function 'print_user':
src/who.c:454:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
         int *a = utmp_ent->ut_addr_v6;
                    ^~~~~~~~
d14bdb413ea6cdc8d9354fcbc37a834b7de0c23f992deb0c6764d0fd5d65408e /home/gut/ubuntu-16.10-server-ppc64el.iso

real 0m18.670s
user 0m16.566s
sys 0m0.745s

$ # now with the following patch: ## Check Michael Stone's patch for an improved version.
$ diff Makefile.in ../Makefile.in
8989c8989
< @am__fastdepCC_TRUE@ $(COMPILE) -MT $@ -MD -MP -MF $$depbase.Tpo -c -o $@ $< &&\
---
> @am__fastdepCC_TRUE@ $(COMPILE) $$([ "$@" == "lib/sha256.o" ] && echo "-fno-schedule-insns") -MT $@ -MD -MP -MF $$depbase.Tpo -c -o $@ $< &&\
$ cp ../Makefile.in Makefile.in
$ (./configure && make -j9) > /dev/null && time src/sha256sum ~/ubuntu-16.10-server-ppc64el.iso
configure: WARNING: libacl development library was not found or not usable.
configure: WARNING: GNU coreutils will be built without ACL support.
configure: WARNING: libattr development library was not found or not usable.
configure: WARNING: GNU coreutils will be built without xattr support.
configure: WARNING: libcap library was not found or not usable.
configure: WARNING: GNU coreutils will be built without capability support.
configure: WARNING: libgmp development library was not found or not usable.
configure: WARNING: GNU coreutils will be built without GMP support.
src/who.c: In function 'print_user':
src/who.c:454:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
         int *a = utmp_ent->ut_addr_v6;
                    ^~~~~~~~
d14bdb413ea6cdc8d9354fcbc37a834b7de0c23f992deb0c6764d0fd5d65408e /home/gut/ubuntu-16.10-server-ppc64el.iso

real 0m5.903s
user 0m5.560s
sys 0m0.255s

tags: added: yakkety
Changed in coreutils (Ubuntu):
assignee: nobody → Matthias Klose (doko)
Revision history for this message
Gustavo Serra Scalet (gut) wrote :

One other possibility (as performed by Fedora) is to have libcrypto (from openssl) configured through the "--with-openssl=yes" parameter.

Changed in coreutils (Debian):
status: Unknown → Fix Released
Revision history for this message
Matthias Klose (doko) wrote :

fixed in zesty with 8.26-3ubuntu1

Changed in coreutils (Ubuntu):
status: New → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Could we get the bug details updated to follow the SRU template?

Also, checking the SRU upload, one thing I don't particularly like is that the attached patch has no description. Yes, the change is self-explanatory, but some context in patch descriptions is always welcome. Won't block on that though as I see some of the existing patches in the package have empty 'Description:' fields anyway...

description: updated
Revision history for this message
Gustavo Serra Scalet (gut) wrote : RE: [Bug 1667407] Re: improve 2x-3x sha256sum performance on ppc64le due to current gcc optimization bug

> -----Original Message-----
> From: <email address hidden> [mailto:<email address hidden>] On Behalf Of
> Lukasz Zemczak
> Sent: quinta-feira, 2 de março de 2017 10:31
> To: Gustavo Serra Scalet <email address hidden>
> Subject: [Bug 1667407] Re: improve 2x-3x sha256sum performance on
> ppc64le due to current gcc optimization bug
>
> Could we get the bug details updated to follow the SRU template?

Done

> Also, checking the SRU upload, one thing I don't particularly like is
> that the attached patch has no description. Yes, the change is self-
> explanatory, but some context in patch descriptions is always welcome.
> Won't block on that though as I see some of the existing patches in the
> package have empty 'Description:' fields anyway...

Then I'd add to Michael Stone's patch:

diff --git i/Makefile.in w/Makefile.in
index bdf9a43..7a43897 100644
--- i/Makefile.in
+++ w/Makefile.in
@@ -14434,6 +14434,13 @@ $(factor_tests): $(tf)/run.sh $(tf)/create-test.sh

 $(TEST_LOGS): $(PROGRAMS)

+# REMOVE THIS HANDLING FOR PPC64 IF GCC > 7.0 IS USED
+# Shutting off -fschedule-insns optimization due to huge performance drop
+# (around 2 to 3 times).
+ifeq ($(DEB_TARGET_ARCH), ppc64el)
+lib/sha256.o: CFLAGS+=-fno-schedule-insns
+endif
+
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:

Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Gustavo, or anyone else affected,

Accepted coreutils into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/coreutils/8.25-2ubuntu3~16.10 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in coreutils (Ubuntu Yakkety):
status: New → Fix Committed
tags: added: verification-needed
Changed in coreutils (Ubuntu Xenial):
status: New → Fix Committed
Revision history for this message
Brian Murray (brian-murray) wrote :

Hello Gustavo, or anyone else affected,

Accepted coreutils into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/coreutils/8.25-2ubuntu3~16.04 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Revision history for this message
Gustavo Serra Scalet (gut) wrote :

Fix verified on coreutils_8.25-2ubuntu3~16.10_ppc64el.deb

Revision history for this message
Gustavo Serra Scalet (gut) wrote :

just to give an idea, to verify the sha256 of ubuntu-16.10-server-ppc64el.iso:

8.25-2ubuntu2 : 0m15.204s
8.25-2ubuntu3~16.10: 0m4.952s

great!

tags: added: verification-done
removed: verification-needed
Revision history for this message
Gustavo Serra Scalet (gut) wrote :

Hi, is there anything else I can do in order to have the 8.25-2ubuntu3 on yakkety? (not on proposed)

Revision history for this message
Robie Basak (racb) wrote :

Thank you for testing Yakkety. I can release 8.25-2ubuntu3~16.10 to yakkety-updates. 8.25-2ubuntu3~16.04 still needs testing from xenial-updates though.

tags: added: verification-done-yakkety
removed: verification-done
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package coreutils - 8.25-2ubuntu3~16.10

---------------
coreutils (8.25-2ubuntu3~16.10) yakkety-proposed; urgency=medium

  * SRU: LP: #1667407. Add -fno-schedule-insns to CFLAGS for sha256.o
    on ppc64el to fix performance regression.

 -- Matthias Klose <email address hidden> Tue, 28 Feb 2017 10:18:46 +0100

Changed in coreutils (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Revision history for this message
Robie Basak (racb) wrote : Update Released

The verification of the Stable Release Update for coreutils has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote : [coreutils/xenial] verification still needed

The fix for this bug has been awaiting testing feedback in the -proposed repository for xenial for more than 90 days. Please test this fix and update the bug appropriately with the results. In the event that the fix for this bug is still not verified 15 days from now, the package will be removed from the -proposed repository.

tags: added: removal-candidate
Revision history for this message
Gustavo Serra Scalet (gut) wrote :

Tested on xenial (16.04).

Test input of 465MB took 4.5s (version 8.25-2ubuntu3~16.04) instead of 15.1s (version 8.25-2ubuntu2).

Thanks

tags: added: verification-done-xenial
removed: removal-candidate
Revision history for this message
Gustavo Serra Scalet (gut) wrote :

Is that all? Please point out if I missed something.

Matthias Klose (doko)
tags: added: verification-done
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote : Reminder of SRU verification policy change

Thank you for taking the time to verify this stable release fix. We have noticed that you have used the verification-done tag for marking the bug as verified and would like to point out that due to a recent change in SRU bug verification policy fixes now have to be marked with per-release tags (i.e. verification-done-$RELEASE). Please remove the verification-done tag and add one for the release you have tested the package in. Thank you!

https://wiki.ubuntu.com/StableReleaseUpdates#Verification

Revision history for this message
Gustavo Serra Scalet (gut) wrote :

@doko I already added the verification-done-xenial tag so I guess that's all that's needed to do, right?

tags: removed: verification-done
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package coreutils - 8.25-2ubuntu3~16.04

---------------
coreutils (8.25-2ubuntu3~16.04) xenial-proposed; urgency=medium

  * SRU: LP: #1667407. Add -fno-schedule-insns to CFLAGS for sha256.o
    on ppc64el to fix performance regression.

 -- Matthias Klose <email address hidden> Tue, 28 Feb 2017 10:18:46 +0100

Changed in coreutils (Ubuntu Xenial):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.