[SRU] euca-get-console-output gives first 64k of output, not most recent

Bug #566793 reported by Scott Moser
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Eucalyptus
Fix Committed
Undecided
chris grzegorczyk
eucalyptus (Ubuntu)
Fix Released
Medium
Dave Walker
Lucid
Won't Fix
Medium
Dave Walker
Maverick
Fix Released
Medium
Dave Walker

Bug Description

euca-get-console-output does not provide all of a console's output.

To reproduce this behavior, run an instance that executes user-data as a script (ie, the uec-images), like this:

$ cat user-data-loop.sh
#!/bin/bash

max=5000;
sleep_often=50
sleep=.2
for((i=0;i<${max};i++)); do
        echo "hello, number ${i}"
        [ $((${i}%${sleep_often})) -eq 0 ] && sleep ${sleep}
done
$ euca-run-instances --user-data-file=user-data-loop.sh --key mykey \
   --instance-type=m1.small emi-abcdefgh
# wait
$ time euca-get-console-output i-4537075F | tail -n 5
hello, number 2454
hello, number 2455
hello, number 2456
hello, number 2457
hello, number 2458

real 5m9.557s
user 0m0.130s
sys 0m0.050s

If I ssh to the node controller, and look at /var/lib/eucalyptus/instances/admin/i-4537075F/console.log I see that the count went all the way to 4999.

Note, I've waited minutes after the VM wrote the data before getting console output, it doesn't seem like just a (resonable) caching issue.

Also note, that the get-console-output takes an extremely long amount of time (minutes).

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: eucalyptus-common 1.6.2-0ubuntu30
ProcVersionSignature: Ubuntu 2.6.32-21.31-server 2.6.32.11+drm33.2
Uname: Linux 2.6.32-21-server x86_64
.etc.eucalyptus.eucalyptus.cc.conf: CC_NAME="cluster1"
Architecture: amd64
Date: Mon Apr 19 11:47:55 2010
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: eucalyptus

======
IMPACT:
 * This bug affects mostly people trying to debug a UEC instance which is not behaving (or starting) properly. If verbose logging is turned on on something like upstart at init, the 64k limit in eucalyptus' euca-get-console buffer will easily overflow.

ADDRESSED:
 * This bug is addressed by cherry-picking a patch from the upstream stable release branch.

REPRODUCE:
 * To reproduce this issue, turn on verbose upstart logging in an image, register it, and run it. Then, run euca-get-console against the instance id. You should on see the first 64K of data.

REGRESSION POTENTIAL:
 * The chances for regression are relatively low. Perhaps the console output not not be correct, or might be less.
======

Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :

I just dumped
$ euca-get-console-output i-4537075F > out
$ ls -lh out
-rw-r--r-- 1 ubuntu ubuntu 65K 2010-04-19 12:04 out

then, trimmed 'out' of the the single empty line at the end, and the iid/timestamp header:
i-4537075F
2010-04-19T17:04:44.446Z

$ ls -l out
-rw-r--r-- 1 ubuntu ubuntu 65535 2010-04-19 12:06 out

65535 sounds like a limited buffer.

Revision history for this message
chris grzegorczyk (chris-grze) wrote : Re: [Bug 566793] Re: euca-get-console-output does not give all console output

@Scott: The command is supposed to return the last 64K of console
output. Your results are expected in that case. See the first note
from: http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/index.html?instance-console.html

As for the time, it should be timing out after 20 seconds -- are you
not seeing that behaviour?

On Mon, Apr 19, 2010 at 10:06 AM, Scott Moser <email address hidden> wrote:
> I just dumped
> $ euca-get-console-output i-4537075F > out
> $ ls -lh out
> -rw-r--r-- 1 ubuntu ubuntu 65K 2010-04-19 12:04 out
>
> then, trimmed 'out' of the the single empty line at the end, and the iid/timestamp header:
> i-4537075F
> 2010-04-19T17:04:44.446Z
>
> $ ls -l out
> -rw-r--r-- 1 ubuntu ubuntu 65535 2010-04-19 12:06 out
>
> 65535 sounds like a limited buffer.
>
> --
> euca-get-console-output does not give all console output
> https://bugs.launchpad.net/bugs/566793
> You received this bug notification because you are subscribed to
> eucalyptus in ubuntu.
>
> Status in “eucalyptus” package in Ubuntu: New
>
> Bug description:
> euca-get-console-output does not provide all of a console's output.
>
> To reproduce this behavior, run an instance that executes user-data as a script (ie, the uec-images), like this:
>
> $ cat user-data-loop.sh
> #!/bin/bash
>
> max=5000;
> sleep_often=50
> sleep=.2
> for((i=0;i<${max};i++)); do
>        echo "hello, number ${i}"
>        [ $((${i}%${sleep_often})) -eq 0 ] && sleep ${sleep}
> done
> $ euca-run-instances --user-data-file=user-data-loop.sh --key mykey \
>   --instance-type=m1.small emi-abcdefgh
> # wait
> $ time euca-get-console-output i-4537075F | tail -n 5
> hello, number 2454
> hello, number 2455
> hello, number 2456
> hello, number 2457
> hello, number 2458
>
> real    5m9.557s
> user    0m0.130s
> sys     0m0.050s
>
> If I ssh to the node controller, and look at /var/lib/eucalyptus/instances/admin/i-4537075F/console.log I see that the count went all the way to 4999.
>
> Note, I've waited minutes after the VM wrote the data before getting console output, it doesn't seem like just a (resonable) caching issue.
>
> Also note, that the get-console-output takes an extremely long amount of time (minutes).
>
> ProblemType: Bug
> DistroRelease: Ubuntu 10.04
> Package: eucalyptus-common 1.6.2-0ubuntu30
> ProcVersionSignature: Ubuntu 2.6.32-21.31-server 2.6.32.11+drm33.2
> Uname: Linux 2.6.32-21-server x86_64
> .etc.eucalyptus.eucalyptus.cc.conf: CC_NAME="cluster1"
> Architecture: amd64
> Date: Mon Apr 19 11:47:55 2010
> ProcEnviron:
>  LANG=en_US.UTF-8
>  SHELL=/bin/bash
> SourcePackage: eucalyptus
>
>
>

Revision history for this message
Scott Moser (smoser) wrote : Re: [Bug 566793] Re: euca-get-console-output does not give all console output

On Mon, 19 Apr 2010, chris grzegorczyk wrote:

> @Scott: The command is supposed to return the last 64K of console
> output. Your results are expected in that case. See the first note
> from: http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/index.html?instance-console.html

Well, then it should return the *last* 64k, not the first 64k. I'm
getting the first 64k.

> As for the time, it should be timing out after 20 seconds -- are you
> not seeing that behaviour?

well, that is actual copy and paste. the euca-get-console-output command
takes multiple minutes in many cases.

Scott Moser (smoser)
summary: - euca-get-console-output does not give all console output
+ euca-get-console-output 64k of output, not most recent 64k
summary: - euca-get-console-output 64k of output, not most recent 64k
+ euca-get-console-output gives first 64k of output, not most recent
Mathias Gug (mathiaz)
Changed in eucalyptus (Ubuntu):
importance: Undecided → Low
Revision history for this message
Scott Moser (smoser) wrote : Re: euca-get-console-output gives first 64k of output, not most recent

I'm increasing the importance of this. The reason is that after a reboot, the 64k is almost certainly filled, even if nothing other than kernel messages have been written there. Thus, anything written after that first reboot wont be seen by the user.

Changed in eucalyptus (Ubuntu):
importance: Low → Medium
Changed in eucalyptus (Ubuntu):
assignee: nobody → Daniel Nurmi (nurmi)
status: New → In Progress
Revision history for this message
Daniel Nurmi (nurmi) wrote :

The kvm driver, in the NC, was sending the first 64k of output. Fix is in revno 1224, which adds some lseek() logic to send back the last 64k of console output when getConsoleOutput() is invoked.

Changed in eucalyptus:
status: New → Fix Committed
Changed in eucalyptus (Ubuntu Lucid):
assignee: Daniel Nurmi (nurmi) → Dustin Kirkland (kirkland)
Changed in eucalyptus (Ubuntu Lucid):
milestone: none → lucid-updates
summary: - euca-get-console-output gives first 64k of output, not most recent
+ [SRU] euca-get-console-output gives first 64k of output, not most recent
description: updated
Revision history for this message
Martin Pitt (pitti) wrote : Please test proposed package

Accepted eucalyptus into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in eucalyptus (Ubuntu Lucid):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Martin Pitt (pitti) wrote :

Copied lucid-proposed to maverick.

Changed in eucalyptus (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6.2-0ubuntu30.1

---------------
eucalyptus (1.6.2-0ubuntu30.1) lucid-proposed; urgency=low

  Address LP: #565101
  * debian/eucalyptus.conf: set default JVM_MEM option
  * debian/eucalyptus-common.eucalyptus.upstart: use $JVM_MEM
    from eucalyptus.conf, or default to 512m
  * tools/eucalyptus.conf.5: document the JVM_MEM option

  Cherry-pick upstream commit r1223..1227:
  * node/handlers.c, node/handlers_kvm.c: handle situation where NC's
    do not detach pthreads, LP: #567371
  * node/handlers_kvm.c: fix console bug (was only showing first 64K),
    LP: #566793
  * clc/modules/storage-common/src/main/java/edu/ucsb/eucalyptus/storage/StorageManager.java,
    clc/modules/storage-common/src/main/java/edu/ucsb/eucalyptus/storage/fs/FileSystemStorageManager.java,
    clc/modules/walrus/src/main/java/edu/ucsb/eucalyptus/cloud/ws/WalrusImageManager.java,
    clc/modules/walrus/src/main/java/edu/ucsb/eucalyptus/cloud/ws/WalrusManager.java,
    clc/modules/wsstack/src/main/java/com/eucalyptus/ws/handlers/ServiceSinkHandler.java:
    - fix Walrus OOM errors (java heap), LP: #565101
 -- Dustin Kirkland <email address hidden> Wed, 28 Apr 2010 08:43:38 -0500

Changed in eucalyptus (Ubuntu Lucid):
status: Fix Committed → Fix Released
Martin Pitt (pitti)
Changed in eucalyptus (Ubuntu Lucid):
status: Fix Released → Fix Committed
Scott Moser (smoser)
Changed in eucalyptus (Ubuntu Lucid):
status: Fix Committed → Fix Released
Revision history for this message
Dustin Kirkland  (kirkland) wrote :

Scott-

Actually, you need to install the package from lucid-proposed, test that it fixes the bug, note that here, and then Martin will move the package from lucid-proposed to lucid-updates.

Scott Moser (smoser)
Changed in eucalyptus (Ubuntu Lucid):
status: Fix Released → Fix Committed
Revision history for this message
Scott Moser (smoser) wrote :

I verified the initial test case (with user data while loop on)
$ dpkg-query --show eucalyptus*
eucalyptus-cc 1.6.2-0ubuntu30.1
eucalyptus-cloud 1.6.2-0ubuntu30.1
eucalyptus-common 1.6.2-0ubuntu30.1
eucalyptus-gl 1.6.2-0ubuntu30.1
eucalyptus-java-common 1.6.2-0ubuntu30.1
eucalyptus-javadeps
eucalyptus-sc 1.6.2-0ubuntu30.1
eucalyptus-walrus 1.6.2-0ubuntu30.1

After several minutes of waiting i went back to get console output. I saw:
=====
hello, number 4997
hello, number 4998
hello, number 4999
run-parts: /var/lib/cloud/data/scripts/part-000 exited with return code 1
Traceback (most recent call last):
  File "/usr/bin/cloud-init-run-module", line 66, in <module>
    main()
  File "/usr/bin/cloud-init-run-module", line 61, in main
    cloud.sem_and_run(semname, freq, inst.run, [run_args,cfg_path], False)
  File "/usr/lib/python2.6/dist-packages/cloudinit/__init__.py", line 220, in sem_and_run
    func(*args)
  File "/usr/lib/python2.6/dist-packages/cloudinit/execute.py", line 28, in run
    raise Exception(str)
Exception: Cmd returned 1
run-parts --regex .* /var/lib/cloud/data/scripts
=====

(interestingly, that is correct output. the python trace results from the script being run and returning non-true).

Revision history for this message
Scott Moser (smoser) wrote :

I think i have to take back my "it worked" statement. the first 'get-console-output' worked fine.

The second has been running for 6 minutes, 30 seconds and has printed out the following:
-------------------------
         4 0 8
path=/services/Eucalyptus/?AWSAccessKeyId=WKy3rMzOWPouVOxK1p3Ar1C2uRBwa2FBXnCw&Action=GetConsoleOutput&InstanceId.1=i-431B08E0&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2010-05-28T14%3A56%3A43&Version=2009-11-30&Signature=oghj8LM8UBrtSz8NP47ZhX2uxDymt0swn1XC1sZ%2Bu0c%3D
Failure: 408 Request Timeout

Top shows eucalyptus-cloud eating all of 1 cpu.

cloud-output.log has things like this:
10:03:13 DEBUG ConsoleOutputCallback | ConsoleOutputCallback.java:110 org.jboss.netty.handler.timeout.ReadTimeoutException
10:03:13 DEBUG NioResponseHandler | NioResponseHandler.java:121 org.jboss.netty.handler.timeout.ReadTimeoutException

Revision history for this message
Scott Moser (smoser) wrote :

Something is definitely broken.
After the above eventually returned (17 minutes, a failed command and more timeouts like above on the screen), I restarted eucalyptus and tried again
$ sudo restart eucalyptus
# wait
$ sh -c 'date; time euca-get-console-output i-431B08E0; date' 2>&1 | tee out.first
$ tail -n 3 out.first
run-parts --regex .* /var/lib/cloud/data/scripts
0.15user 0.03system 0:00.47elapsed 38%CPU (0avgtext+0avgdata 55040maxresident)k
8inputs+0outputs (0major+8338minor)pagefaults 0swaps
Fri May 28 10:17:49 CDT 2010

# no try again
$ sh -c 'date; time euca-get-console-output i-431B08E0; date' 2>&1 | tee out.second

That showed the same issue as above.

Revision history for this message
Mathias Gug (mathiaz) wrote :

Marking this bug verification-failed given Scott last comments and the fact that the next eucalyptus SRU (-30.2) reverts this patch.

tags: added: verification-failed
removed: verification-needed
Revision history for this message
Steve Langasek (vorlon) wrote :

Accepted into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

tags: removed: verification-failed
tags: added: verification-needed
Revision history for this message
C de-Avillez (hggdh2) wrote :

Confirmed fixed (or reverted, I guess). All euca-get-console-output commands issued on a test run worked correctly.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6.2-0ubuntu30.2

---------------
eucalyptus (1.6.2-0ubuntu30.2) lucid-proposed; urgency=low

  * Revert: node/handlers_kvm.c: fix console bug (was only showing first 64K),
    LP: #566793
  * clc/modules/www/src/main/java/edu/ucsb/eucalyptus/admin/server/EucalyptusWebBackendImpl.java:
    - fix user enumeration and account brute force, LP: #579942
  * debian/eucalyptus-sc.upstart: Bump maximum number of loop devices for
    SC to 512, LP: #586134

eucalyptus (1.6.2-0ubuntu30.1) lucid-proposed; urgency=low

  Address LP: #565101
  * debian/eucalyptus.conf: set default JVM_MEM option
  * debian/eucalyptus-common.eucalyptus.upstart: use $JVM_MEM
    from eucalyptus.conf, or default to 512m
  * tools/eucalyptus.conf.5: document the JVM_MEM option

  Cherry-pick upstream commit r1223..1227:
  * node/handlers.c, node/handlers_kvm.c: handle situation where NC's
    do not detach pthreads, LP: #567371
  * node/handlers_kvm.c: fix console bug (was only showing first 64K),
    LP: #566793
  * clc/modules/storage-common/src/main/java/edu/ucsb/eucalyptus/storage/StorageManager.java,
    clc/modules/storage-common/src/main/java/edu/ucsb/eucalyptus/storage/fs/FileSystemStorageManager.java,
    clc/modules/walrus/src/main/java/edu/ucsb/eucalyptus/cloud/ws/WalrusImageManager.java,
    clc/modules/walrus/src/main/java/edu/ucsb/eucalyptus/cloud/ws/WalrusManager.java,
    clc/modules/wsstack/src/main/java/com/eucalyptus/ws/handlers/ServiceSinkHandler.java:
    - fix Walrus OOM errors (java heap), LP: #565101
 -- Chris Cheney <email address hidden> Fri, 04 Jun 2010 00:39:00 -0500

Changed in eucalyptus (Ubuntu Lucid):
status: Fix Committed → Fix Released
Changed in eucalyptus:
assignee: nobody → chris grzegorczyk (chris-grze)
Revision history for this message
Chris Cheney (ccheney) wrote :

Should this bug be reopened, since we reverted the patch? It seems incorrect to leave it marked Fix Released as least, either Triaged or Won't Fix seem better.

Revision history for this message
C de-Avillez (hggdh2) wrote :

Indeed. This bug is still live, we just reverted the original patch.

Changed in eucalyptus (Ubuntu):
status: Fix Released → Triaged
Changed in eucalyptus (Ubuntu Lucid):
status: Fix Released → Triaged
Thierry Carrez (ttx)
Changed in eucalyptus (Ubuntu):
milestone: lucid-updates → none
Changed in eucalyptus (Ubuntu Lucid):
milestone: lucid-updates → none
Revision history for this message
Thierry Carrez (ttx) wrote :

@Dustin, any chance you can make progress on this bug ? If not, please reassign to Daviey.
Opening maverick task to check that this was fixed in maverick.

Changed in eucalyptus (Ubuntu Lucid):
assignee: Dustin Kirkland (kirkland) → Dave Walker (davewalker)
Changed in eucalyptus (Ubuntu Maverick):
assignee: Dustin Kirkland (kirkland) → Dave Walker (davewalker)
Thierry Carrez (ttx)
tags: added: server-mrs
Revision history for this message
Dave Walker (davewalker) wrote :

Currently blocked on bug #619843. Should hopefully be resolved in the next upload.

Revision history for this message
Dave Walker (davewalker) wrote :

This seems to be fixed in Maverick. Not quite sure when. \o/

Changed in eucalyptus (Ubuntu Maverick):
status: Triaged → Fix Released
Revision history for this message
Scott Moser (smoser) wrote :

I really would love to see this fixed in lucid. It is a real pain when you're trying to debug an instance that has rebooted a couple times. I lost a couple hours debugging one that I thought was locked during boot because I only looked at the console output, while the instance was actually fine.

Revision history for this message
Andy Grimm (agrimm) wrote :

This issue is now being tracked upstream at http://eucalyptus.atlassian.net/browse/EUCA-2693

Please watch that issue for further updates.

Revision history for this message
Rolf Leggewie (r0lf) wrote :

lucid has seen the end of its life and is no longer receiving any updates. Marking the lucid task for this ticket as "Won't Fix".

Changed in eucalyptus (Ubuntu Lucid):
status: Triaged → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.