cloud-init collect-logs can use too much memory

Bug #1980150 reported by Pradip Dhara
This bug affects 1 person
Affects      Status         Importance   Assigned to   Milestone
cloud-init   Fix Released   Medium       Unassigned

Bug Description

If the journal is large, or the machine doesn't have a lot of memory, cloud-init collect-logs can cause an OOM.

The problem is that we are reading the entire journal into memory and then writing it out:
https://github.com/canonical/cloud-init/blob/a23c886ea2cd301b6021eb03636beb5b92c429dc/cloudinit/cmd/devel/logs.py#L151

We should not buffer the entire journal in memory. I think redirecting it to an output file would not cause a memory spike.

Thanks,
Pradip Dhara

1. Cloud provider: Azure
2. I don't think the cloud-init configuration is relevant here, but I can provide it if needed.
3. Can't do, because cloud-init collect-logs is crashing.
4. I can get dmesg logs if you like, but I don't think they are relevant to this bug.
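
To make the report concrete, here is a minimal sketch of the two patterns it contrasts: capturing a command's whole output in the parent process before writing it, versus handing the child an open file as its stdout. The function names are hypothetical and this is not the cloud-init code itself. The child command is only a stand-in for a large journal dump.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def write_output_buffered(cmd, path):
    """Problem pattern: the child's whole stdout becomes one bytes object."""
    output = subprocess.check_output(cmd)  # entire output held in memory
    Path(path).write_bytes(output)
    return len(output)


def write_output_redirected(cmd, path):
    """Redirect pattern: the child writes straight to the open file."""
    with open(path, "wb") as f:
        # The parent never reads the data, so its memory use stays flat.
        return subprocess.call(cmd, stdout=f)


if __name__ == "__main__":
    # Stand-in for a large journal: a child that prints 1 MiB of "x".
    cmd = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1048576)"]
    with tempfile.TemporaryDirectory() as tmp:
        buffered, redirected = Path(tmp, "buffered.txt"), Path(tmp, "redirected.txt")
        write_output_buffered(cmd, buffered)
        write_output_redirected(cmd, redirected)
        # Both files should end up with the same size.
        print(buffered.stat().st_size, redirected.stat().st_size)
```

Both helpers produce the same file. The difference is only where the bytes travel: through a Python object in the first case, directly from the child to the file in the second.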

Tags: bitesize
Chad Smith (chad.smith) wrote :

I can confirm with memory_profiler that if collect-logs calls _write_command_output_to_file(['cat', 'some-multi-gig-file'], "/tmp/filecopy", verbosity=0), memory consumption becomes extremely high and can result in OOM issues.

Simply inverting this call to something like the following python avoids realizing the entire stream in memory:

  import subprocess

  with open("/tmp/filecopy", "w") as f:
      subprocess.call(['cat', 'some-multi-gig-file'], stdout=f)

In profiling a 4 Gig file "copy", I can see that our memory footprint is capped at about 30 MiB instead of 4 Gig for file creation.

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     6     32.4 MiB     32.4 MiB           1   @memory_profiler.profile
     7                                         def doit():
     8     32.4 MiB      0.0 MiB           1       with open("/tmp/filecopy", "w") as f:
     9     32.6 MiB      0.2 MiB           1           subprocess.call(["cat", "/some-multi-gig-file"], stdout=f)
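
For anyone who wants to reproduce the comparison without installing memory_profiler, the following is a rough sketch using the standard library's tracemalloc. The helper names are hypothetical, and the child command stands in for `cat` on a large file. Note that tracemalloc counts only allocations made by the Python interpreter, not process RSS, so the numbers will not match the table above; they should only show the same trend.

```python
import subprocess
import sys
import tempfile
import tracemalloc
from pathlib import Path


def peak_traced_bytes(func):
    """Run func() and return the peak Python heap growth seen by tracemalloc."""
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak - baseline


def compare(size, workdir):
    """Return (buffered_peak, redirected_peak) for a child emitting `size` bytes."""
    cmd = [sys.executable, "-c", f"import sys; sys.stdout.write('x' * {size})"]
    out = Path(workdir, "copy.txt")

    def buffered():
        # Whole output is captured in the parent, then written out.
        out.write_bytes(subprocess.check_output(cmd))

    def redirected():
        # Child writes to the file directly; the parent holds no copy.
        with open(out, "wb") as f:
            subprocess.call(cmd, stdout=f)

    return peak_traced_bytes(buffered), peak_traced_bytes(redirected)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        buffered_peak, redirected_peak = compare(8 * 1024 * 1024, tmp)
    print(f"buffered:   {buffered_peak / 2**20:.1f} MiB")
    print(f"redirected: {redirected_peak / 2**20:.1f} MiB")
```

The buffered peak should grow with the size of the output, while the redirected peak should stay roughly constant regardless of size.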

Changed in cloud-init:
status: New → Triaged
tags: added: bitesize
Changed in cloud-init:
importance: Undecided → Low
importance: Low → Medium
James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Triaged → Expired
Chad Smith (chad.smith) wrote : Fixed in cloud-init version 23.3.

This bug is believed to be fixed in cloud-init version 23.3. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in cloud-init:
status: Expired → Fix Released