Comment 1 for bug 1980150

Chad Smith (chad.smith) wrote :

I can confirm with memory_profiler that if collect-logs calls _write_command_output_to_file(['cat', 'some-multi-gig-file'], "/tmp/filecopy", verbosity=0), memory consumption becomes extremely high and can result in OOM issues.

Simply inverting this call to something like the following Python, where the command writes straight to the open file, avoids reading the entire stream into memory:

  with open("/tmp/filecopy", "w") as f:
      subprocess.call(['cat', 'some-multi-gig-file'], stdout=f)

In profiling a 4 Gig file "copy", I can see that our memory footprint is capped at about 30 MiB during file creation, instead of growing to 4 Gig.

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     6     32.4 MiB     32.4 MiB           1   @memory_profiler.profile
     7                                         def doit():
     8     32.4 MiB      0.0 MiB           1       with open("/tmp/filecopy", "w") as f:
     9     32.6 MiB      0.2 MiB           1           subprocess.call(["cat", "/some-multi-gig-file"], stdout=f)
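
As a rough sketch of how the same idea could be wrapped in a reusable helper: the function name stream_command_output_to_file and its signature are hypothetical, chosen only for illustration, and are not the existing cloud-init API. The only thing it relies on is that subprocess.call accepts an open file object as stdout, so the child process writes to the file directly.

```python
import subprocess


def stream_command_output_to_file(cmd, output_path):
    """Run cmd with its stdout attached directly to output_path.

    Hypothetical helper for illustration only; not the cloud-init
    implementation. Returns the command's exit code.
    """
    # Binary mode: the bytes are written by the child process itself, so
    # no decoding happens in this Python process.
    with open(output_path, "wb") as f:
        # Passing the file object as stdout hands its file descriptor to
        # the child, so the output is never buffered in Python memory.
        return subprocess.call(cmd, stdout=f)
```

Something like stream_command_output_to_file(['cat', 'some-multi-gig-file'], "/tmp/filecopy") would then behave like the snippet profiled above. Note that this sketch leaves stderr untouched; whether it should also be captured is a separate decision.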