Merge lp:~chipaca/ubuntuone-client/lean-action-queue-commands into lp:ubuntuone-client

Proposed by John Lenton
Status: Merged
Approved by: Eric Casteleijn
Approved revision: 133
Merged at revision: not available
Proposed branch: lp:~chipaca/ubuntuone-client/lean-action-queue-commands
Merge into: lp:ubuntuone-client
Diff against target: None lines
To merge this branch: bzr merge lp:~chipaca/ubuntuone-client/lean-action-queue-commands
Reviewer Review Type Date Requested Status
Eric Casteleijn (community) Approve
John O'Brien (community) Approve
Review via email: mp+9931@code.launchpad.net

Commit message

improve worst-case memory use, mostly via __slots__

To post a comment you must log in.
Revision history for this message
John Lenton (chipaca) wrote :

This makes a lot of classes of which we have many instances start using __slots__, as well as some minor refactoring and compressing some large strings we store for debugging purposes. The end result is bringing down memory usage by ∼⅓ on a test run of disconnected creation of 10k files (of what's left, ∼⅓ is twisted, ∼⅓ is the syncdaemon itself, and ∼⅓ are the 10k operations).

Also included is a small fix for the .note and .trace loggers, which were broken.

129. By John Lenton

no longer needing functools (thanks jdo!)

130. By John Lenton

merged with trunk

Revision history for this message
John O'Brien (jdobrien) wrote :

Sadly I can't approve these changes as they break trunk.

Lots of errors: https://pastebin.canonical.com/20978/

review: Needs Fixing
Revision history for this message
Elliot Murphy (statik) wrote :

> Sadly I can't approve these changes as they break trunk.
>
> Lots of errors: https://pastebin.canonical.com/20978/

I discussed with John Lenton, these changes expose some broken tests on trunk, so we'll need to wait to land this branch until there is a corresponding trunk branch, and then coordinate the landing to minimize PQM interruption. This branch won't be included in the Aug 11 client rollout. Yay for huge memory use reduction though!

131. By John Lenton

merged with trunk

132. By John Lenton

adding new attributes to __slots__

133. By John Lenton

when refactoring a chain of callbacks, it pays to follow the chain of errbacks too.

Revision history for this message
John Lenton (chipaca) wrote :

It turns out the tests weren't broken; they were exposing brokenness in the client! wow, imagine that!

I believe I've fixed it, now.

Revision history for this message
John O'Brien (jdobrien) wrote :

This no longer breaks trunk! w00t!

review: Approve
Revision history for this message
Eric Casteleijn (thisfred) wrote :

Tests pass, code looks good. Out of curiosity: I've never used __slots__ before, or seen it used much, so for educational purposes: what does it do here? I understand it can be used to save memory in cases where you have a huge number of class instances. Is that the use case here, or is there another reason to use it? (please ignore if you don't have time, I'm really just curious, and this is not a polite/devious way to question the code ;)

review: Approve
Revision history for this message
Nicola Larosa (teknico) wrote :

> Out of curiosity: I've never used __slots__
> before, or seen it used much, so for educational purposes: what does it do
> here? I understand it can be used to save memory in cases where you have a
> huge number of class instances. Is that the use case here, or is there another
> reason to use it?

__slots__ forces the object attributes to be statically predefined instead of dynamic: it does away with __dict__, so it has the double effect to conserve memory and to block dynamic addition/deletion of attributes.

Revision history for this message
John Lenton (chipaca) wrote :

On Wed, Aug 12, 2009 at 01:44:55PM -0000, Eric Casteleijn wrote:
> Review: Approve
> Tests pass, code looks good. Out of curiosity: I've never used __slots__ before, or seen it used much, so for educational purposes: what does it do here? I understand it can be used to save memory in cases where you have a huge number of class instances. Is that the use case here, or is there another reason to use it? (please ignore if you don't have time, I'm really just curious, and this is not a polite/devious way to question the code ;)

in worst-case scnearios there can easily be tens of thousands of
instances of these classes. The test case which showed a 33%
improvement in memory use with this change was based on creating 10k
files while disconnected; memory usage is pretty much linear in that
case (you end up with 10k MakeFile 10k Upload objects). For reference,
the 'drivers' directory of the linux kernel is at ~9k files; a
checkout of the bazaar trunk is only ~1k files, so the impact of this
branch on that would be less.

Still, the worst case is the one we want to avoid killing your
computer, right?

Revision history for this message
Eric Casteleijn (thisfred) wrote :

Thanks Nicola and John. I see now that the merge proposal already answered my question, sorry for being oblivious. But thanks for expanding!

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'ubuntuone/syncdaemon/action_queue.py'
--- ubuntuone/syncdaemon/action_queue.py 2009-07-30 18:23:53 +0000
+++ ubuntuone/syncdaemon/action_queue.py 2009-08-08 19:03:32 +0000
@@ -445,6 +445,7 @@
445 we can keep track of the number of bytes that have been written to445 we can keep track of the number of bytes that have been written to
446 the store.446 the store.
447 """447 """
448 __slots__ = ('fd', 'data_dict', 'n_bytes_read')
448 def __init__(self, fd, data_dict):449 def __init__(self, fd, data_dict):
449 """450 """
450 fd is the file-like object used for uploads. data_dict is the451 fd is the file-like object used for uploads. data_dict is the
@@ -741,7 +742,11 @@
741 retryable_errors = set(['Cleaned up', 'TRY_AGAIN',742 retryable_errors = set(['Cleaned up', 'TRY_AGAIN',
742 'Connection was closed cleanly.'])743 'Connection was closed cleanly.'])
743744
744 start_done = False745 __slots__ = ('action_queue', 'start_done', 'running')
746 def __init__(self, action_queue):
747 self.action_queue = action_queue
748 self.start_done = False
749 self.running = False
745750
746 def demark(self, *maybe_markers):751 def demark(self, *maybe_markers):
747 """752 """
@@ -848,11 +853,22 @@
848 else:853 else:
849 d = self._start()854 d = self._start()
850 d.addCallback(self.store_marker_result)855 d.addCallback(self.store_marker_result)
851 d.addCallback(lambda _: setattr(self, 'start_done', True))856 d.addCallback(self._ready_to_run)
852 d.addCallback(lambda _: self.log.debug('running'))857 d.addCallback(self._done_running)
853 d.addCallback(lambda _: self.running and self._run())858 return d
859
860 def _done_running(self, x):
861 self.running = False
862 return x
863
864 def _ready_to_run(self, _):
865 self.log.debug('running')
866 if self.running:
867 d = self._run()
868 else:
869 d = defer.succeed(None)
854 d.addCallbacks(self.end_callback, self.end_errback)870 d.addCallbacks(self.end_callback, self.end_errback)
855 d.addBoth(passit(lambda _: setattr(self, 'running', False)))871 d.addBoth(self._done_running)
856 return d872 return d
857873
858 def handle_success(self, success):874 def handle_success(self, success):
@@ -890,6 +906,7 @@
890 Base of metadata-related commands (commands that are queued in the906 Base of metadata-related commands (commands that are queued in the
891 meta queue)907 meta queue)
892 """908 """
909 __slots__ = ()
893 @property910 @property
894 def _queue(self):911 def _queue(self):
895 """912 """
@@ -903,6 +920,7 @@
903 Base of content-related commands (commands that are queued in the920 Base of content-related commands (commands that are queued in the
904 content queue)921 content queue)
905 """922 """
923 __slots__ = ()
906 @property924 @property
907 def _queue(self):925 def _queue(self):
908 """926 """
@@ -915,8 +933,10 @@
915 """933 """
916 Base of MakeFile and MakeDir934 Base of MakeFile and MakeDir
917 """935 """
936 __slots__ = ('share_id', 'parent_id', 'name', 'marker', 'log')
937
918 def __init__(self, action_queue, share_id, parent_id, name, marker):938 def __init__(self, action_queue, share_id, parent_id, name, marker):
919 self.action_queue = action_queue939 super(MakeThing, self).__init__(action_queue)
920 self.share_id = share_id940 self.share_id = share_id
921 self.parent_id = parent_id941 self.parent_id = parent_id
922 # Unicode boundary! the name is Unicode in protocol and server, but942 # Unicode boundary! the name is Unicode in protocol and server, but
@@ -939,6 +959,7 @@
939 """959 """
940 self.share_id = share_id960 self.share_id = share_id
941 self.parent_id = parent_id961 self.parent_id = parent_id
962 self.start_done = True
942963
943 def _run(self):964 def _run(self):
944 """965 """
@@ -979,6 +1000,7 @@
979 """1000 """
980 Make a file1001 Make a file
981 """1002 """
1003 __slots__ = ()
982 ok_event_name = 'AQ_FILE_NEW_OK'1004 ok_event_name = 'AQ_FILE_NEW_OK'
983 error_event_name = 'AQ_FILE_NEW_ERROR'1005 error_event_name = 'AQ_FILE_NEW_ERROR'
984 client_method = 'make_file'1006 client_method = 'make_file'
@@ -988,6 +1010,7 @@
988 """1010 """
989 Make a directory1011 Make a directory
990 """1012 """
1013 __slots__ = ()
991 ok_event_name = 'AQ_DIR_NEW_OK'1014 ok_event_name = 'AQ_DIR_NEW_OK'
992 error_event_name = 'AQ_DIR_NEW_ERROR'1015 error_event_name = 'AQ_DIR_NEW_ERROR'
993 client_method = 'make_dir'1016 client_method = 'make_dir'
@@ -997,9 +1020,11 @@
997 """1020 """
998 Move a file or directory1021 Move a file or directory
999 """1022 """
1023 __slots__ = ('share_id', 'node_id', 'old_parent_id',
1024 'new_parent_id', 'new_name', 'log')
1000 def __init__(self, action_queue, share_id, node_id, old_parent_id,1025 def __init__(self, action_queue, share_id, node_id, old_parent_id,
1001 new_parent_id, new_name):1026 new_parent_id, new_name):
1002 self.action_queue = action_queue1027 super(Move, self).__init__(action_queue)
1003 self.share_id = share_id1028 self.share_id = share_id
1004 self.node_id = node_id1029 self.node_id = node_id
1005 self.old_parent_id = old_parent_id1030 self.old_parent_id = old_parent_id
@@ -1025,6 +1050,7 @@
1025 self.share_id = share_id1050 self.share_id = share_id
1026 self.node_id = node_id1051 self.node_id = node_id
1027 self.new_parent_id = new_parent_id1052 self.new_parent_id = new_parent_id
1053 self.start_done = True
10281054
1029 def _run(self):1055 def _run(self):
1030 """1056 """
@@ -1060,8 +1086,9 @@
1060 """1086 """
1061 Unlink a file or dir1087 Unlink a file or dir
1062 """1088 """
1089 __slots__ = ('share_id', 'node_id', 'parent_id', 'log')
1063 def __init__(self, action_queue, share_id, parent_id, node_id):1090 def __init__(self, action_queue, share_id, parent_id, node_id):
1064 self.action_queue = action_queue1091 super(Unlink, self).__init__(action_queue)
1065 self.share_id = share_id1092 self.share_id = share_id
1066 self.node_id = node_id1093 self.node_id = node_id
1067 self.parent_id = parent_id1094 self.parent_id = parent_id
@@ -1081,6 +1108,7 @@
1081 self.share_id = share_id1108 self.share_id = share_id
1082 self.node_id = node_id1109 self.node_id = node_id
1083 self.parent_id = parent_id1110 self.parent_id = parent_id
1111 self.start_done = True
10841112
1085 def _run(self):1113 def _run(self):
1086 """1114 """
@@ -1113,12 +1141,13 @@
1113 """1141 """
1114 Ask about the freshness of server hashes1142 Ask about the freshness of server hashes
1115 """1143 """
1144 __slots__ = ('log', 'items')
1116 def __init__(self, action_queue, items):1145 def __init__(self, action_queue, items):
1146 super(Query, self).__init__(action_queue)
1117 self.log = MultiProxy(1147 self.log = MultiProxy(
1118 [mklog(logger, '(unrolled) query', share, node,1148 [mklog(logger, '(unrolled) query', share, node,
1119 share=share, node=node, hash=hash, index=i)1149 share=share, node=node, hash=hash, index=i)
1120 for (i, (share, node, hash)) in enumerate(items)])1150 for (i, (share, node, hash)) in enumerate(items)])
1121 self.action_queue = action_queue
1122 self.items = items1151 self.items = items
11231152
1124 def store_marker_result(self, items):1153 def store_marker_result(self, items):
@@ -1126,6 +1155,7 @@
1126 Called when all the markers are realized.1155 Called when all the markers are realized.
1127 """1156 """
1128 self.items = items1157 self.items = items
1158 self.start_done = True
11291159
1130 def _start(self):1160 def _start(self):
1131 """1161 """
@@ -1169,8 +1199,9 @@
1169 """1199 """
1170 List shares shared to me1200 List shares shared to me
1171 """1201 """
1202 __slots__ = ('log',)
1172 def __init__(self, action_queue):1203 def __init__(self, action_queue):
1173 self.action_queue = action_queue1204 super(ListShares, self).__init__(action_queue)
1174 self.log = mklog(logger, 'list_shares', UNKNOWN, UNKNOWN)1205 self.log = mklog(logger, 'list_shares', UNKNOWN, UNKNOWN)
11751206
1176 def _run(self):1207 def _run(self):
@@ -1198,8 +1229,9 @@
1198 """1229 """
1199 Answer a share offer1230 Answer a share offer
1200 """1231 """
1232 __slots__ = ('share_id', 'answer', 'log')
1201 def __init__(self, action_queue, share_id, answer):1233 def __init__(self, action_queue, share_id, answer):
1202 self.action_queue = action_queue1234 super(AnswerShare, self).__init__(action_queue)
1203 self.share_id = share_id1235 self.share_id = share_id
1204 self.answer = answer1236 self.answer = answer
1205 self.log = mklog(logger, 'answer_share', share_id, UNKNOWN)1237 self.log = mklog(logger, 'answer_share', share_id, UNKNOWN)
@@ -1234,9 +1266,11 @@
1234 """1266 """
1235 Offer a share to somebody1267 Offer a share to somebody
1236 """1268 """
1269 __slots__ = ('node_id', 'share_to', 'name', 'access_level',
1270 'marker', 'log', 'use_http')
1237 def __init__(self, action_queue, node_id, share_to, name, access_level,1271 def __init__(self, action_queue, node_id, share_to, name, access_level,
1238 marker):1272 marker):
1239 self.action_queue = action_queue1273 super(CreateShare, self).__init__(action_queue)
1240 self.node_id = node_id1274 self.node_id = node_id
1241 self.share_to = share_to1275 self.share_to = share_to
1242 self.name = name1276 self.name = name
@@ -1253,6 +1287,7 @@
1253 Called when all the markers are realized.1287 Called when all the markers are realized.
1254 """1288 """
1255 self.node_id = node_id1289 self.node_id = node_id
1290 self.start_done = True
12561291
1257 def _start(self):1292 def _start(self):
1258 """1293 """
@@ -1325,14 +1360,16 @@
1325 error=failure.getErrorMessage())1360 error=failure.getErrorMessage())
13261361
13271362
1328class GetContentMixin(object):1363class GetContentMixin(ActionQueueCommand):
1329 """1364 """
1330 Base for ListDir and Download. It's a mixin (ugh) because1365 Base for ListDir and Download. It's a mixin (ugh) because
1331 otherwise things would be even more confusing1366 otherwise things would be even more confusing
1332 """1367 """
1368 __slots__ = ('share_id', 'node_id', 'server_hash', 'fileobj_factory',
1369 'fileobj', 'gunzip', 'log')
1333 def __init__(self, action_queue, share_id, node_id, server_hash,1370 def __init__(self, action_queue, share_id, node_id, server_hash,
1334 fileobj_factory):1371 fileobj_factory):
1335 self.action_queue = action_queue1372 super(GetContentMixin, self).__init__(action_queue)
1336 self.share_id = share_id1373 self.share_id = share_id
1337 self.node_id = node_id1374 self.node_id = node_id
1338 self.server_hash = server_hash1375 self.server_hash = server_hash
@@ -1356,6 +1393,7 @@
1356 Called when all the markers are realized.1393 Called when all the markers are realized.
1357 """1394 """
1358 self.node_id = node_id1395 self.node_id = node_id
1396 self.start_done = True
13591397
1360 def _run(self):1398 def _run(self):
1361 """1399 """
@@ -1449,22 +1487,27 @@
1449 """1487 """
1450 Get a listing of a directory's contents1488 Get a listing of a directory's contents
1451 """1489 """
1490 __slots__ = ()
14521491
1453class Download(GetContentMixin, ActionQueueContentCommand):1492class Download(GetContentMixin, ActionQueueContentCommand):
1454 """1493 """
1455 Get the contents of a file.1494 Get the contents of a file.
1456 """1495 """
1496 __slots__ = ()
14571497
1458class Upload(ActionQueueContentCommand):1498class Upload(ActionQueueContentCommand):
1459 """1499 """
1460 Upload stuff to a file1500 Upload stuff to a file
1461 """1501 """
1502 __slots__ = ('share_id', 'node_id', 'previous_hash', 'hash', 'crc32',
1503 'size', 'fileobj_factory', 'tempfile_factory', 'deflated_size',
1504 'tempfile', 'cancelled', 'upload_req', 'log')
1462 retryable_errors = (ActionQueueContentCommand.retryable_errors1505 retryable_errors = (ActionQueueContentCommand.retryable_errors
1463 | set(['UPLOAD_IN_PROGRESS']))1506 | set(['UPLOAD_IN_PROGRESS']))
14641507
1465 def __init__(self, action_queue, share_id, node_id, previous_hash, hash,1508 def __init__(self, action_queue, share_id, node_id, previous_hash, hash,
1466 crc32, size, fileobj_factory, tempfile_factory):1509 crc32, size, fileobj_factory, tempfile_factory):
1467 self.action_queue = action_queue1510 super(Upload, self).__init__(action_queue)
1468 self.share_id = share_id1511 self.share_id = share_id
1469 self.node_id = node_id1512 self.node_id = node_id
1470 self.previous_hash = previous_hash1513 self.previous_hash = previous_hash
@@ -1521,7 +1564,7 @@
1521 self.node_id))1564 self.node_id))
1522 self.node_id = node_id1565 self.node_id = node_id
1523 self.action_queue.uploading[self.share_id, node_id] = uploading1566 self.action_queue.uploading[self.share_id, node_id] = uploading
15241567 self.start_done = True
15251568
1526 def _run(self):1569 def _run(self):
1527 """1570 """
15281571
=== modified file 'ubuntuone/syncdaemon/fsm/fsm.py'
--- ubuntuone/syncdaemon/fsm/fsm.py 2009-07-02 20:22:51 +0000
+++ ubuntuone/syncdaemon/fsm/fsm.py 2009-08-10 16:20:52 +0000
@@ -469,6 +469,9 @@
469 get one of these. with the corresponding attributes for all sections469 get one of these. with the corresponding attributes for all sections
470 and event name.470 and event name.
471 """471 """
472 __slots__ = ('event', 'line', 'source',
473 'target', 'parameters', 'action_func')
474
472 def __init__(self, event, line):475 def __init__(self, event, line):
473 """Create a transition for event event from line.476 """Create a transition for event event from line.
474477
475478
=== modified file 'ubuntuone/syncdaemon/logger.py'
--- ubuntuone/syncdaemon/logger.py 2009-08-06 02:03:53 +0000
+++ ubuntuone/syncdaemon/logger.py 2009-08-10 16:48:13 +0000
@@ -23,6 +23,7 @@
23import sys23import sys
24import os24import os
25import re25import re
26import zlib
26import xdg.BaseDirectory27import xdg.BaseDirectory
2728
28from logging.handlers import TimedRotatingFileHandler29from logging.handlers import TimedRotatingFileHandler
@@ -40,9 +41,15 @@
4041
41class Logger(logging.Logger):42class Logger(logging.Logger):
42 """Logger that support out custom levels."""43 """Logger that support out custom levels."""
44 def note(self, msg, *args, **kwargs):
45 """log at NOTE level"""
46 if self.isEnabledFor(NOTE):
47 self._log(NOTE, msg, args, **kwargs)
4348
44 note = functools.partial(logging.Logger.log, NOTE)49 def trace(self, msg, *args, **kwargs):
45 trace = functools.partial(logging.Logger.log, TRACE)50 """log at TRACE level"""
51 if self.isEnabledFor(TRACE):
52 self._log(TRACE, msg, args, **kwargs)
4653
47# use our logger as the default Logger class54# use our logger as the default Logger class
48logging.setLoggerClass(Logger)55logging.setLoggerClass(Logger)
@@ -72,6 +79,7 @@
72 Create a logger that keeps track of the method where it's being79 Create a logger that keeps track of the method where it's being
73 called from, in order to make more informative messages.80 called from, in order to make more informative messages.
74 """81 """
82 __slots__ = ('logger', 'zipped_desc')
75 def __init__(self, _logger, _method, _share, _uid, *args, **kwargs):83 def __init__(self, _logger, _method, _share, _uid, *args, **kwargs):
76 # args are _-prepended to lower the chances of them84 # args are _-prepended to lower the chances of them
77 # conflicting with kwargs85 # conflicting with kwargs
@@ -86,9 +94,15 @@
86 all_args.append("%s=%r" % (k, v))94 all_args.append("%s=%r" % (k, v))
87 args = ", ".join(all_args)95 args = ", ".join(all_args)
8896
89 self.desc = "%-28s share:%-40r node:%-40r %s(%s) " \97 desc = "%-28s share:%-40r node:%-40r %s(%s) " % (_method, _share,
90 % (_method, _share, _uid, _method, args)98 _uid, _method, args)
99 self.zipped_desc = zlib.compress(desc, 9)
91 self.logger = _logger100 self.logger = _logger
101
102 @property
103 def desc(self):
104 return zlib.decompress(self.zipped_desc)
105
92 def _log(self, logger_func, *args):106 def _log(self, logger_func, *args):
93 """107 """
94 Generalized form of the different logging functions.108 Generalized form of the different logging functions.
95109
=== modified file 'ubuntuone/syncdaemon/main.py'
--- ubuntuone/syncdaemon/main.py 2009-08-06 04:24:04 +0000
+++ ubuntuone/syncdaemon/main.py 2009-08-08 19:03:32 +0000
@@ -96,7 +96,7 @@
9696
97 def log_mark(self):97 def log_mark(self):
98 """ log a "mark" that includes the current AQ state and queue size"""98 """ log a "mark" that includes the current AQ state and queue size"""
99 self.logger.info("%s %s (state: %s; queues: metadata: %d; content: %d;"99 self.logger.note("%s %s (state: %s; queues: metadata: %d; content: %d;"
100 " hash: %d, fsm-cache: hit=%d miss=%d) %s" % ('-'*4,100 " hash: %d, fsm-cache: hit=%d miss=%d) %s" % ('-'*4,
101 'MARK', self.state.name,101 'MARK', self.state.name,
102 len(self.action_q.meta_queue),102 len(self.action_q.meta_queue),
103103
=== modified file 'ubuntuone/syncdaemon/sync.py'
--- ubuntuone/syncdaemon/sync.py 2009-08-04 20:00:44 +0000
+++ ubuntuone/syncdaemon/sync.py 2009-08-08 19:03:32 +0000
@@ -37,6 +37,7 @@
37class FSKey(object):37class FSKey(object):
38 """This encapsulates the problem of getting the metadata with different38 """This encapsulates the problem of getting the metadata with different
39 keys."""39 keys."""
40 __slots__ = ('fs', 'keys')
4041
41 def __init__(self, fs, **keys):42 def __init__(self, fs, **keys):
42 """create"""43 """create"""
@@ -204,6 +205,7 @@
204205
205class FileLogger(object):206class FileLogger(object):
206 """A logger that knows about the file and its state."""207 """A logger that knows about the file and its state."""
208 __slots__ = ('logger', 'key')
207209
208 def __init__(self, logger, key):210 def __init__(self, logger, key):
209 """Create a logger for this guy"""211 """Create a logger for this guy"""

Subscribers

People subscribed via source and target branches