Merge lp:~chipaca/ubuntuone-client/lean-action-queue-commands into lp:ubuntuone-client
Status: Merged
Approved by: Eric Casteleijn
Approved revision: 133
Merged at revision: not available
Proposed branch: lp:~chipaca/ubuntuone-client/lean-action-queue-commands
Merge into: lp:ubuntuone-client
To merge this branch: bzr merge lp:~chipaca/ubuntuone-client/lean-action-queue-commands
Related bugs: none
Reviewer | Status
---|---
Eric Casteleijn (community) | Approve
John O'Brien (community) | Approve

Review via email: mp+9931@code.launchpad.net
Commit message
improve worst-case memory use, mostly via __slots__
Description of the change
John Lenton (chipaca) wrote:
- 129. By John Lenton: no longer needing functools (thanks jdo!)
- 130. By John Lenton: merged with trunk
John O'Brien (jdobrien) wrote:
Sadly I can't approve these changes as they break trunk.
Lots of errors: https:/
Elliot Murphy (statik) wrote:
> Sadly I can't approve these changes as they break trunk.
>
> Lots of errors: https:/
I discussed with John Lenton, these changes expose some broken tests on trunk, so we'll need to wait to land this branch until there is a corresponding trunk branch, and then coordinate the landing to minimize PQM interruption. This branch won't be included in the Aug 11 client rollout. Yay for huge memory use reduction though!
- 131. By John Lenton: merged with trunk
- 132. By John Lenton: adding new attributes to __slots__
- 133. By John Lenton: when refactoring a chain of callbacks, it pays to follow the chain of errbacks too.
John Lenton (chipaca) wrote:
It turns out the tests weren't broken; they were exposing brokenness in the client! wow, imagine that!
I believe I've fixed it now.
John O'Brien (jdobrien) wrote:
This no longer breaks trunk! w00t!
Eric Casteleijn (thisfred) wrote:
Tests pass, code looks good. Out of curiosity: I've never used __slots__ before, or seen it used much, so for educational purposes: what does it do here? I understand it can be used to save memory in cases where you have a huge number of class instances. Is that the use case here, or is there another reason to use it? (please ignore if you don't have time, I'm really just curious, and this is not a polite/devious way to question the code ;)
Nicola Larosa (teknico) wrote:
> Out of curiosity: I've never used __slots__
> before, or seen it used much, so for educational purposes: what does it do
> here? I understand it can be used to save memory in cases where you have a
> huge number of class instances. Is that the use case here, or is there another
> reason to use it?
__slots__ forces an object's attributes to be statically predefined instead of dynamic: it does away with __dict__, which has the double effect of conserving memory and blocking dynamic addition/deletion of attributes.
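A minimal illustration of what Nicola describes (not code from the branch): without __slots__ every instance carries its own __dict__; with it, the attribute set is fixed when the class is created.

```python
class Plain(object):
    def __init__(self):
        self.a = 1

class Slotted(object):
    __slots__ = ('a',)          # the only attribute instances may have

    def __init__(self):
        self.a = 1

p = Plain()
p.extra = 2                     # fine: lands in p.__dict__

s = Slotted()
try:
    s.extra = 2                 # AttributeError: there is no __dict__
except AttributeError:
    pass
```

Note that a slotted subclass of a dict-bearing base still gets a __dict__ from the base; that is why the branch adds `__slots__ = ()` even to subclasses that declare no new attributes of their own.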
John Lenton (chipaca) wrote:
On Wed, Aug 12, 2009 at 01:44:55PM -0000, Eric Casteleijn wrote:
> Review: Approve
> Tests pass, code looks good. Out of curiosity: I've never used __slots__ before, or seen it used much, so for educational purposes: what does it do here? I understand it can be used to save memory in cases where you have a huge number of class instances. Is that the use case here, or is there another reason to use it? (please ignore if you don't have time, I'm really just curious, and this is not a polite/devious way to question the code ;)
in worst-case scenarios there can easily be tens of thousands of
instances of these classes. The test case which showed a 33%
improvement in memory use with this change was based on creating 10k
files while disconnected; memory usage is pretty much linear in that
case (you end up with 10k MakeFile and 10k Upload objects). For reference,
the 'drivers' directory of the linux kernel is at ~9k files; a
checkout of the bazaar trunk is only ~1k files, so the impact of this
branch on that would be less.
Still, the worst case is the one we want to keep from killing your
computer, right?
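The shape of the change, going by the preview diff, is roughly the following sketch (class and attribute names simplified from action_queue.py, argument values hypothetical): the base command declares its slots once, and each subclass lists only the attributes it adds.

```python
class ActionQueueCommand(object):
    # the base declares its own attributes; no instance __dict__ exists
    __slots__ = ('action_queue', 'start_done', 'running')

    def __init__(self, action_queue):
        self.action_queue = action_queue
        self.start_done = False
        self.running = False

class MakeFile(ActionQueueCommand):
    # subclasses extend the slot list rather than repeating the base's
    __slots__ = ('share_id', 'parent_id', 'name')

    def __init__(self, action_queue, share_id, parent_id, name):
        super(MakeFile, self).__init__(action_queue)
        self.share_id = share_id
        self.parent_id = parent_id
        self.name = name
```

With tens of thousands of such instances alive at once, dropping the per-instance dicts is where savings on that scale come from.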
Eric Casteleijn (thisfred) wrote:
Thanks Nicola and John. I see now that the merge proposal already answered my question, sorry for being oblivious. But thanks for expanding!
Preview Diff
1 | === modified file 'ubuntuone/syncdaemon/action_queue.py' |
2 | --- ubuntuone/syncdaemon/action_queue.py 2009-07-30 18:23:53 +0000 |
3 | +++ ubuntuone/syncdaemon/action_queue.py 2009-08-08 19:03:32 +0000 |
4 | @@ -445,6 +445,7 @@ |
5 | we can keep track of the number of bytes that have been written to |
6 | the store. |
7 | """ |
8 | + __slots__ = ('fd', 'data_dict', 'n_bytes_read') |
9 | def __init__(self, fd, data_dict): |
10 | """ |
11 | fd is the file-like object used for uploads. data_dict is the |
12 | @@ -741,7 +742,11 @@ |
13 | retryable_errors = set(['Cleaned up', 'TRY_AGAIN', |
14 | 'Connection was closed cleanly.']) |
15 | |
16 | - start_done = False |
17 | + __slots__ = ('action_queue', 'start_done', 'running') |
18 | + def __init__(self, action_queue): |
19 | + self.action_queue = action_queue |
20 | + self.start_done = False |
21 | + self.running = False |
22 | |
23 | def demark(self, *maybe_markers): |
24 | """ |
25 | @@ -848,11 +853,22 @@ |
26 | else: |
27 | d = self._start() |
28 | d.addCallback(self.store_marker_result) |
29 | - d.addCallback(lambda _: setattr(self, 'start_done', True)) |
30 | - d.addCallback(lambda _: self.log.debug('running')) |
31 | - d.addCallback(lambda _: self.running and self._run()) |
32 | + d.addCallback(self._ready_to_run) |
33 | + d.addCallback(self._done_running) |
34 | + return d |
35 | + |
36 | + def _done_running(self, x): |
37 | + self.running = False |
38 | + return x |
39 | + |
40 | + def _ready_to_run(self, _): |
41 | + self.log.debug('running') |
42 | + if self.running: |
43 | + d = self._run() |
44 | + else: |
45 | + d = defer.succeed(None) |
46 | d.addCallbacks(self.end_callback, self.end_errback) |
47 | - d.addBoth(passit(lambda _: setattr(self, 'running', False))) |
48 | + d.addBoth(self._done_running) |
49 | return d |
50 | |
51 | def handle_success(self, success): |
52 | @@ -890,6 +906,7 @@ |
53 | Base of metadata-related commands (commands that are queued in the |
54 | meta queue) |
55 | """ |
56 | + __slots__ = () |
57 | @property |
58 | def _queue(self): |
59 | """ |
60 | @@ -903,6 +920,7 @@ |
61 | Base of content-related commands (commands that are queued in the |
62 | content queue) |
63 | """ |
64 | + __slots__ = () |
65 | @property |
66 | def _queue(self): |
67 | """ |
68 | @@ -915,8 +933,10 @@ |
69 | """ |
70 | Base of MakeFile and MakeDir |
71 | """ |
72 | + __slots__ = ('share_id', 'parent_id', 'name', 'marker', 'log') |
73 | + |
74 | def __init__(self, action_queue, share_id, parent_id, name, marker): |
75 | - self.action_queue = action_queue |
76 | + super(MakeThing, self).__init__(action_queue) |
77 | self.share_id = share_id |
78 | self.parent_id = parent_id |
79 | # Unicode boundary! the name is Unicode in protocol and server, but |
80 | @@ -939,6 +959,7 @@ |
81 | """ |
82 | self.share_id = share_id |
83 | self.parent_id = parent_id |
84 | + self.start_done = True |
85 | |
86 | def _run(self): |
87 | """ |
88 | @@ -979,6 +1000,7 @@ |
89 | """ |
90 | Make a file |
91 | """ |
92 | + __slots__ = () |
93 | ok_event_name = 'AQ_FILE_NEW_OK' |
94 | error_event_name = 'AQ_FILE_NEW_ERROR' |
95 | client_method = 'make_file' |
96 | @@ -988,6 +1010,7 @@ |
97 | """ |
98 | Make a directory |
99 | """ |
100 | + __slots__ = () |
101 | ok_event_name = 'AQ_DIR_NEW_OK' |
102 | error_event_name = 'AQ_DIR_NEW_ERROR' |
103 | client_method = 'make_dir' |
104 | @@ -997,9 +1020,11 @@ |
105 | """ |
106 | Move a file or directory |
107 | """ |
108 | + __slots__ = ('share_id', 'node_id', 'old_parent_id', |
109 | + 'new_parent_id', 'new_name', 'log') |
110 | def __init__(self, action_queue, share_id, node_id, old_parent_id, |
111 | new_parent_id, new_name): |
112 | - self.action_queue = action_queue |
113 | + super(Move, self).__init__(action_queue) |
114 | self.share_id = share_id |
115 | self.node_id = node_id |
116 | self.old_parent_id = old_parent_id |
117 | @@ -1025,6 +1050,7 @@ |
118 | self.share_id = share_id |
119 | self.node_id = node_id |
120 | self.new_parent_id = new_parent_id |
121 | + self.start_done = True |
122 | |
123 | def _run(self): |
124 | """ |
125 | @@ -1060,8 +1086,9 @@ |
126 | """ |
127 | Unlink a file or dir |
128 | """ |
129 | + __slots__ = ('share_id', 'node_id', 'parent_id', 'log') |
130 | def __init__(self, action_queue, share_id, parent_id, node_id): |
131 | - self.action_queue = action_queue |
132 | + super(Unlink, self).__init__(action_queue) |
133 | self.share_id = share_id |
134 | self.node_id = node_id |
135 | self.parent_id = parent_id |
136 | @@ -1081,6 +1108,7 @@ |
137 | self.share_id = share_id |
138 | self.node_id = node_id |
139 | self.parent_id = parent_id |
140 | + self.start_done = True |
141 | |
142 | def _run(self): |
143 | """ |
144 | @@ -1113,12 +1141,13 @@ |
145 | """ |
146 | Ask about the freshness of server hashes |
147 | """ |
148 | + __slots__ = ('log', 'items') |
149 | def __init__(self, action_queue, items): |
150 | + super(Query, self).__init__(action_queue) |
151 | self.log = MultiProxy( |
152 | [mklog(logger, '(unrolled) query', share, node, |
153 | share=share, node=node, hash=hash, index=i) |
154 | for (i, (share, node, hash)) in enumerate(items)]) |
155 | - self.action_queue = action_queue |
156 | self.items = items |
157 | |
158 | def store_marker_result(self, items): |
159 | @@ -1126,6 +1155,7 @@ |
160 | Called when all the markers are realized. |
161 | """ |
162 | self.items = items |
163 | + self.start_done = True |
164 | |
165 | def _start(self): |
166 | """ |
167 | @@ -1169,8 +1199,9 @@ |
168 | """ |
169 | List shares shared to me |
170 | """ |
171 | + __slots__ = ('log',) |
172 | def __init__(self, action_queue): |
173 | - self.action_queue = action_queue |
174 | + super(ListShares, self).__init__(action_queue) |
175 | self.log = mklog(logger, 'list_shares', UNKNOWN, UNKNOWN) |
176 | |
177 | def _run(self): |
178 | @@ -1198,8 +1229,9 @@ |
179 | """ |
180 | Answer a share offer |
181 | """ |
182 | + __slots__ = ('share_id', 'answer', 'log') |
183 | def __init__(self, action_queue, share_id, answer): |
184 | - self.action_queue = action_queue |
185 | + super(AnswerShare, self).__init__(action_queue) |
186 | self.share_id = share_id |
187 | self.answer = answer |
188 | self.log = mklog(logger, 'answer_share', share_id, UNKNOWN) |
189 | @@ -1234,9 +1266,11 @@ |
190 | """ |
191 | Offer a share to somebody |
192 | """ |
193 | + __slots__ = ('node_id', 'share_to', 'name', 'access_level', |
194 | + 'marker', 'log', 'use_http') |
195 | def __init__(self, action_queue, node_id, share_to, name, access_level, |
196 | marker): |
197 | - self.action_queue = action_queue |
198 | + super(CreateShare, self).__init__(action_queue) |
199 | self.node_id = node_id |
200 | self.share_to = share_to |
201 | self.name = name |
202 | @@ -1253,6 +1287,7 @@ |
203 | Called when all the markers are realized. |
204 | """ |
205 | self.node_id = node_id |
206 | + self.start_done = True |
207 | |
208 | def _start(self): |
209 | """ |
210 | @@ -1325,14 +1360,16 @@ |
211 | error=failure.getErrorMessage()) |
212 | |
213 | |
214 | -class GetContentMixin(object): |
215 | +class GetContentMixin(ActionQueueCommand): |
216 | """ |
217 | Base for ListDir and Download. It's a mixin (ugh) because |
218 | otherwise things would be even more confusing |
219 | """ |
220 | + __slots__ = ('share_id', 'node_id', 'server_hash', 'fileobj_factory', |
221 | + 'fileobj', 'gunzip', 'log') |
222 | def __init__(self, action_queue, share_id, node_id, server_hash, |
223 | fileobj_factory): |
224 | - self.action_queue = action_queue |
225 | + super(GetContentMixin, self).__init__(action_queue) |
226 | self.share_id = share_id |
227 | self.node_id = node_id |
228 | self.server_hash = server_hash |
229 | @@ -1356,6 +1393,7 @@ |
230 | Called when all the markers are realized. |
231 | """ |
232 | self.node_id = node_id |
233 | + self.start_done = True |
234 | |
235 | def _run(self): |
236 | """ |
237 | @@ -1449,22 +1487,27 @@ |
238 | """ |
239 | Get a listing of a directory's contents |
240 | """ |
241 | + __slots__ = () |
242 | |
243 | class Download(GetContentMixin, ActionQueueContentCommand): |
244 | """ |
245 | Get the contents of a file. |
246 | """ |
247 | + __slots__ = () |
248 | |
249 | class Upload(ActionQueueContentCommand): |
250 | """ |
251 | Upload stuff to a file |
252 | """ |
253 | + __slots__ = ('share_id', 'node_id', 'previous_hash', 'hash', 'crc32', |
254 | + 'size', 'fileobj_factory', 'tempfile_factory', 'deflated_size', |
255 | + 'tempfile', 'cancelled', 'upload_req', 'log') |
256 | retryable_errors = (ActionQueueContentCommand.retryable_errors |
257 | | set(['UPLOAD_IN_PROGRESS'])) |
258 | |
259 | def __init__(self, action_queue, share_id, node_id, previous_hash, hash, |
260 | crc32, size, fileobj_factory, tempfile_factory): |
261 | - self.action_queue = action_queue |
262 | + super(Upload, self).__init__(action_queue) |
263 | self.share_id = share_id |
264 | self.node_id = node_id |
265 | self.previous_hash = previous_hash |
266 | @@ -1521,7 +1564,7 @@ |
267 | self.node_id)) |
268 | self.node_id = node_id |
269 | self.action_queue.uploading[self.share_id, node_id] = uploading |
270 | - |
271 | + self.start_done = True |
272 | |
273 | def _run(self): |
274 | """ |
275 | |
276 | === modified file 'ubuntuone/syncdaemon/fsm/fsm.py' |
277 | --- ubuntuone/syncdaemon/fsm/fsm.py 2009-07-02 20:22:51 +0000 |
278 | +++ ubuntuone/syncdaemon/fsm/fsm.py 2009-08-10 16:20:52 +0000 |
279 | @@ -469,6 +469,9 @@ |
280 | get one of these. with the corresponding attributes for all sections |
281 | and event name. |
282 | """ |
283 | + __slots__ = ('event', 'line', 'source', |
284 | + 'target', 'parameters', 'action_func') |
285 | + |
286 | def __init__(self, event, line): |
287 | """Create a transition for event event from line. |
288 | |
289 | |
290 | === modified file 'ubuntuone/syncdaemon/logger.py' |
291 | --- ubuntuone/syncdaemon/logger.py 2009-08-06 02:03:53 +0000 |
292 | +++ ubuntuone/syncdaemon/logger.py 2009-08-10 16:48:13 +0000 |
293 | @@ -23,6 +23,7 @@ |
294 | import sys |
295 | import os |
296 | import re |
297 | +import zlib |
298 | import xdg.BaseDirectory |
299 | |
300 | from logging.handlers import TimedRotatingFileHandler |
301 | @@ -40,9 +41,15 @@ |
302 | |
303 | class Logger(logging.Logger): |
304 | """Logger that support out custom levels.""" |
305 | + def note(self, msg, *args, **kwargs): |
306 | + """log at NOTE level""" |
307 | + if self.isEnabledFor(NOTE): |
308 | + self._log(NOTE, msg, args, **kwargs) |
309 | |
310 | - note = functools.partial(logging.Logger.log, NOTE) |
311 | - trace = functools.partial(logging.Logger.log, TRACE) |
312 | + def trace(self, msg, *args, **kwargs): |
313 | + """log at TRACE level""" |
314 | + if self.isEnabledFor(TRACE): |
315 | + self._log(TRACE, msg, args, **kwargs) |
316 | |
317 | # use our logger as the default Logger class |
318 | logging.setLoggerClass(Logger) |
319 | @@ -72,6 +79,7 @@ |
320 | Create a logger that keeps track of the method where it's being |
321 | called from, in order to make more informative messages. |
322 | """ |
323 | + __slots__ = ('logger', 'zipped_desc') |
324 | def __init__(self, _logger, _method, _share, _uid, *args, **kwargs): |
325 | # args are _-prepended to lower the chances of them |
326 | # conflicting with kwargs |
327 | @@ -86,9 +94,15 @@ |
328 | all_args.append("%s=%r" % (k, v)) |
329 | args = ", ".join(all_args) |
330 | |
331 | - self.desc = "%-28s share:%-40r node:%-40r %s(%s) " \ |
332 | - % (_method, _share, _uid, _method, args) |
333 | + desc = "%-28s share:%-40r node:%-40r %s(%s) " % (_method, _share, |
334 | + _uid, _method, args) |
335 | + self.zipped_desc = zlib.compress(desc, 9) |
336 | self.logger = _logger |
337 | + |
338 | + @property |
339 | + def desc(self): |
340 | + return zlib.decompress(self.zipped_desc) |
341 | + |
342 | def _log(self, logger_func, *args): |
343 | """ |
344 | Generalized form of the different logging functions. |
345 | |
346 | === modified file 'ubuntuone/syncdaemon/main.py' |
347 | --- ubuntuone/syncdaemon/main.py 2009-08-06 04:24:04 +0000 |
348 | +++ ubuntuone/syncdaemon/main.py 2009-08-08 19:03:32 +0000 |
349 | @@ -96,7 +96,7 @@ |
350 | |
351 | def log_mark(self): |
352 | """ log a "mark" that includes the current AQ state and queue size""" |
353 | - self.logger.info("%s %s (state: %s; queues: metadata: %d; content: %d;" |
354 | + self.logger.note("%s %s (state: %s; queues: metadata: %d; content: %d;" |
355 | " hash: %d, fsm-cache: hit=%d miss=%d) %s" % ('-'*4, |
356 | 'MARK', self.state.name, |
357 | len(self.action_q.meta_queue), |
358 | |
359 | === modified file 'ubuntuone/syncdaemon/sync.py' |
360 | --- ubuntuone/syncdaemon/sync.py 2009-08-04 20:00:44 +0000 |
361 | +++ ubuntuone/syncdaemon/sync.py 2009-08-08 19:03:32 +0000 |
362 | @@ -37,6 +37,7 @@ |
363 | class FSKey(object): |
364 | """This encapsulates the problem of getting the metadata with different |
365 | keys.""" |
366 | + __slots__ = ('fs', 'keys') |
367 | |
368 | def __init__(self, fs, **keys): |
369 | """create""" |
370 | @@ -204,6 +205,7 @@ |
371 | |
372 | class FileLogger(object): |
373 | """A logger that knows about the file and its state.""" |
374 | + __slots__ = ('logger', 'key') |
375 | |
376 | def __init__(self, logger, key): |
377 | """Create a logger for this guy""" |
This makes a lot of classes of which we have many instances start using __slots__, as well as some minor refactoring and compressing some large strings we store for debugging purposes. The end result is bringing down memory usage by ∼⅓ on a test run of disconnected creation of 10k files (of what's left, ∼⅓ is twisted, ∼⅓ is the syncdaemon itself, and ∼⅓ are the 10k operations).
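The string-compression half of that works roughly like this sketch (names assumed, and rendered for Python 3 with explicit encode/decode; the branch's Python 2 mklog compresses the str directly). The formatted description is stored zlib-compressed and inflated lazily via a property:

```python
import zlib

class LogContext(object):
    __slots__ = ('logger', 'zipped_desc')

    def __init__(self, logger, desc):
        self.logger = logger
        # the descriptions are long and repetitive, so they deflate well;
        # keep only the compressed form resident in memory
        self.zipped_desc = zlib.compress(desc.encode('utf-8'), 9)

    @property
    def desc(self):
        # pay the decompression cost only when a line is actually logged
        return zlib.decompress(self.zipped_desc).decode('utf-8')
```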
Also included is a small fix for the .note and .trace loggers, which were broken.
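They were broken because a `functools.partial(logging.Logger.log, NOTE)` stored as a class attribute never binds `self`, so the level constant ends up in the `self` position when called. The diff replaces them with real methods; a self-contained sketch of that fix follows (the NOTE and TRACE numeric values here are assumptions, not taken from the branch):

```python
import logging

NOTE = logging.INFO + 5   # assumed level values for illustration
TRACE = 5
logging.addLevelName(NOTE, 'NOTE')
logging.addLevelName(TRACE, 'TRACE')

class Logger(logging.Logger):
    """Logger that supports our custom levels."""

    def note(self, msg, *args, **kwargs):
        """log at NOTE level"""
        if self.isEnabledFor(NOTE):
            self._log(NOTE, msg, args, **kwargs)

    def trace(self, msg, *args, **kwargs):
        """log at TRACE level"""
        if self.isEnabledFor(TRACE):
            self._log(TRACE, msg, args, **kwargs)

# use our logger as the default Logger class
logging.setLoggerClass(Logger)
```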