`byexample` can execute the examples of the given file in parallel (or
concurrently, to be more precise).

By default only one file is processed at a time but more can be added
with the `--jobs` command line option.

But exactly how this is done was never officially documented.

This document describes how `byexample` implements `--jobs` and how
that could affect the implementation of modules/plugins.

## Some history

The copies were done using a `fork` method. This was a very simple
method supported by Linux and MacOS.

But `fork` didn't work well in every case (especially under MacOS)
and in `byexample 10.0.0` we decided to change the concurrency model
from `multiprocessing` to `multithreading`.

On the other hand, due to how Python works, you may lose a little of
parallelism.

The good news is that this method is well supported in Linux, MacOS and
even in Windows.
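
To see why threads make sharing easier, here is a minimal sketch using plain Python `threading` (this is *not* byexample's internals; all the names here are invented for the example):

```python
import threading

results = {}                 # an ordinary dict, visible to every thread
results_lock = threading.Lock()

def worker(job_number):
    # threads live in the same process, so they can write into the
    # same dict directly: no pickling, no inter-process communication
    with results_lock:
        results[job_number] = job_number * 10

threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results.items()))   # [(0, 0), (1, 10), (2, 20), (3, 30)]
```

With `multiprocessing` the same dict would have to be shared explicitly (via a manager or pipes); with threads, only the lock is needed.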

## The N+1 creation rule
Something like this:

```python
...     self.longest = max(self.longest, elapsed)
```

Because `MeasureTime` is instantiated *once per worker*, `self.longest`
will have the elapsed time of the slowest example *of that particular worker*.

But what if you want to have a *global* view and see the slowest example
of *all the workers*?

You need to *share* information among the workers, so we need to modify
`MeasureTime` a little.

First we need a *shared dictionary* to store the slowest example per worker,
a *lock* to synchronize the access, and a `job_number` to identify each worker.

This is the modified `__init__` of `MeasureTime`:

```python
>>> def __init__(self, sharer, ns, job_number, **kargs):
...     self.begin = 0
...
...     if sharer is not None:
...         # we are in the main thread, we can use the sharer
...         # and we can **store** things in the namespace
...         ns.elapsed_times_by_worker = sharer.dict()
...         ns.lock = sharer.RLock()
...
...     else:
...         # we are in the worker thread, save the job/worker number
...         self.my_number = job_number
...
...         # keep a private reference to the dictionary and lock
...         # created above.
...         # these are *shared* among other instances of MeasureTime
...         # so we must use them with care.
...         #
...         # note that the namespace is **read-only** here
...         self.elapsed_times_by_worker = ns.elapsed_times_by_worker
...         self.lock = ns.lock
```

Now, in `end_example` we need to store the longest elapsed time
*among* all the workers (among all the instances of `MeasureTime`):

```python
>>> def end_example(self, *args):
...     elapsed = time.time() - self.begin
...     self.longest = max(self.longest, elapsed)
...
...     my_longest = self.longest
...     with self.lock:
...         self.elapsed_times_by_worker[self.my_number] = my_longest
```

Because `elapsed_times_by_worker` is *shared* we need to access it
atomically. For this we take the `lock` first.

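The whole pattern can be condensed into a plain-`threading` sketch. Here the `sharer`/`ns` plumbing is replaced by module-level objects, the class name is invented, and the elapsed times are fed in by hand instead of being measured:

```python
import threading

lock = threading.RLock()
elapsed_times_by_worker = {}   # worker number -> its slowest example so far

class MiniMeasure:
    def __init__(self, job_number):
        self.my_number = job_number
        self.longest = 0

    def end_example(self, elapsed):
        # update the per-worker maximum first, then publish it atomically
        self.longest = max(self.longest, elapsed)
        with lock:
            elapsed_times_by_worker[self.my_number] = self.longest

workers = [MiniMeasure(n) for n in range(3)]
workers[0].end_example(1.5)
workers[1].end_example(0.2)
workers[2].end_example(3.1)

# the global view: the slowest example among *all* the workers
with lock:
    print(max(elapsed_times_by_worker.values()))   # 3.1
```

Each instance keeps its own `longest` (no lock needed for that) and only the publication into the shared dict is synchronized.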
The standard [byexample/modules/progress.py](https://github.com/byexamples/byexample/tree/master/byexample/modules/progress.py)
is also an example of this: there the `Concern` uses an `RLock` to
synchronize the access to the standard output.
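
A minimal sketch of that idea (illustrative only, not the actual `progress.py` code; the function name is made up):

```python
import sys
import threading

output_lock = threading.RLock()

def report(worker_id, msg):
    # without the lock, writes from different workers could interleave
    # and produce garbled lines on the terminal
    with output_lock:
        sys.stdout.write("[worker %d] %s\n" % (worker_id, msg))
        sys.stdout.flush()

report(1, "example passed")   # prints "[worker 1] example passed"
```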

In the future `multiprocessing` may be re-enabled again.
That's the main reason for using `sharer` and `namespace`: if you use
them in your classes your code will support any concurrency model out of
the box.

## Caveats on using `multiprocessing` **within** a module/plugin

`byexample` does not impose any restriction on how *your* module/plugin
may (or may not) use `multithreading` and/or `multiprocessing` **internally**.

How `--jobs` works is **independent** of that.

However, using `multiprocessing` **within** a module/plugin has some
caveats.

When `multiprocessing.Process` (or similar) is used, the main Python
process (`byexample`) spawns a fresh Python process to run whatever you
want in parallel.

Take the following `Concern` that runs a class method in the background
while `byexample` is executing an example:

```python
>>> from byexample.concern import Concern
>>> import multiprocessing

>>> class Some(Concern):
...     target = 'some'
...
...     @classmethod
...     def watch_in_bg(cls, num):
...         # this will be executed in background, in parallel
...         pass
...
...     def start_example(self, *args):
...         self.child = multiprocessing.Process(
...             target=Some.watch_in_bg,
...             args=(42,)
...         )
...         self.child.start()  # This will fail!!
...
...     def end_example(self, *args):
...         self.child.join()
```

Why will it fail?

This fresh child process will not have the modules/plugins that
`byexample` loaded dynamically, so it will likely fail even before
executing the class method `watch_in_bg`, because the module where
`Some.watch_in_bg` lives is not loaded.
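
The root cause is easy to see outside of `byexample`: with the `spawn` start method, `multiprocessing` pickles the target function *by reference*, not by value. A small plain-Python demonstration (the function name just mirrors the example above):

```python
import pickle

def watch_in_bg(num):
    pass

# pickling a function does not serialize its code; it only records
# *where to find it*: its module name and qualified name
data = pickle.dumps(watch_in_bg)

# unpickling works here because this process can import that module...
assert pickle.loads(data) is watch_in_bg

# ...but a spawned child that cannot import the module fails while
# doing the equivalent of pickle.loads(data)
```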

You may see an error like this:

```python
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<...>/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "<...>/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
ModuleNotFoundError: No module named '<...>'
```

> Note: calling `multiprocessing.Process` will not fail if you are on
> Linux; however, you should not develop your plugin under that
> assumption. Keep reading!

Since `10.5.1`, `byexample` offers you a mechanism to call
`multiprocessing.Process` safely.

You need to wrap the function with `prepare_subprocess_call`:

```python
>>> from byexample.concern import Concern
>>> import multiprocessing

>>> class Some(Concern):
...     target = 'some'
...
...     def __init__(self, prepare_subprocess_call, **kargs):
...         # keep a reference to this helper function
...         self.prepare_subprocess_call = prepare_subprocess_call
...
...     @classmethod
...     def watch_in_bg(cls, num):
...         # this will be executed in background, in parallel
...         pass
...
...     def start_example(self, *args):
...         # prepare_subprocess_call takes the 'target' function
...         # and the optional 'args' and 'kwargs' arguments,
...         # just like multiprocessing.Process does.
...         #
...         # it returns a dictionary that can be unpacked
...         # with the double '**' directly into the
...         # multiprocessing.Process call
...         self.child = multiprocessing.Process(
...             **self.prepare_subprocess_call(
...                 target=Some.watch_in_bg,
...                 args=(42,)
...             )
...         )
...         self.child.start()  # Start the child process as usual
...
...     def end_example(self, *args):
...         self.child.join()
```

I wrote a
[blog post](https://book-of-gehn.github.io/articles/2022/03/06/Multiprocessing-Spawn-of-Dynamically-Imported-Code.html)
about the issues of using `multiprocessing` with dynamically imported code.
If you want to see the dirty details behind `prepare_subprocess_call`,
you can check the commit
[b263ba76](https://github.com/byexamples/byexample/commit/b263ba76271e447a2faed6f0517f71b74d96ab81).