viscid.parallel module

common tools for parallel processing

viscid.parallel.chunk_list(seq, nchunks, size=None)[source]

Chunk a list

Slice seq into nchunks chunks; seq can be anything sliceable, such as a list or a numpy array. These chunks will be ‘contiguous’; see chunk_interslices() for picking every nth element instead.

Parameters:
  • size – if given, set nchunks such that chunks have about ‘size’ elements
Returns:
  nchunks slices, each of length N or N - 1, where N = ceil(len(seq) / nchunks)

See also

Use chunk_iterator() to chunk up iterators

Example

>>> it1, it2, it3 = chunk_list(range(8), 3)
>>> it1 == range(0, 3)  # 3 vals
True
>>> it2 == range(3, 6)  # 3 vals
True
>>> it3 == range(6, 8)  # 2 vals
True
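
Because seq can be a numpy array, the chunks come back as array slices. A small sketch (not from the viscid docs) that only checks every element is covered exactly once, since the exact split sizes depend on nchunks:

>>> import numpy as np
>>> chunks = chunk_list(np.arange(10), 4)
>>> int(sum(c.sum() for c in chunks))
45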
viscid.parallel.chunk_slices(nel, nchunks, size=None)[source]

Make contiguous chunks

Get the slice info (which can be unpacked and passed to the slice builtin, as in slice(*ret[i])) for nchunks contiguous chunks of a list with nel elements.

Parameters:
  • nel – how many elements are in one pass of the original list
  • nchunks – how many chunks to make
  • size – if given, set nchunks such that chunks have about ‘size’ elements
Returns:

a list of (start, stop) tuples with length nchunks

Example

>>> sl1, sl2 = chunk_slices(5, 2)
>>> sl1 == (0, 3)  # 3 vals
True
>>> sl2 == (3, 5)  # 2 vals
True
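
Since each returned pair is just (start, stop), it can be star-unpacked into the slice builtin as noted above; reusing sl1 from the example (a sketch, not from the viscid docs):

>>> lst = list(range(5))
>>> lst[slice(*sl1)]
[0, 1, 2]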
viscid.parallel.chunk_interslices(nchunks)[source]

Make staggered chunks

Similar to chunk_slices, but pick every nth element instead of getting a contiguous patch for each chunk

Parameters:
  • nchunks – how many chunks to make
Returns:
  a list of (start, stop, step) tuples with length nchunks

Example

>>> chunk_interslices(2) == [(0, None, 2), (1, None, 2)]
True
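
Each (start, stop, step) tuple can likewise be expanded into the slice builtin; applying the two interslices above to a small list (a sketch, not from the viscid docs):

>>> lst = list(range(6))
>>> [lst[slice(*sl)] for sl in ((0, None, 2), (1, None, 2))]
[[0, 2, 4], [1, 3, 5]]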
viscid.parallel.chunk_sizes(nel, nchunks, size=None)[source]

For chunking up lists, how big is each chunk

Parameters:
  • nel – how many elements are in one pass of the original list
  • nchunks – how many chunks to make
  • size – if given, set nchunks such that chunks have about ‘size’ elements
Returns:

an ndarray of the number of elements in each chunk; these sizes are the same for chunk_list, chunk_slices and chunk_interslices

Example

>>> nel1, nel2 = chunk_sizes(5, 2)
>>> nel1 == 2
True
>>> nel2 == 3
True
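
Whatever the exact split, the chunk sizes should always add back up to nel; a quick sanity check (not from the viscid docs):

>>> int(sum(chunk_sizes(10, 3)))
10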
viscid.parallel.map(nr_procs, func, args_iter, args_kw=None, timeout=100000000.0, daemonic=True, threads=False, pool=None, force_subprocess=False)[source]

Just like multiprocessing.Pool.map?

Same as map_async(), except that it waits for the result to be ready and returns it.

Note

When using threads, this is WAY faster than map_async since map_async uses the builtin python ThreadPool. I have no idea why that’s slower than making threads by hand.
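
Example

A minimal sketch (not from the viscid docs), assuming the blocking call returns the ordered list of per-call results, mirroring multiprocessing.Pool.map; it is written fully qualified here only to avoid confusion with the builtin map:

>>> func = lambda i, letter: (i, letter)
>>> viscid.parallel.map(2, func, enumerate('abc'))
[(0, 'a'), (1, 'b'), (2, 'c')]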

viscid.parallel.map_async(nr_procs, func, args_iter, args_kw=None, daemonic=True, threads=False, pool=None)[source]

Wrap python’s map_async

This has some utility stuff like star passthrough

Run func on nr_procs processes (or threads) with arguments given by args_iter. args_iter should be an iterable of argument lists, one per invocation, which are star-unpacked into func. args_kw is passed to func as keyword arguments.

Returns:

(tuple) (pool, multiprocessing.pool.AsyncResult)

Note

When using threads, this is WAY slower than map since map_async uses the builtin python ThreadPool. I have no idea why that’s slower than making threads by hand.

Note

daemonic can be set to False if one needs to spawn child processes in func, BUT this could be vulnerable to creating an undead army of worker processes; only use this if you really, really need it and know what you’re doing.

Example

>>> import itertools
>>> func = lambda i, letter: print(i, letter)
>>> p, r = map_async(2, func, zip(itertools.count(), 'abc'))
>>> r.get(1e8)
>>> p.join()
>>> # the following is printed from 2 processes
0 a
1 b
2 c
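
If the same keyword arguments should go to every invocation, they can presumably be passed through args_kw; a sketch assuming args_kw is a plain dict applied to every call, where each worker would print lines like '0: a':

>>> func = lambda i, letter, sep=' ': print(i, letter, sep=sep)
>>> p, r = map_async(2, func, zip(itertools.count(), 'abc'), args_kw={'sep': ': '})
>>> r.get(1e8)
>>> p.join()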