Commit c4bdbaf

fix docstrings

1 parent 75cb555 commit c4bdbaf

5 files changed: +70 −56 lines

Project.toml
+1 −1

@@ -1,7 +1,7 @@
 name = "ParallelUtilities"
 uuid = "fad6cfc8-4f83-11e9-06cc-151124046ad0"
 authors = ["Jishnu Bhattacharya <jishnuonline@gmail.com>"]
-version = "0.7.4"
+version = "0.7.5"
 
 [deps]
 DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"

README.md
+25 −9

@@ -41,6 +41,22 @@ julia> pmapsum(x -> ones(2).*myid(), 1:nworkers())
  5.0
 ```
 
+# Performance
+
+The `pmapreduce`-related functions are expected to be more performant than `@distributed` for loops. As an example, running the following on a Slurm cluster using 2 nodes with 28 cores on each leads to
+
+```julia
+julia> @time @distributed (+) for i=1:nworkers()
+           ones(10_000, 1_000)
+       end;
+ 22.355047 seconds (7.05 M allocations: 8.451 GiB, 6.73% gc time)
+
+julia> @time pmapsum(x -> ones(10_000, 1_000), 1:nworkers());
+  2.672838 seconds (52.83 k allocations: 78.295 MiB, 0.53% gc time)
+```
+
+The difference becomes more apparent as larger data needs to be communicated across workers. This is because `ParallelUtilities.pmapreduce*` perform local reductions on each node before communicating across nodes.
+
 # Usage
 
 The package splits up a collection of ranges into subparts of roughly equal length, so that all the cores are approximately equally loaded. This is best understood using an example: let's say that we have a function `f` that is defined as
@@ -82,7 +98,7 @@ The first six processors receive 4 tuples of parameters each and the final four
 
 The package provides versions of `pmap` with an optional reduction. These differ from the one provided by `Distributed` in a few key aspects: firstly, the iterator product of the argument is what is passed to the function and not the arguments by elementwise, so the i-th task will be `Iterators.product(args...)[i]` and not `[x[i] for x in args]`. Specifically the second set of parameters in the example above will be `(2,2,3)` and not `(2,3,4)`.
 
-Secondly, the iterator is passed to the function in batches and not elementwise, and it is left to the function to iterate over the collection. Thirdly, the tasks are passed on to processors sorted by rank, so the first task is passed to the first processor and the last to the last active worker. The tasks are also approximately evenly distributed across processors. The function `pmapbatch_elementwise` is also exported that passes the elements to the function one-by-one as unwrapped tuples. This produces the same result as `pmap` where each worker is assigned batches of approximately equal sizes taken from the iterator product.
+Secondly, the iterator is passed to the function in batches and not elementwise, and it is left to the function to iterate over the collection. Thirdly, the tasks are passed on to processors sorted by rank, so the first task is passed to the first processor and the last to the last active worker. The tasks are also approximately evenly distributed across processors. The exported function `pmapbatch_elementwise` passes the elements to the function one-by-one as splatted tuples. This produces the same result as `pmap` for a single range as the argument.
 
 ### pmapbatch and pmapbatch_elementwise
 
@@ -106,7 +122,7 @@ julia> Tuple(p)
 
 ### pmapsum and pmapreduce
 
-Often a parallel execution is followed by a reduction (eg. a sum over the results). A reduction may be commutative (in which case the order of results do not matter), or non-commutative (in which the order does matter). There are two functions that are exported that carry out these tasks: `pmapreduce_commutative` and `pmapreduce`, where the former does not preserve ordering and the latter does. For convenience, the package also provides the function `pmapsum` that chooses `sum` as the reduction operator. The map-reduce operation is similar in many ways to the distributed `for` loop provided by julia, but the main difference is that the reduction operation is not binary for the functions in this package (eg. we need `sum` and not `(+)`to add the results). There is also the difference as above that the function gets the parameters in batches, with functions having the suffix `_elementwise` taking on parameters individually as unwrapped tuples as above. The function `pmapreduce` does not take on parameters elementwise at this point, although this might be implemented in the future.
+Often a parallel execution is followed by a reduction (eg. a sum over the results). A reduction may be commutative (in which case the order of results do not matter), or non-commutative (in which the order does matter). There are two functions that are exported that carry out these tasks: `pmapreduce_commutative` and `pmapreduce`, where the former does not preserve ordering and the latter does. For convenience, the package also provides the function `pmapsum` that chooses `sum` as the reduction operator. The map-reduce operation is similar in many ways to the distributed `for` loop provided by julia, but the main difference is that the reduction operation is not binary for the functions in this package (eg. we need `sum` and not `(+)`to add the results). There is also the difference as above that the function gets the parameters in batches, with functions having the suffix `_elementwise` taking on parameters individually as splatted `Tuple`s. The function `pmapreduce` does not take on parameters elementwise at this point, although this might be implemented in the future.
 
 As an example, to sum up a list of numbers in parallel we may call
 ```julia
@@ -137,7 +153,7 @@ julia> workers()
  2
  3
 
-# The signature is pmapreduce(fmap,freduce,iterable)
+# The signature is pmapreduce(fmap, freduce, range_or_tuple_of_ranges)
 julia> pmapreduce(x -> ones(2).*myid(), x -> hcat(x...), 1:nworkers())
 2×2 Array{Float64,2}:
  2.0  3.0
@@ -146,7 +162,7 @@ julia> pmapreduce(x -> ones(2).*myid(), x -> hcat(x...), 1:nworkers())
 
 The functions `pmapreduce` produces the same result as `pmapreduce_commutative` if the reduction operator is commutative (ie. the order of results received from the children workers does not matter).
 
-The function `pmapsum` sets the reduction operator to be a sum.
+The function `pmapsum` sets the reduction function to `sum`.
 
 ```julia
 julia> sum(workers())
@@ -162,13 +178,13 @@ julia> pmapsum(x -> ones(2).*myid(), 1:nworkers())
 It is possible to specify the return types of the map and reduce operations in these functions. To specify the return types use the following variants:
 
 ```julia
-# Signature is pmapreduce(fmap, Tmap, freduce, Treduce, iterators)
+# Signature is pmapreduce(fmap, Tmap, freduce, Treduce, range_or_tuple_of_ranges)
 julia> pmapreduce(x -> ones(2).*myid(), Vector{Float64}, x -> hcat(x...), Matrix{Float64}, 1:nworkers())
 2×2 Array{Float64,2}:
  2.0  3.0
  2.0  3.0
 
-# Signature is pmapsum(fmap, Tmap, iterators)
+# Signature is pmapsum(fmap, Tmap, range_or_tuple_of_ranges)
 julia> pmapsum(x -> ones(2).*myid(), Vector{Float64}, 1:nworkers())
 2-element Array{Float64,1}:
  5.0
@@ -244,7 +260,7 @@ julia> collect(ps)
 where the object loops over values of `(x,y,z)`, and the values are sorted in reverse lexicographic order (the last index increases the slowest while the first index increases the fastest). The ranges roll over as expected. The tasks are evenly distributed with the remainders being split among the first few processors. In this example the first six processors receive 4 tasks each and the last four receive 3 each. We can see this by evaluating the length of the `ProductSplit` operator on each processor
 
 ```julia
-julia> Tuple(length(ProductSplit((xrange,yrange,zrange),10,i)) for i=1:10)
+julia> Tuple(length(ProductSplit((xrange,yrange,zrange), 10, i)) for i=1:10)
 (4, 4, 4, 4, 4, 4, 3, 3, 3, 3)
 ```
 
@@ -268,11 +284,11 @@ julia> xrange_long,yrange_long,zrange_long = 1:3000,1:3000,1:3000
 
 julia> params_long = (xrange_long,yrange_long,zrange_long);
 
-julia> ps_long = ProductSplit(params_long,10,4)
+julia> ps_long = ProductSplit(params_long, 10, 4)
 ProductSplit{Tuple{Int64,Int64,Int64},3,UnitRange{Int64}}((1:3000, 1:3000, 1:3000), (0, 3000, 9000000), 10, 4, 8100000001, 10800000000)
 
 # Evaluate length using random ranges to avoid compiler optimizations
-julia> @btime length(p) setup=(n=rand(3000:4000);p=ProductSplit((1:n,1:n,1:n),200,2));
+julia> @btime length(p) setup = (n = rand(3000:4000); p = ProductSplit((1:n,1:n,1:n), 200, 2));
   2.674 ns (0 allocations: 0 bytes)
 
 julia> @btime $ps_long[1000000] # also fast, does not iterate
src/mapreduce.jl
+13 −16

@@ -305,10 +305,9 @@ or iterate over one to access individual tuples of integers.
 
 The reduction function `freduce` is expected to accept a collection of mapped values.
 Note that this is different from the standard `mapreduce` operation in julia that
-expects a binary reduction operator. For example, `fmap` should be
-`sum` and not `+`. In case a binary operator `op` is to be passed, one may wrap it in
-an anonymous function as `x->reduce(op,x)`, or as `x->op(x...)` in case the operator
-accepts multiple arguments that are processed in pairs.
+expects a binary reduction operator. For example, `freduce` should be
+`sum` and not `+`. In case a binary operator `op` is to be used in the reduction, one may pass it
+as `Base.splat(op)` or wrap it in an anonymous function as `x -> op(x...)`.
 
 Arguments `mapargs` and keyword arguments `mapkwargs` — if provided — are
 passed on to the mapping function `fmap`.
@@ -368,10 +367,9 @@ results obtained may be incorrect otherwise.
 
 The reduction function `freduce` is expected to accept a collection of mapped values.
 Note that this is different from the standard `mapreduce` operation in julia that
-expects a binary reduction operator. For example, `fmap` should be
-`sum` and not `+`. In case a binary operator `op` is to be passed, one may wrap it in
-an anonymous function as `x->reduce(op,x)`, or as `x->op(x...)` in case the operator
-accepts multiple arguments that are processed in pairs.
+expects a binary reduction operator. For example, `freduce` should be
+`sum` and not `+`. In case a binary operator `op` is to be used in the reduction, one may pass it
+as `Base.splat(op)` or wrap it in an anonymous function as `x -> op(x...)`.
 
 Arguments `mapargs` and keyword arguments `mapkwargs` — if provided — are
 passed on to the mapping function `fmap`.
@@ -513,9 +511,8 @@ or iterate over one to access individual tuples of integers.
 The reduction function `freduce` is expected to accept a collection of mapped values.
 Note that this is different from the standard `mapreduce` operation in julia that
 expects a binary reduction operator. For example, `fmap` should be
-`sum` and not `+`. In case a binary operator `op` is to be passed, one may wrap it in
-an anonymous function as `x->reduce(op,x)`, or as `x->op(x...)` in case the operator
-accepts multiple arguments that are processed in pairs.
+`sum` and not `+`. In case a binary operator `op` is to be used in the reduction, one may pass it
+as `Base.splat(op)` or wrap it in an anonymous function as `x -> op(x...)`.
 
 Arguments `mapargs` and keyword arguments `mapkwargs` — if provided — are
 passed on to the mapping function `fmap`.
@@ -573,10 +570,9 @@ part of the entire parameter space sequentially. The argument
 `iterators` needs to be a strictly-increasing range,
 or a tuple of such ranges. The outer product of these ranges forms the
 entire range of parameters that is processed in batches on
-the workers.
-
-Arguments `mapargs` and keyword arguments `mapkwargs` — if provided — are
+the workers. Arguments `mapargs` and keyword arguments `mapkwargs` — if provided — are
 passed on to the function `f`.
+
 Additionally, the number of workers to be used may be specified using the
 keyword argument `num_workers`. In this case the first `num_workers` available
 workers are used in the evaluation.
@@ -597,7 +593,7 @@ function pmapbatch(f::Function, iterators::Tuple, args...;
 end
 
 function pmapbatch(f::Function, ::Type{T}, iterators::Tuple, args...;
-    num_workers = nworkersactive(iterators),kwargs...) where {T}
+    num_workers = nworkersactive(iterators), kwargs...) where {T}
 
     procs_used = workersactive(iterators)
     if num_workers < length(procs_used)
@@ -634,7 +630,8 @@ part of the entire parameter space sequentially. The argument
 `iterators` needs to be a strictly-increasing range of intergers,
 or a tuple of such ranges. The outer product of these ranges forms the
 entire range of parameters that is processed elementwise by the function `f`.
-Given `n` ranges in `iterators`, the function `f` will receive `n` integers
+The individual tuples are splatted and passed as arguments to `f`.
+Given `n` ranges in `iterators`, the function `f` will receive `n` values
 at a time.
 
 Arguments `mapargs` and keyword arguments `mapkwargs` — if provided — are
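The `freduce` contract these docstrings describe (a reducer that consumes the whole collection, with `Base.splat(op)` adapting a binary operator) has a direct analogue in other languages. Here is a hypothetical Python sketch, not code from the package; it assumes the operator is associative, in which case folding pairwise matches a varargs call:

```python
from functools import reduce

# A collection-style reducer (like Julia's `sum`) versus a binary operator
# (like `+`). `splat` below plays the role the docstring assigns to
# `Base.splat(op)` / `x -> op(x...)`.

def splat(op):
    # adapt a binary operator into a reducer over a whole collection
    return lambda values: reduce(op, values)

mapped = [1, 2, 3, 4]            # stand-in for the mapped values
collection_reducer = sum         # accepts the whole collection at once
binary_op = lambda a, b: a + b   # would not accept the collection directly

print(collection_reducer(mapped))  # 10
print(splat(binary_op)(mapped))    # 10
```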

src/productsplit.jl
+29 −28

@@ -1,3 +1,8 @@
+"""
+    ParallelUtilities.AbstractConstrainedProduct{T,N}
+
+Supertype of [`ParallelUtilities.ProductSplit`](@ref) and [`ParallelUtilities.ProductSection`](@ref).
+"""
 abstract type AbstractConstrainedProduct{T,N} end
 Base.eltype(::AbstractConstrainedProduct{T}) where {T} = T
 Base.ndims(::AbstractConstrainedProduct{<:Any,N}) where {N} = N
@@ -132,7 +137,7 @@ ProductSplit(::Tuple{},::Integer,::Integer) = throw(ArgumentError("Need at least
 """
     ProductSection(iterators::Tuple{Vararg{AbstractRange}}, inds::AbstractUnitRange)
 
-Construct a `ProductSection` iterator that represents a view of the outer product
+Construct a `ProductSection` iterator that represents a 1D view of the outer product
 of the ranges provided in `iterators`, with the range of indices in the view being
 specified by `inds`.
 
@@ -147,7 +152,7 @@ julia> collect(p)
 (1, 6)
 (2, 6)
 
-julia> collect(p) == collect(Iterators.product(1:3,4:6))[5:8]
+julia> collect(p) == collect(Iterators.product(1:3, 4:6))[5:8]
 true
 ```
 """
@@ -206,7 +211,7 @@ outer product of the iterators.
 
 # Examples
 ```jldoctest
-julia> ps = ProductSplit((1:5,2:4,1:3),7,1);
+julia> ps = ProductSplit((1:5, 2:4, 1:3), 7, 1);
 
 julia> ParallelUtilities.childindex(ps, 6)
 (1, 2, 1)
@@ -241,9 +246,9 @@ given an index of a `AbstractConstrainedProduct`.
 
 # Examples
 ```jldoctest
-julia> ps = ProductSplit((1:5,2:4,1:3), 7, 3);
+julia> ps = ProductSplit((1:5, 2:4, 1:3), 7, 3);
 
-julia> cinds = ParallelUtilities.childindexshifted(ps,3)
+julia> cinds = ParallelUtilities.childindexshifted(ps, 3)
 (2, 1, 2)
 
 julia> getindex.(ps.iterators, cinds) == ps[3]
@@ -341,12 +346,11 @@ end
     ParallelUtilities.nelements(ps::AbstractConstrainedProduct; dim::Integer)
     ParallelUtilities.nelements(ps::AbstractConstrainedProduct, dim::Integer)
 
-Compute the number of unique values in the element number `dim` of the tuples
-that are returned when `ps` is iterated over.
+Compute the number of unique values in the section of the `dim`-th range contained in `ps`.
 
 # Examples
 ```jldoctest
-julia> ps = ProductSplit((1:5,2:4,1:3),7,3);
+julia> ps = ProductSplit((1:5, 2:4, 1:3), 7, 3);
 
 julia> collect(ps)
 7-element Array{Tuple{Int64,Int64,Int64},1}:
@@ -358,14 +362,14 @@ julia> collect(ps)
  (5, 2, 2)
  (1, 3, 2)
 
-julia> ParallelUtilities.nelements(ps,3)
-2
+julia> ParallelUtilities.nelements(ps, 1)
+5
 
-julia> ParallelUtilities.nelements(ps,2)
+julia> ParallelUtilities.nelements(ps, 2)
 3
 
-julia> ParallelUtilities.nelements(ps,1)
-5
+julia> ParallelUtilities.nelements(ps, 3)
+2
 ```
 """
 nelements(ps::AbstractConstrainedProduct; dim::Integer) = nelements(ps,dim)
@@ -402,8 +406,7 @@ end
     maximum(ps::AbstractConstrainedProduct; dim::Integer)
     maximum(ps::AbstractConstrainedProduct, dim::Integer)
 
-Compute the maximum value of the range number `dim` that is
-contained in `ps`.
+Compute the maximum value of the section of the `dim`-th range contained in `ps`.
 
 # Examples
 ```jldoctest
@@ -457,12 +460,11 @@ end
     minimum(ps::AbstractConstrainedProduct; dim::Integer)
     minimum(ps::AbstractConstrainedProduct, dim::Integer)
 
-Compute the minimum value of the range number `dim` that is
-contained in `ps`.
+Compute the minimum value of the section of the `dim`-th range contained in `ps`.
 
 # Examples
 ```jldoctest
-julia> ps = ProductSplit((1:2,4:5), 2, 1);
+julia> ps = ProductSplit((1:2, 4:5), 2, 1);
 
 julia> collect(ps)
 2-element Array{Tuple{Int64,Int64},1}:
@@ -512,12 +514,11 @@ end
     extrema(ps::AbstractConstrainedProduct; dim::Integer)
     extrema(ps::AbstractConstrainedProduct, dim::Integer)
 
-Compute the minimum and maximum of the range number `dim` that is
-contained in `ps`.
+Compute the `extrema` of the section of the `dim`-th range contained in `ps`.
 
 # Examples
 ```jldoctest
-julia> ps = ProductSplit((1:2,4:5), 2, 1);
+julia> ps = ProductSplit((1:2, 4:5), 2, 1);
 
 julia> collect(ps)
 2-element Array{Tuple{Int64,Int64},1}:
@@ -574,11 +575,11 @@ end
 """
     extremadims(ps::AbstractConstrainedProduct)
 
-Compute the extrema of all the ranges contained in `ps`.
+Compute the extrema of the sections of all the ranges contained in `ps`.
 
 # Examples
 ```jldoctest
-julia> ps = ProductSplit((1:2,4:5), 2, 1);
+julia> ps = ProductSplit((1:2, 4:5), 2, 1);
 
 julia> collect(ps)
 2-element Array{Tuple{Int64,Int64},1}:
@@ -601,9 +602,9 @@ _extremadims(::AbstractConstrainedProduct, ::Integer, ::Tuple{}) = ()
 
 Return the reverse-lexicographic extrema of values taken from
 ranges contained in `ps`, where the pairs of ranges are constructed
-by concatenating each dimension with the last one.
+by concatenating the ranges along each dimension with the last one.
 
-For two ranges this simply returns ([first(ps)],[last(ps)]).
+For two ranges this simply returns `([first(ps)], [last(ps)])`.
 
 # Examples
 ```jldoctest
@@ -791,7 +792,7 @@ the ranges in `iterators`.
 
 # Examples
 ```jldoctest
-julia> iters = (1:10,4:6,1:4);
+julia> iters = (1:10, 4:6, 1:4);
 
 julia> ps = ProductSplit(iters, 5, 2);
 
@@ -833,7 +834,7 @@ is not found.
 
 # Examples
 ```jldoctest
-julia> ps = ProductSplit((1:3,4:5:20), 3, 2);
+julia> ps = ProductSplit((1:3, 4:5:20), 3, 2);
 
 julia> collect(ps)
 4-element Array{Tuple{Int64,Int64},1}:
@@ -902,7 +903,7 @@ resulting `ProductSection` will be the same as in `ps`.
 
 # Examples
 ```jldoctest
-julia> ps = ProductSplit((1:5,2:4,1:3),7,3);
+julia> ps = ProductSplit((1:5, 2:4, 1:3), 7, 3);
 
 julia> collect(ps)
 7-element Array{Tuple{Int64,Int64,Int64},1}:
src/utils.jl
+2 −2

@@ -24,10 +24,10 @@ workers are chosen.
     gethostnames(procs = workers())
 
 Return the hostname of each worker in `procs`. This is obtained by evaluating
-`Libc.gethostname()` on each worker.
+`Libc.gethostname()` on each worker asynchronously.
 """
 function gethostnames(procs = workers())
-    hostnames = Vector{String}(undef,length(procs))
+    hostnames = Vector{String}(undef, length(procs))
     @sync for (ind,p) in enumerate(procs)
         @async hostnames[ind] = @fetchfrom p Libc.gethostname()
     end
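The `@sync`/`@async`/`@fetchfrom` pattern in `gethostnames` (launch one asynchronous query per worker, fill a preallocated slot so output order matches worker order) can be sketched in plain Python. The worker ids and `fake_hostname` below are made-up stand-ins, not the package's Distributed-based code:

```python
from concurrent.futures import ThreadPoolExecutor

def gather(workers, query):
    """Query every worker concurrently, preserving worker order in the result."""
    results = [None] * len(workers)
    with ThreadPoolExecutor() as pool:          # plays the role of @sync
        futures = {pool.submit(query, w): i     # one task per worker, like @async
                   for i, w in enumerate(workers)}
        for fut, i in futures.items():
            results[i] = fut.result()           # block until the value arrives
    return results

fake_hostname = lambda w: f"node{w}"            # stand-in for remote Libc.gethostname()
print(gather([2, 3, 4], fake_hostname))  # ['node2', 'node3', 'node4']
```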