Parallelizing Julia set
In this section I will describe one of the two projects you can work on this afternoon. The project is the mathematical problem to compute Julia set – no relation to Julia language! A Julia set is defined as a set of points on the complex plane that remain bound under infinite recursive transformation $f(z)$. We will use the traditional form $f(z)=z^2+c$, where $c$ is a complex constant. Here is our algorithm:
- pick a point $z_0\in\mathbb{C}$
- compute iterations $z_{i+1}=z_i^2+c$ until $|z_i|>4$ (arbitrary fixed radius; here $c$ is a complex constant)
- store the iteration number $\xi(z_0)$ at which $z_i$ reaches the circle $|z|=4$
- limit max iterations at 255
4.1 if $\xi(z_0)=255$, then $z_0$ is a stable point
4.2 the quicker a point diverges, the lower its $\xi(z_0)$ is - plot $\xi(z_0)$ for all $z_0$ in a rectangular region $-1<=\mathfrak{Re}(z_0)<=1$, $-1<=\mathfrak{Im}(z_0)<=1$
We should get something conceptually similar to this figure (here $c = 0.355 + 0.355i$; we’ll get drastically different fractals for different values of $c$):
Below is the serial code juliaSetSerial.jl
. First, you need to install the required packages:
] add ProgressMeter
] add NetCDF
Next, let’s study the code:
using ProgressMeter, NetCDF
function pixel(i, j, width, height, c, zoomOut)
z = (2*(j-0.5)/width-1)+(2*(i-0.5)/height-1)im # rescale to -1:1 in the complex plane
z *= zoomOut
for i = 1:255
z = z^2 + c
if abs(z) >= 4
return i
end
end
return 255
end
n = Int(8e3) # plot size
height, width = n, n
c, zoomOut = 0.355 + 0.355im, 1.2
println("Computing Julia set ...")
data = zeros(Float32, height, width); # local array
@showprogress for i in 1:height, j in 1:width
data[i,j] = pixel(i, j, width, height, c, zoomOut)
end
# println("Very slow plotting ...")
# using Plots
# gr() # initialize the gr backend
# file_name = "$(height)x$(width)_$(c.re)_$(c.im)"
# png(heatmap(data, size=(width,height), color=:gist_ncar), file_name) # save to PNG
# println("On-screen b/w plotting ... slow too")
# using ImageView
# imshow(data)
println("Writing NetCDF ...")
filename = "test.nc"
isfile(filename) && rm(filename) # compact if statement
nccreate(filename, "stability", "x", collect(1:height), "y",
collect(1:width), t=NC_FLOAT, mode=NC_NETCDF4, compress=9);
ncwrite(data, filename, "stability");
Let’s run this code. It’ll produce the file test.nc
that you can download to your computer and visualize with ParaView
or other visualization tool.
Exercise 11
- Compare the expected and actual file sizes.
- Try other parameter values:
c, zoomOut = 0.355 + 0.355im, 1.2 # the default one: spirals c, zoomOut = 1.2exp(1.1π*im), 1 # original textbook example c, zoomOut = -0.4 - 0.59im, 1.5 # denser spirals c, zoomOut = 1.34 - 0.45im, 1.8 # beans c, zoomOut = 0.34 -0.05im, 1.2 # connected spiral boots
- You can also try increasing problem sizes up from $8000^2$. Will you have enough physical memory for $16000^2$? How does this affect the runtime?
Parallelizing
How would we parallelize this problem? We have a large array, so we can use DistributedArrays and compute it in parallel. Here are the steps:
- Some functions (packages) should be defined (loaded) on all processes.
data
array should be distributed.- You need
fillLocalBlock(data, width, height, c, zoomOut)
to compute local pieces ofdata
in parallel. If you don’t know where to start in this project, begin with looking at the complete example withfillLocalBlock()
from the previous section. - You can replace
@showprogress for i in 1:height, j in 1:width
data[i,j] = pixel(i, j, width, height, c, zoomOut)
with
data = dzeros(Float32, height, width); # distributed 2D array of 0's
@time @sync for i in workers()
@spawnat i fillLocalBlock(data, width, height, c, zoomOut)
- Why do we need
@sync
in the previousfor
block? - To the best of my knowledge, NetCDF’s
ncwrite()
is serial in Julia. Is there a parallel version of NetCDF for Julia? If not, then unfortunately we will have to use serial NetCDF. How do we do this with distributeddata
? - Time only the
for
loop in computing the array. Is your parallel code faster?
Results
Finally, here are my timings on Cassiopeia:
Code | Time |
---|---|
julia juliaSetSerial.jl (serial runtime) |
43.1s 41.4s 41.4s |
julia -p 1 juliaSetDistributedArrays.jl (on 1 worker) |
29.6s 29.6s 29.2s |
julia -p 2 juliaSetDistributedArrays.jl (on 2 workers) |
15.4s 15.6s 15.2s |