Texture synthesis is an image-generation technique commonly used to produce high-quality textures when rendering synthetic graphics. In this project we used multithreading with OpenMP in C++ to parallelize an algorithm called non-parametric sampling for texture synthesis.
Define texture as some visual pattern of an infinite 2D plane. Texture synthesis is the process of taking a finite fixed shape sample from a texture to generate other samples of a different dimension from the given texture. Potential graphic applications of texture synthesis include image de-noising, occlusion fill-in, image compression, etc. Below are some examples of inputs and respective sample outputs of texture synthesis. The first row represents the input (a fixed n by n sample), the bottom rows are the output generated by the program.
Our parallel approach is based on the sequential algorithm proposed in the paper Texture Synthesis by Non-parametric Sampling by Alexei A. Efros and Thomas K. Leung of UC Berkeley.
We tried to mainly parallelize the function synthesize
with OpenMP, which supports multi-threaded, shared address space parallelism.
We first did benchmark experiments with the original serial implementation and timed different parts of the program. We observed that the performance bottleneck is computing the squared distance between the result window and all windows in the sample image.
Our key observation is that there’s no data dependency between the values between each sample window - computing the distance of the next [sample_window, result_window] pair does not depend on previously-generated values. Thus, we parallelize the for-loop to iterate through all sample windows.
To optimize the program performance, we applied the following techniques:
First, clone the repository:
git https://github.com/rosieswj/ParallelTextureSynthesis.git
Open local repository. Run make
under folder texturesyn. This will generate an executable texturesyn
capatible with the following commands:
./texturesyn -s SFILE -o OFILE -w WINDOW -r RADIUS [-t THD] [-I]
-s SFILE Sample file path
-o OFILE Output file path (no file extension required)
-w WINDOW Window size
-r RADIUS Output radius
-t THD Number of threads
-I Instrument simulation activities
-h Print this message
For example,
./texturesyn -s src/sample1.ttr -o out/sample1 -w 10 -r 100 -I -t 8
will run the program with 8 threads and generate an output image sample1_10_100.ppm
in ./out/ folder.
Run make clean
will clean up the executables as well as output images in ./out/.
Run make
under folder texturesynOriginSerial will generate an executable for the original serial implementaion adopted from here.
A detailed description of our benchmark experiments can be found in our final report. There are 3 versions of code we used to generate measurement in benchmark experiments.
Version | Description | Command |
---|---|---|
Seq_original | Original serial implementation | Run executable ./texturesynOrigin |
Seq_improved | Our improved serial implementation | Run executable ./texturesyn with t=1 |
Parallel | Parallel implementation with OpenMP | Run executable ./texturesyn with t=T, T = {2,4,8} |
Compared with the original sequential implementation, we were able to achieve an average of 27x speedup across different samples when running the parallel program with 8 threads.
Compared with our improved sequential implementation, we were able to achieve an average of 6.78x speedup across different samples when running the parallel program with 8 threads.