This project explores various papers and methods that use frequencies and filters to process images. First, we explore the use of filters to find edges. Then, we blur and sharpen images using different frequency manipulations, and use those frequencies to generate hybrid images following the approach described in the SIGGRAPH 2006 paper by Oliva, Torralba, and Schyns. Finally, we explore multiresolution blending to create new images using Gaussian and Laplacian stacks, following the approach described in the 1983 paper by Burt and Adelson.
Convolving our cameraman image with simple finite difference kernels in the x and y directions helps visualize the partial derivatives, which appear as edges. The gradient magnitude image can then be generated by combining the two partial derivative images: we treat the directional derivatives at each pixel as a single vector and take its 2-norm, like so: np.sqrt(dx ** 2 + dy ** 2). The final binarized gradient image was created by setting all pixels above a tuned threshold to 1 and the rest to 0, further clarifying the edges.
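The steps above can be sketched as follows. This is a minimal illustration assuming SciPy is available; the kernel orientation and the threshold value here are illustrative, not the exact values tuned in the project.

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_edges(img, threshold=0.25):
    """Finite-difference edge detection: convolve with D_x and D_y,
    take the gradient magnitude, then binarize against a threshold."""
    dx_kernel = np.array([[1.0, -1.0]])    # finite difference in x
    dy_kernel = np.array([[1.0], [-1.0]])  # finite difference in y
    dx = convolve2d(img, dx_kernel, mode="same")
    dy = convolve2d(img, dy_kernel, mode="same")
    magnitude = np.sqrt(dx ** 2 + dy ** 2)  # 2-norm of the gradient vector
    return (magnitude > threshold).astype(np.float64)
```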
To obtain a less jagged edge definition, I first applied a Gaussian filter to blur some of the image's features, and then reapplied the previous steps. After blurring, the resulting edge image is much smoother and produces cleaner edges. Compared to the previous unblurred result, the edges also appear thicker, likely because the filter spreads each edge over more pixels while suppressing the noise observed before.
To demonstrate the associativity of convolution, we first convolve the partial derivative kernels with the Gaussian filter, creating derivative of Gaussian (DoG) filters, before applying them to the image. The resulting edge image verifies that the two orderings are equivalent, producing essentially the same edge quality.
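The equivalence can be checked numerically. A minimal sketch, assuming SciPy; the Gaussian kernel here is built by hand with illustrative size and sigma:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize=9, sigma=1.5):
    """2D Gaussian built as the outer product of a normalized 1D Gaussian."""
    ax = np.arange(ksize) - ksize // 2
    g = np.exp(-ax ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def dog_comparison(img):
    """Blur-then-differentiate vs. a single derivative-of-Gaussian pass."""
    G = gaussian_kernel()
    dx = np.array([[1.0, -1.0]])
    dog = convolve2d(G, dx)  # precombined DoG filter: (img * G) * dx == img * (G * dx)
    two_pass = convolve2d(convolve2d(img, G, mode="same"), dx, mode="same")
    one_pass = convolve2d(img, dog, mode="same")
    return two_pass, one_pass
```

Away from the image borders (where padding differs), the two results agree to floating-point precision.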
Often, a blurry image can be sharpened by strengthening the high frequency features or details in the original image. To obtain the high frequencies, we subtract the low frequencies, found by convolving the image with a Gaussian filter, from the original image. However, overly emphasizing the high frequencies can be destructive and make the image look noisy, so the amount of strengthening needs to be tuned using an alpha value.
In the following example, we can observe that the image does indeed gradually get sharper, as higher frequency features such as the cat's fur become more pronounced. However, when alpha reaches 16, the image appears almost noisy due to the over-amplification of the high frequency details.
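This unsharp-masking step can be sketched as below. It assumes SciPy's `gaussian_filter` for the blur (the project's exact kernel and sigma may differ), and clips to the [0, 1] intensity range:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(img, alpha=1.0, sigma=2.0):
    """Unsharp masking: the high frequencies (image minus its Gaussian
    blur) are scaled by alpha and added back to the original."""
    low = gaussian_filter(img, sigma)   # low-frequency component
    high = img - low                    # high-frequency detail
    return np.clip(img + alpha * high, 0.0, 1.0)
```

With alpha = 0 the image is unchanged; larger alpha values progressively exaggerate fine detail.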
To evaluate the effectiveness of this method, we attempt to resharpen an originally sharp image that was blurred using a Gaussian filter. In the original image, we can clearly see the crumbs and cracks in the cookie, and post-filtering, the cookie looks blurry as expected. With an alpha value of 2, our sharpening method recovers an image similar to, though not quite as crisp as, the original. Further increases in the alpha value only yield crummier images.
Following the approach described in the SIGGRAPH 2006 paper, hybrid images can be generated by combining the high frequencies of one image with the low frequencies of another. This results in the higher frequencies being more visible at closer distances, while the lower frequencies dominate at farther viewing distances. In this section, I also experiment with the effects of incorporating the color elements from each image. Although overall color blending proved difficult, incorporating high frequency color did help focus attention on the high frequency features rather than the low frequency backgrounds. If the base colors of both images were similar, then adding the low frequency color enhanced the overall result, making each image more pronounced at its respective viewing distance.
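The core construction can be sketched as follows, again assuming SciPy's `gaussian_filter`; the two cutoff sigmas are illustrative placeholders for the per-pair tuned values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_far, im_near, sigma_low=6.0, sigma_high=3.0):
    """Combine the low frequencies of im_far (seen from a distance)
    with the high frequencies of im_near (seen up close)."""
    low = gaussian_filter(im_far, sigma_low)            # low-pass
    high = im_near - gaussian_filter(im_near, sigma_high)  # high-pass
    return np.clip(low + high, 0.0, 1.0)
```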
Taking a popular scene from Spider-Man, I tried overlaying the two scenes where Peter is discovering his newly acquired vision. Due to the blue wall in the background, it ended up being better to remove the color entirely, since the blue tint on the hand confused the subject of the image.
My favorite result, although not the most successful, was the hybrid image generated between a caterpillar and its adult butterfly form. I thought the concept would be cool to implement, and I couldn't really decide whether adding low frequency or high frequency color benefited the result more, so I display all the variations. I was relatively satisfied with how the grayscale images turned out, though.
The following Fourier transform graphs were produced using the combination of the low frequency butterfly image and the high frequency caterpillar image.
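Spectra like these are conventionally visualized as the log magnitude of the 2D FFT with the DC term shifted to the center; a minimal sketch (the small epsilon avoids log of zero and is an implementation detail, not necessarily what the project used):

```python
import numpy as np

def log_spectrum(img):
    """Log-magnitude Fourier spectrum, DC component shifted to the center."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(img))) + 1e-8)
```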
Next, we explore the implementation of Gaussian and Laplacian stacks to ultimately blend images together like the oraple example described in the 1983 paper. To create the Gaussian stack, we apply a Gaussian filter to the previous layer at each level of the stack, producing a cumulative blurring effect as the stack grows deeper. The Laplacian stack is then calculated by taking the difference between adjacent layers of the Gaussian stack, effectively separating out each frequency band and its corresponding features in the image. Here, we recreate the outcomes of the 1983 paper to visualize the different frequency feature levels, and reproduce the oraple.
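The two stacks can be sketched as below (assuming SciPy; the number of levels and sigma are illustrative). Unlike a pyramid, a stack keeps every layer at full resolution, and the Laplacian stack's layers sum back to the original image because the differences telescope:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels=5, sigma=2.0):
    """Each layer re-blurs the previous one; no downsampling, so all
    layers keep the original resolution."""
    stack = [img]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(gstack):
    """Differences of adjacent Gaussian layers isolate frequency bands;
    the final Gaussian layer is appended so the stack sums to the image."""
    lstack = [gstack[i] - gstack[i + 1] for i in range(len(gstack) - 1)]
    lstack.append(gstack[-1])
    return lstack
```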
To implement multiresolution blending, we combine the Laplacian stacks of the two images using the Gaussian stack of a chosen mask. The Gaussian stack blurs the edge within the mask, which helps blend the frequencies of each image. At each layer, the blend was computed as: blend = left_lstack * mask_gstack + right_lstack * (1 - mask_gstack). Thus, the mask blurs the images' features at the edge boundaries and allows for a clean blend between the images. In this section, I also explore the effect of grayscale versus color, and ultimately find the incorporation of color more appealing.
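Putting the stacks and the per-layer blend together, the whole pipeline can be sketched as follows (a minimal version assuming SciPy; levels and sigma are illustrative, and the helper names are this sketch's own):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels=5, sigma=2.0):
    stack = [img]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(gstack):
    lstack = [gstack[i] - gstack[i + 1] for i in range(len(gstack) - 1)]
    lstack.append(gstack[-1])  # keep the lowest frequencies
    return lstack

def multires_blend(left, right, mask, levels=5, sigma=2.0):
    """Blend each frequency band with a progressively blurrier mask,
    then collapse the blended stack by summing its layers."""
    left_l = laplacian_stack(gaussian_stack(left, levels, sigma))
    right_l = laplacian_stack(gaussian_stack(right, levels, sigma))
    mask_g = gaussian_stack(mask, levels, sigma)
    blended = [m * a + (1 - m) * b
               for a, b, m in zip(left_l, right_l, mask_g)]
    return np.clip(sum(blended), 0.0, 1.0)
```

A vertical half-and-half mask reproduces the oraple-style seam: far from the boundary each source image survives untouched, while the band-wise blending smooths the transition.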
For my favorite result, the w(ashington)ampanile, the following visualizations of the Laplacian stacks were produced.