Application

The purpose of this app is to take two images and, based on them, create a new image with some AI artwork.

The first image will be the base image, while the second image will be the one that provides the style we want to apply.

Finally, we will have our third image, the generated one, which we initialize with random color values. This image will change as we minimize the content and style loss functions.

We import all the libraries that we are going to use.

import numpy as np
import pandas as pd
from PIL import Image
from keras import backend as K
from keras.preprocessing.image import load_img, img_to_array
from keras.applications import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.layers import Input
from scipy.optimize import fmin_l_bfgs_b
import time

Now we need to define the paths of the images, and then we are going to initialize gIm0 as float64 for optimization purposes later on.

We also keep baseImArr and filterImArr as float32 to avoid GPU memory errors.

# Image paths

basePath = '/home/amiko/Desktop/ImagesKeras/base.jpg'   #path base image
filterPath = '/home/amiko/Desktop/ImagesKeras/filter.jpg'  #path filter image
resultPath = '/home/amiko/Desktop/ImagesKeras/final.jpg' #path result image

# Image processing
targetHeight = 512
targetWidth = 512
targetSize = (targetHeight, targetWidth)

baseImageOrig = Image.open(basePath)
baseImageSizeOrig = baseImageOrig.size
baseImage = load_img(path=basePath, target_size=targetSize)
baseImArr = img_to_array(baseImage)
baseImArr = K.variable(preprocess_input(np.expand_dims(baseImArr, axis=0)), dtype='float32')

filterImage = load_img(path=filterPath, target_size=targetSize)
filterImArr = img_to_array(filterImage)
filterImArr = K.variable(preprocess_input(np.expand_dims(filterImArr, axis=0)), dtype='float32')

gIm0 = np.random.randint(256, size=(targetHeight, targetWidth, 3)).astype('float64')
gIm0 = preprocess_input(np.expand_dims(gIm0, axis=0))

gImPlaceholder = K.placeholder(shape=(1, targetHeight, targetWidth, 3))
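
The functions below assume that a VGG16 network has already been wired to each of these tensors, and that a TensorFlow session (tf_session) is available. That setup is not shown above, so here is a minimal sketch of it, assuming the usual three-model pattern with shared ImageNet weights (the names baseModel and filterModel are our own):

# Assumed setup (not shown above): one VGG16 instance per tensor, sharing
# ImageNet weights, plus the TensorFlow session used by get_feature_reps.
tf_session = K.get_session()
baseModel = VGG16(include_top=False, weights='imagenet', input_tensor=baseImArr)
filterModel = VGG16(include_top=False, weights='imagenet', input_tensor=filterImArr)
gModel = VGG16(include_top=False, weights='imagenet', input_tensor=gImPlaceholder)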

Content Loss

This is in charge of making sure that the generated image x retains some of the global characteristics of the content image, p.

Let's imagine we want to make sure that the new image looks like our base image. This means that we want to preserve some of its characteristics so it remains recognizable. To achieve this, the content loss function is defined as the mean squared error between the feature representations of p and x, respectively, at a given layer l.

def get_feature_reps(x, layer_names, model):
    # Build one feature matrix per selected layer, of shape (N_l, M_l), where
    # N_l is the number of filters and M_l the number of spatial positions.
    # (x is unused here; the model is already wired to its input tensor.)
    featMatrices = []
    for ln in layer_names:
        selectedLayer = model.get_layer(ln)
        featRaw = selectedLayer.output
        featRawShape = K.shape(featRaw).eval(session=tf_session)
        N_l = featRawShape[-1]
        M_l = featRawShape[1]*featRawShape[2]
        featMatrix = K.reshape(featRaw, (M_l, N_l))
        featMatrix = K.transpose(featMatrix)
        featMatrices.append(featMatrix)
    return featMatrices

def get_content_loss(F, P):
    cLoss = 0.5*K.sum(K.square(F - P))
    return cLoss

Style Loss

The style loss helps us preserve the stylistic characteristics of the style image. Rather than using the difference between feature representations, in this case we use a Gram matrix from selected layers.

The Gram matrix is a square matrix that contains the dot products between each pair of vectorized filters in layer l. The Gram matrix can therefore be thought of as a non-normalized correlation matrix for the filters in layer l.

def get_Gram_matrix(F):
    G = K.dot(F, K.transpose(F))
    return G

Ascending layers in most convolutional networks such as VGG have increasingly larger receptive fields. As this receptive field grows, more large-scale characteristics of the input image are preserved. Because of this, multiple layers should be selected for “style” to incorporate both local and global stylistic qualities. To create a smooth blending between these different layers, we can assign a weight w to each layer.

def get_style_loss(ws, Gs, As):
    sLoss = K.variable(0.)
    for w, G, A in zip(ws, Gs, As):
        M_l = K.int_shape(G)[1]
        N_l = K.int_shape(G)[0]
        G_gram = get_Gram_matrix(G)
        A_gram = get_Gram_matrix(A)
        sLoss += w*0.25*K.sum(K.square(G_gram - A_gram))/(N_l**2 * M_l**2)
    return sLoss

Lastly, we just need to assign a weighting coefficient to each of the content and style losses, respectively.

def get_total_loss(gImPlaceholder, alpha=1.0, beta=10000.0):
    # cLayerName, sLayerNames, P, ws, and As are defined in the implementation below.
    F = get_feature_reps(gImPlaceholder, layer_names=[cLayerName], model=gModel)[0]
    Gs = get_feature_reps(gImPlaceholder, layer_names=sLayerNames, model=gModel)
    contentLoss = get_content_loss(F, P)
    styleLoss = get_style_loss(ws, Gs, As)
    totalLoss = alpha*contentLoss + beta*styleLoss
    return totalLoss

Implementation

To change our generated image so that it minimizes the loss function, we have to define two more functions that bridge scipy and the Keras backend: first, a function that calculates the total loss, and second, a function that calculates the gradient. Both get fed as input to a scipy optimization function, as the objective and gradient functions respectively. Here, we use the limited-memory BFGS algorithm, as sketched below.
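
A minimal sketch of those two helpers, assuming the gModel defined earlier (the names calculate_loss and get_grad are our own):

def calculate_loss(gImArr):
    # Objective for fmin_l_bfgs_b: reshape the flat float64 vector into an
    # image batch and evaluate the total loss on it.
    if gImArr.shape != (1, targetHeight, targetWidth, 3):
        gImArr = gImArr.reshape((1, targetHeight, targetWidth, 3))
    loss_fcn = K.function([gModel.input], [get_total_loss(gModel.input)])
    return loss_fcn([gImArr])[0].astype('float64')

def get_grad(gImArr):
    # Gradient of the total loss with respect to the generated image,
    # flattened so scipy can consume it.
    if gImArr.shape != (1, targetHeight, targetWidth, 3):
        gImArr = gImArr.reshape((1, targetHeight, targetWidth, 3))
    grad_fcn = K.function([gModel.input],
                          K.gradients(get_total_loss(gModel.input), [gModel.input]))
    return grad_fcn([gImArr])[0].flatten().astype('float64')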

For each of the content and style images, we extract the feature representations to construct P and A (for each selected style layer), and weight the style layers uniformly. In practice, using > 500 iterations of L-BFGS-B typically creates convincing visualizations.
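
Putting it together might look like the sketch below. The specific layer names and the 600 iterations are our choices, not fixed requirements; postprocess_array undoes the VGG16 preprocessing (it re-adds the ImageNet mean pixel and converts BGR back to RGB) so the result can be saved to resultPath:

# Assumed choices: one content layer, several style layers weighted uniformly.
cLayerName = 'block4_conv2'
sLayerNames = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1']

P = get_feature_reps(baseImArr, layer_names=[cLayerName], model=baseModel)[0]
As = get_feature_reps(filterImArr, layer_names=sLayerNames, model=filterModel)
ws = np.ones(len(sLayerNames))/float(len(sLayerNames))

start = time.time()
xopt, f_val, info = fmin_l_bfgs_b(calculate_loss, gIm0.flatten(),
                                  fprime=get_grad, maxiter=600)
print('Optimization took {:.0f} seconds'.format(time.time() - start))

def postprocess_array(x):
    # Undo preprocess_input: add back the ImageNet mean pixel, BGR -> RGB.
    x = x.reshape((targetHeight, targetWidth, 3))
    x[..., 0] += 103.939
    x[..., 1] += 116.779
    x[..., 2] += 123.68
    x = x[..., ::-1]
    return np.clip(x, 0, 255).astype('uint8')

imOut = Image.fromarray(postprocess_array(xopt))
imOut = imOut.resize(baseImageSizeOrig)  # back to the base image's original size
imOut.save(resultPath)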

Let's clarify that there are many parameters that could be changed, depending on the requirements or on our own knowledge: we could select more or different layers, change the weighting coefficients, the float precision, and so on.
