Building Convolutional Neural Networks with Tensorflow
In the past year I have also worked with Deep Learning techniques, and I would like to share with you how to build and train a Convolutional Neural Network from scratch, using Tensorflow. Later on we can use this knowledge as a building block to make interesting Deep Learning applications.
The contents of this blog post are as follows:
- Tensorflow basics:
- 1.1 Constants and Variables
- 1.2 Tensorflow Graphs and Sessions
- 1.3 Placeholders and feed_dicts
- Neural Networks in Tensorflow
- 2.1 Introduction
- 2.2 Loading in the data
- 2.3 Creating a (simple) 1-layer Neural Network:
- 2.4 The many faces of Tensorflow
- 2.5 Creating the LeNet5 CNN
- 2.6 How the parameters affect the output size of a layer
- 2.7 Adjusting the LeNet5 architecture
- 2.8 Impact of Learning Rate and Optimizer
- Deep Neural Networks in Tensorflow
- 3.1 AlexNet
- 3.2 VGG Net-16
- 3.3 AlexNet Performance
- Final words
1. Tensorflow basics:
Here I will give a short introduction to Tensorflow for people who have never worked with it before. If you want to start building Neural Networks immediately, or you are already familiar with Tensorflow, you can go ahead and skip to section 2. If you would like to know more about Tensorflow, you can also have a look at this repository, or the notes of lecture 1 and lecture 2 of Stanford’s CS20SI course.
1.1 Constants and Variables
The most basic units within Tensorflow are Constants, Variables and Placeholders.
The difference between a tf.constant() and a tf.Variable() should be clear: a constant has a fixed value, and once you set it, it cannot be changed. The value of a Variable can be changed after it has been set, but the type and shape of the Variable cannot be changed.
    import numpy as np
    import tensorflow as tf

    # We can create constants and variables of different types.
    # However, the different types do not mix well together.
    a = tf.constant(2, tf.int16)
    b = tf.constant(4, tf.float32)
    c = tf.constant(8, tf.float32)

    # Pass dtype by keyword: the second positional argument of
    # tf.Variable() is `trainable`, not the dtype.
    d = tf.Variable(2, dtype=tf.int16)
    e = tf.Variable(4, dtype=tf.float32)
    f = tf.Variable(8, dtype=tf.float32)

    # We can perform computations on variables of the same type: e + f
    # but the following cannot be done: d + e

    # Everything in Tensorflow is a tensor; these can have different dimensions:
    # 0D, 1D, 2D, 3D, 4D, or nD tensors.
    g = tf.constant(np.zeros(shape=(2,2), dtype=np.float32))  # does work
    h = tf.zeros([11], tf.int16)
    i = tf.ones([2,2], tf.float32)
    j = tf.zeros([1000,4,3], tf.float64)
    k = tf.Variable(tf.zeros([2,2], tf.float32))
    l = tf.Variable(tf.zeros([5,6,5], tf.float32))
Besides tf.zeros() and tf.ones(), which create a Tensor initialized to zero or one, there is also the tf.random_normal() function, which creates a Tensor filled with values picked randomly from a normal distribution (the default distribution has a mean of 0.0 and a stddev of 1.0).
There is also the tf.truncated_normal() function, which creates a Tensor with values randomly picked from a normal distribution, except that values more than two standard deviations from the mean are discarded and re-picked, so two times the standard deviation forms the lower and upper limit.
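To make the truncation rule concrete, here is a minimal sketch in plain Python (standard library only). It is not how Tensorflow implements tf.truncated_normal(); the helper name `truncated_normal` and its `seed` parameter are assumptions made for this illustration.

```python
import random

def truncated_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n samples from a normal distribution, re-drawing any sample
    that lies more than two standard deviations from the mean."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        # Keep the value only if it falls inside [mean - 2*stddev, mean + 2*stddev].
        if abs(value - mean) <= 2.0 * stddev:
            samples.append(value)
    return samples

samples = truncated_normal(1000, seed=0)
print(min(samples), max(samples))  # both values lie within [-2.0, 2.0]
```

Cutting off the tails like this keeps the initial weights from starting out with extreme values.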
With this knowledge, we can already create weight matrices and bias vectors which can be used in a neural network.
    weights = tf.Variable(tf.truncated_normal([256 * 256, 10]))
    biases = tf.Variable(tf.zeros([10]))
    print(weights.get_shape().as_list())
    print(biases.get_shape().as_list())
    # >>> [65536, 10]
    # >>> [10]
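The shapes above determine how these tensors can be combined in a layer. As a rough sketch in plain Python (no Tensorflow), the shape arithmetic of a matrix multiplication looks as follows. The helper `matmul_shape` and the batch size of 32 are hypothetical, and reading [256 * 256, 10] as flattened 256 x 256 images mapped to 10 classes is an assumption.

```python
def matmul_shape(left, right):
    """Return the shape of a 2-D matrix product, or raise if the inner dimensions differ."""
    rows, inner_left = left
    inner_right, cols = right
    if inner_left != inner_right:
        raise ValueError("inner dimensions differ: {} vs {}".format(inner_left, inner_right))
    return [rows, cols]

batch_size = 32                         # hypothetical batch size
inputs_shape = [batch_size, 256 * 256]  # each input flattened to 65536 values
weights_shape = [256 * 256, 10]
biases_shape = [10]

logits_shape = matmul_shape(inputs_shape, weights_shape)
print(logits_shape)  # [32, 10]
```

The last dimension of the product equals the length of the bias vector, so the biases can be added to every row of the result.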
1.2 Tensorflow Graphs and Sessions
In Tensorflow, all of the different Variables and the operations done on these Variables are saved in a Graph. After you have built a Graph which contains all of the computational steps necessary for your model, you can run this Graph within a Session. This Session then distributes all of the computations across the available CPU and GPU resources.
    graph = tf.Graph()
    with graph.as_default():
        # dtype is passed by keyword; positionally it would be read as `trainable`.
        a = tf.Variable(8, dtype=tf.float32)
        b = tf.Variable(tf.zeros([2,2], tf.float32))

    with tf.Session(graph=graph) as session:
        tf.global_variables_initializer().run()
        # Printing the Variable shows its description; session.run() returns its value.
        print(a)
        print(session.run(a))
        print(session.run(b))
    # >>> <tf.Variable 'Variable:0' shape=() dtype=float32_ref>
    # >>> 8.0
    # >>> [[ 0.  0.]
    # >>>  [ 0.  0.]]
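The split between building a Graph and running it can be illustrated with a toy example in plain Python. This is only a sketch of the idea of deferred execution, not Tensorflow's actual implementation; the `Node` class and the `constant`, `add`, `multiply` and `run` helpers are invented for this illustration.

```python
class Node:
    """A deferred operation: it stores what to compute, not the result."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    return Node("const", value=value)

def add(x, y):
    return Node("add", inputs=(x, y))

def multiply(x, y):
    return Node("mul", inputs=(x, y))

def run(node):
    """Evaluate a node by recursively evaluating its inputs first."""
    if node.op == "const":
        return node.value
    args = [run(inp) for inp in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError("unknown op: " + node.op)

# Building the graph performs no arithmetic yet.
total = add(constant(8.0), multiply(constant(2.0), constant(4.0)))
print(total.op)    # add
print(run(total))  # 16.0
```

Printing the node only describes the operation, much like printing a Variable above; the value appears only once the node is run.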
1.3 Placeholders and feed_dicts
We have seen the various forms in which we can create constants and variables. Tensorflow also has placeholders; these do not require an initial value and only serve to allocate the necessary amount of memory. During a session, these placeholders can be filled in with (external) data through a feed_dict.
Below is an example of the usage of a placeholder.
    list_of_points1_ = [[1,2], [3,4], [5,6], [7,8]]
    list_of_points2_ = [[15,16], [13,14], [11,12], [9,10]]
    list_of_points1 = np.array([np.array(elem).reshape(1,2) for elem in list_of_points1_])
    list_of_points2 = np.array([np.array(elem).reshape(1,2) for elem in list_of_points2_])

    graph = tf.Graph()
    with graph.as_default():
        # We should use a tf.placeholder() to create a variable whose value we will fill in later (during session.run()).
        # This can be done by 'feeding' the data into the placeholder.
        # Below is an example of a function which uses two placeholder arrays of shape [1,2] to calculate the Euclidean distance.
        point1 = tf.placeholder(tf.float32, shape=(1, 2))
        point2 = tf.placeholder(tf.float32, shape=(1, 2))

        def calculate_euclidean_distance(point1, point2):
            difference = tf.subtract(point1, point2)
            power2 = tf.pow(difference, tf.constant(2.0, shape=(1,2)))
            add = tf.reduce_sum(power2)
            euclidean_distance = tf.sqrt(add)
            return euclidean_distance

        dist = calculate_euclidean_distance(point1, point2)

    with tf.Session(graph=graph) as session:
        tf.global_variables_initializer().run()
        for ii in range(len(list_of_points1)):
            point1_ = list_of_points1[ii]
            point2_ = list_of_points2[ii]
            # Map each placeholder to the data it should receive for this run.
            feed_dict = {point1 : point1_, point2 : point2_}
            distance = session.run([dist], feed_dict=feed_dict)
            print("the distance between {} and {} -> {}".format(point1_, point2_, distance))
    # >>> the distance between [[1 2]] and [[15 16]] -> [19.79899]
    # >>> the distance between [[3 4]] and [[13 14]] -> [14.142136]
    # >>> the distance between [[5 6]] and [[11 12]] -> [8.485281]
    # >>> the distance between [[7 8]] and [[ 9 10]] -> [2.8284271]
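As a quick cross-check of the distances printed above, the same computation can be written in plain Python without Tensorflow. This is a sketch for verification only; the function name `euclidean_distance` is chosen for this illustration.

```python
import math

points1 = [[1, 2], [3, 4], [5, 6], [7, 8]]
points2 = [[15, 16], [13, 14], [11, 12], [9, 10]]

def euclidean_distance(p, q):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

for p, q in zip(points1, points2):
    print("{} and {} -> {:.4f}".format(p, q, euclidean_distance(p, q)))
```

The four values agree with the Tensorflow output up to rounding, which shows that the placeholder version computes the usual Euclidean distance for each pair that is fed in.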
2. Neural Networks in Tensorflow
2.1 Introduction
The graph containing the Neural Network (illustrated in the image above) should contain the following steps:
- The input datasets: the training dataset and labels, the test dataset and labels (and the validation dataset and labels). The test and validation datasets can be placed inside a tf.constant(), and the training dataset is placed in a tf.placeholder() so that it can be fed in batches during training (stochastic gradient descent).
- The Neural Network model with all of its layers. This can be a simple fully connected neural network consisting of only 1 layer, or a more complicated neural network consisting of 5, 9, 16 etc. layers.
- The weight matrices and bias vectors defined in the proper shape and initialized to their initial values. (One weight matrix and bias vector per layer.)
- The loss value: the model outputs a logit vector (the estimated training labels), and by comparing the logits with the actual labels we can calculate the loss value (using softmax with cross-entropy). The loss value is an indication of how close the estimated training labels are to the actual training labels and will be used to update the weight values.
- An optimizer, which will use the calculated loss value to update the weights and biases with backpropagation.
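The steps listed above can be sketched for a single fully connected layer in plain Python (standard library only, no Tensorflow). This is a simplified illustration under stated assumptions, not the network built later in this post: the toy sizes (3 input features, 2 classes, one training example), the learning rate of 0.1 and all function names are hypothetical.

```python
import math

def softmax(logits):
    """Turn a logit vector into probabilities (shifted by the max for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, weights, biases):
    """Compute logits[j] = sum_i x[i] * weights[i][j] + biases[j]."""
    return [sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
            for j in range(len(biases))]

def cross_entropy(probs, label):
    """Loss for a single example whose true class index is `label`."""
    return -math.log(probs[label])

def sgd_step(x, label, weights, biases, learning_rate):
    """Apply one gradient-descent update in place; return the loss before the update."""
    probs = softmax(forward(x, weights, biases))
    loss = cross_entropy(probs, label)
    # Gradient of softmax cross-entropy w.r.t. the logits: probs - one_hot(label).
    grad_logits = [p - (1.0 if j == label else 0.0) for j, p in enumerate(probs)]
    for i in range(len(x)):
        for j in range(len(biases)):
            weights[i][j] -= learning_rate * x[i] * grad_logits[j]
    for j in range(len(biases)):
        biases[j] -= learning_rate * grad_logits[j]
    return loss

# Hypothetical toy problem: 3 input features, 2 classes, one training example.
weights = [[0.0, 0.0] for _ in range(3)]
biases = [0.0, 0.0]
x, label = [1.0, 2.0, 3.0], 1

losses = [sgd_step(x, label, weights, biases, learning_rate=0.1) for _ in range(5)]
print(losses[0] > losses[-1])  # True: the loss shrinks as the weights are updated
```

In Tensorflow the gradient computation and the update are handled by the optimizer, so only the model and the loss need to be written out by hand.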