How do neural networks work, and how can we improve them for individual cases?
I trained my first customised network, as designed in the previous log, on an Amazon AWS GPU and successfully got convergence, which means the network does work.
After messing about with some networks for a couple of months in an attempt to design computer vision for machines, I started to notice a few recurring topics and buzzwords such as 'prototxt' and 'weights'.
Curiosity has now got the better of me, and I worked out that the various prototxt files associated with a network describe its structure in reasonably simple and human-readable terms, e.g.:
layer {
  name: "pool2/3x3_s2"
  type: "Pooling"
  bottom: "conv2/norm2"
  top: "pool2/3x3_s2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
I presume this is a Python layer, but I'm not sure; it doesn't really matter for now.
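Either way, the files themselves are plain text-format protobuf, so they can be poked at programmatically as well as in a text editor. A minimal sketch of listing every layer a prototxt describes, assuming the Caffe Python bindings are installed (the filename is just a placeholder):

# Sketch: a .prototxt is a text-format protobuf, so Caffe's own message
# definitions can parse it and list the layers it describes.
from caffe.proto import caffe_pb2
from google.protobuf import text_format

net = caffe_pb2.NetParameter()
with open('deploy.prototxt') as f:   # placeholder filename
    text_format.Merge(f.read(), net)

for layer in net.layer:
    print(layer.name, layer.type, list(layer.bottom), '->', list(layer.top))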
The fantastic NVIDIA DIGITS software can print out a fancy graphical representation of the whole network and, starting with the renowned bvlc_googlenet.caffemodel, I thought I'd try to hack it and learn something through experimentation.
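As an aside, a similar diagram can be drawn straight from the prototxt without DIGITS; a rough sketch, assuming Caffe's Python bindings plus pydot and Graphviz are installed (again, the filename is a placeholder):

# Sketch: render a network diagram from a prototxt, much like the DIGITS
# visualisation. Needs Caffe's Python bindings, pydot and Graphviz.
from caffe.proto import caffe_pb2
from google.protobuf import text_format
import caffe.draw

net = caffe_pb2.NetParameter()
with open('bvlc_googlenet_deploy.prototxt') as f:   # placeholder filename
    text_format.Merge(f.read(), net)

caffe.draw.draw_net_to_file(net, 'googlenet.png', 'TB')   # 'TB' = top-to-bottom layout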
One of the first things I looked for was symmetry and repetition, with the desire to simplify what initially looks very complicated. I noticed that the above layer describes a 'link' between other blocks of layers that seem to repeat themselves about 6 times:
...... in the massive bvlc_googlenet network:
..... and in this way I managed to simplify it by removing what looked like about 6 large blocks of repeating layers to this:
...... And looking at this diagram very carefully, there's still one big block that repeats and should also be removable. I tried removing it, but unfortunately this gave the following error:
Creating layer coverage/sig
Creating Layer coverage/sig
coverage/sig <- cvg/classifier
coverage/sig -> coverage
Setting up coverage/sig
Top shape: 2 1 80 80 (12800)
Memory required for data: 851282688
Creating layer bbox/regressor
Creating Layer bbox/regressor
bbox/regressor <- pool5/drop_s1_pool5/drop_s1_0_split_1
bbox/regressor -> bboxes
Setting up bbox/regressor
Top shape: 2 4 80 80 (51200)
Memory required for data: 851487488
Creating layer bbox_mask
Creating Layer bbox_mask
bbox_mask <- bboxes
bbox_mask <- coverage-block
bbox_mask -> bboxes-masked
Check failed: bottom[i]->shape() == bottom[0]->shape()
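As far as I can tell, that final 'Check failed' line is Caffe saying that the bbox_mask layer was handed bottom blobs of different shapes, which is exactly the sort of thing that happens when a block is cut out by hand and a 'bottom' ends up pointing at a blob of the wrong size. Next time I might do the surgery in code rather than in a text editor, so the rewiring is at least explicit; a rough sketch, where the layer names and file paths are just placeholders and not taken from my actual network:

# Sketch: delete a contiguous run of layers from a prototxt and rewire the
# rest of the network so whatever consumed the deleted block's output now
# reads the blob that originally fed the block. Names below are placeholders.
from caffe.proto import caffe_pb2
from google.protobuf import text_format

def remove_block(net, first_removed, last_removed):
    names = [l.name for l in net.layer]
    start, end = names.index(first_removed), names.index(last_removed)
    feed_blob = net.layer[start].bottom[0]   # blob feeding the block
    out_blob = net.layer[end].top[0]         # blob the block produced
    del net.layer[start:end + 1]
    for layer in net.layer:                  # rewire downstream consumers
        for i, bottom in enumerate(layer.bottom):
            if bottom == out_blob:
                layer.bottom[i] = feed_blob
    return net

net = caffe_pb2.NetParameter()
with open('train_val.prototxt') as f:        # placeholder path
    text_format.Merge(f.read(), net)
remove_block(net, 'inception_4b/1x1', 'inception_4e/output')   # placeholder names
with open('train_val_simplified.prototxt', 'w') as f:
    f.write(text_format.MessageToString(net))

Of course, rewiring only keeps the graph connected; the blob shapes still have to agree wherever two bottoms meet, which is precisely what the check above is complaining about.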
...... So that's it for now ..... and here's my 'simplified' network prototxt for object detection:
# DetectNet network
# Data/Input layers
name: "DetectNet"
layer {
  name: "train_data"
  type: "Data"
  top: "data"
  data_param {
    backend: LMDB
    source: "examples/kitti/kitti_train_images.lmdb"
    batch_size: 10
  }
  include: { phase: TRAIN }
}
layer {
  name: "train_label"
  type: "Data"
  top: "label"
  data_param {
    backend: LMDB
    source: "examples/kitti/kitti_train_labels.lmdb"
    batch_size: 10
  }
  include: { phase: TRAIN }
}
layer {
  name: "val_data"
  type: "Data"
  top: "data"
  data_param {
    backend: LMDB
    source: "examples/kitti/kitti_test_images.lmdb"
    batch_size: 6
  }
  include: { phase: TEST stage: "val" }
}
layer {
  name: "val_label"
  type: "Data"
  top: "label"
  data_param {
    backend: LMDB
    source: "examples/kitti/kitti_test_labels.lmdb"
    batch_size: 6
  }
  include: { phase: TEST stage: "val" }
}
layer {
  name: "deploy_data"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 640
      dim: 640
    }
  }
  include: { phase: TEST not_stage: "val" }
}
# Data transformation layers
layer {
  name: "train_transform"
  type: "DetectNetTransformation"
  bottom: "data"
  bottom:...