Face Recognition with HaarCascade
HaarCascade
Haar Cascade is a machine learning object detection algorithm used to identify objects in an image or video, based on the concept of features. It is a machine learning based approach in which a cascade function is trained from a large number of positive and negative images and is then used to detect objects in other images.
The algorithm has four stages:
1. Haar Feature Selection
2. Creating Integral Images
3. Adaboost Training
4. Cascading Classifiers
It is well known for being able to
detect faces and body parts in an image, but can be trained to identify almost
any object.
2. Creating Integral Images – Integral images are created to make feature computation very fast. However, most of the computed features are irrelevant, so Adaboost training is used in the next stage to select the best features.
4. Cascade Classifier – The Cascade Classifier consists of different stages, where each stage is an ensemble of weak learners. The weak learners are simple classifiers called decision stumps. Each stage is trained using a technique called boosting. Boosting provides the ability to train a highly accurate classifier by taking a weighted average of the decisions made by the weak learners.
The stages are designed to reject negative samples as fast as possible.
The assumption is that the vast majority of windows do not contain the object
of interest. Conversely, true positives are rare and worth taking the time to
verify.
· A true positive occurs when a positive sample is correctly classified.
· A false positive occurs when a negative sample is mistakenly classified as positive.
· A false negative occurs when a positive sample is mistakenly classified as negative.
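As an illustration, here is a minimal sketch of face detection with OpenCV's pretrained frontal-face Haar cascade. The input path "sample.jpg" and the output path "detected.jpg" are assumptions made for the example.

import cv2

# Load the pretrained frontal-face cascade shipped with OpenCV
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("sample.jpg")                 # assumed input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # cascades work on grayscale

# detectMultiScale slides windows over the image at several scales;
# each stage of the cascade quickly rejects windows that clearly contain no face
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("detected.jpg", img)               # assumed output path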
VGG16
VGG16
is a convolutional neural network model proposed by K. Simonyan and A.
Zisserman from the University of Oxford in the paper “Very Deep Convolutional
Networks for Large-Scale Image Recognition”. The model achieves 92.7% top-5 test accuracy in
ImageNet, which is a dataset of over 14 million images belonging
to 1000 classes. It was one of the well-known models submitted to ILSVRC-2014. It improves over AlexNet by replacing large kernel-sized filters (11 and 5 in the first and second convolutional layers, respectively) with multiple 3×3 kernel-sized filters one after another. VGG16 was trained for weeks using NVIDIA Titan Black GPUs.
The Architecture Of VGG
The input to the conv1 layer is a fixed-size 224 × 224 RGB image. The image is passed through a stack of convolutional (conv.) layers,
where the filters were used with a very small receptive field: 3×3 (which is
the smallest size to capture the notion of left/right, up/down, center). In one
of the configurations, it also utilizes 1×1 convolution filters, which can be
seen as a linear transformation of the input channels (followed by
non-linearity). The convolution stride is fixed to 1 pixel; the spatial padding
of conv. layer input is such that the spatial resolution is preserved after
convolution, i.e. the padding is 1-pixel for 3×3 conv. layers. Spatial pooling
is carried out by five max-pooling layers, which follow some of the conv.
layers (not all the conv. layers are followed by max-pooling). Max-pooling is
performed over a 2×2 pixel window, with stride 2.
Three Fully-Connected (FC) layers follow a stack of
convolutional layers (which has a different depth in different architectures):
the first two have 4096 channels each, the third performs 1000-way ILSVRC
classification and thus contains 1000 channels (one for each class). The final
layer is the soft-max layer. The configuration of the fully connected layers is
the same in all networks.
All hidden layers are equipped with the rectification (ReLU)
non-linearity. It is also noted that none of the networks (except for one) contain Local Response Normalisation (LRN): such normalisation does not improve performance on the ILSVRC dataset, but leads to increased memory consumption and computation time.
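As a quick check of the architecture described above, the ImageNet-pretrained VGG16 can be loaded from Keras and inspected; model.summary() lists the five conv. blocks with their 3×3 filters and 2×2 max-pooling layers, followed by the 4096-, 4096- and 1000-channel fully connected layers. This is only a sketch of how the pretrained model is typically loaded.

from tensorflow.keras.applications import VGG16

# Load the full ImageNet-pretrained network, including the fully connected top
model = VGG16(weights="imagenet", include_top=True, input_shape=(224, 224, 3))

# Print the stack of conv., max-pooling and fully connected layers
model.summary()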
FACE RECOGNITION WITH VGG
Tools Used – Jupyter Notebook, ImageNet dataset, Haar Cascade frontal-face classifier, TensorFlow, Keras, OpenCV
· COLLECTION OF DATA – We run a loop that clicks 100 pictures through the camera and collects the dataset for us. The Haar Cascade model is used to detect the face (see the first sketch after this list).
· LOADING VGG16 MODEL – The VGG16 model is imported from keras.applications, so we can now use it in our program.
· FREEZING PRETRAINED LAYERS – We freeze the pretrained layers of VGG so that they keep their weights during our training. The layers can be frozen with a simple for loop.
· ADDING DENSE LAYERS – We add three Dense layers to the model: ReLU activation for the hidden layers and softmax activation for the output layer.
· MODEL AND LAYERS – We import the Sequential model and the different layers needed for our final model (see the transfer-learning sketch after this list).
· LOADING THE DATASET – We make two partitions of the data, TRAIN and VALIDATION (TEST), and then load the TRAIN data for model training (see the data-loading sketch after this list).
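The data-collection step could look like the following sketch. The camera index (0), the output folder faces/user1/ and the 224 × 224 crop size are assumptions made for the example.

import os
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

os.makedirs("faces/user1", exist_ok=True)      # assumed output folder
cap = cv2.VideoCapture(0)                      # assumed camera index
count = 0

while count < 100:                             # loop until 100 face crops are saved
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        count += 1
        face = cv2.resize(frame[y:y + h, x:x + w], (224, 224))  # VGG16 input size
        cv2.imwrite(f"faces/user1/{count}.jpg", face)

cap.release()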
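The model-building steps (loading VGG16, freezing its layers, and adding Dense layers to a Sequential model) might be sketched as follows. The sizes of the hidden Dense layers and the two output classes are assumptions.

from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Load VGG16 without its fully connected top, keeping the pretrained conv. layers
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze the pretrained layers so their weights are not updated during training
for layer in base.layers:
    layer.trainable = False

# Stack new Dense layers on top: ReLU for hidden layers, softmax for the output
model = Sequential([
    base,
    Flatten(),
    Dense(512, activation="relu"),
    Dense(256, activation="relu"),
    Dense(2, activation="softmax"),            # 2 classes assumed
])

model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])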
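Finally, the TRAIN and VALIDATION partitions can be fed to the model with Keras. The directory layout dataset/train and dataset/validation, the batch size and the number of epochs are assumptions.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1.0 / 255)    # scale pixel values to [0, 1]

train_gen = datagen.flow_from_directory(
    "dataset/train", target_size=(224, 224),
    batch_size=16, class_mode="categorical")
val_gen = datagen.flow_from_directory(
    "dataset/validation", target_size=(224, 224),
    batch_size=16, class_mode="categorical")

# `model` is the frozen-VGG16 Sequential model built in the previous sketch
model.fit(train_gen, epochs=5, validation_data=val_gen)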