Topic : PLAYING DETECTIVE WITH CNNS
Abstract: With the growing popularity of Convolutional Neural Networks (CNNs) in computer vision tasks and their ability to model intricate patterns in data, we extend the study of handwriting verification using CNNs. Handwriting has been considered unique to each individual (Individuality of Handwriting, S. N. Srihari et al., 2002). We therefore aim to learn features that discriminate between writers while remaining invariant to within-writer variation, with no feature engineering as used by previous models.

We use the CEDAR word-level “and” dataset to train and experiment with several CNN architectures that differ in filter size, feature maps, activation functions, regularization techniques, cost functions, width and depth of the model, pooling operations, strides and optimization techniques. While experimenting with an architecture, we found that the model followed certain trends; by modeling several of these trends we gain an intuitive understanding of the effect of each of these parameters on the model’s performance. We use the understanding gained in each experiment to develop our final architecture. We further split the final architecture into two CNN branches with the same parameter tuning and with shared weights and variables, and saw a spike in performance.

The primary objective of our experiments was to test how handwriting verification performs without any feature engineering, using Convolutional Neural Networks for feature extraction. However, we also gained several insights into trends in the results of different architectures with respect to the dataset, and arrived at a final combination of parameters that was found to perform best. We hope these insights can be used when tuning more complex models.
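To make the idea of two branches with shared weights concrete, here is a minimal pure-Python sketch. It is not the architecture from the talk: the kernels are random and untrained, the single convolution, ReLU and global max-pooling layer is a stand-in for a real CNN, and comparing the two embeddings by Euclidean distance is an assumption, since the abstract does not say how the branches are combined. All function names are hypothetical.

```python
import math
import random

def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a list-of-lists image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def embed(image, kernels):
    """One branch: convolve, apply ReLU, then global max-pool per kernel."""
    features = []
    for kernel in kernels:
        fmap = conv2d_valid(image, kernel)
        features.append(max(max(0.0, v) for row in fmap for v in row))
    return features

def pair_distance(img_a, img_b, kernels):
    """Both images pass through the SAME kernels (weight sharing).

    The Euclidean distance between embeddings is an assumed comparison.
    """
    emb_a, emb_b = embed(img_a, kernels), embed(img_b, kernels)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))

# Hypothetical toy data: four random 3x3 kernels and two random 8x8 "images".
rng = random.Random(0)
shared_kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
                  for _ in range(4)]
img1 = [[rng.random() for _ in range(8)] for _ in range(8)]
img2 = [[rng.random() for _ in range(8)] for _ in range(8)]
```

Because both inputs go through one set of parameters, an image compared with itself always has distance zero and the comparison is symmetric; in a trained model, a threshold on this distance would separate same-writer pairs from different-writer pairs.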
We test the model’s performance by identifying two types of error, based on which writers the model has seen: 1) Samples from a known writer – this tests for variation within the samples of writers seen before, as well as between writers. 2) Samples from an unknown writer – this tests the generalizing capability of the model on samples from never-before-examined writers. By measuring these two kinds of error, we arrive at an architecture that performs optimally for both.
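The two kinds of test set described above could be built along the following lines. This is only a sketch under stated assumptions: the function names, the split fractions and the `(writer_id, sample)` tuple layout are hypothetical, and the talk does not state how its splits were actually produced.

```python
import random

def split_by_writer(samples, unseen_fraction=0.2, seen_test_fraction=0.2, seed=0):
    """Split (writer_id, sample) tuples into train, known-writer test
    and unknown-writer test sets.

    Assumes every writer has at least two samples, so that a known writer
    appears in both the training set and the known-writer test set.
    """
    rng = random.Random(seed)
    writers = sorted({w for w, _ in samples})
    rng.shuffle(writers)
    # Hold out whole writers: none of their samples are used for training.
    n_unseen = max(1, int(len(writers) * unseen_fraction))
    unseen = set(writers[:n_unseen])

    by_writer = {}
    for w, s in samples:
        by_writer.setdefault(w, []).append(s)

    train, test_known, test_unknown = [], [], []
    for w in sorted(by_writer):
        items = by_writer[w]
        if w in unseen:
            test_unknown.extend((w, s) for s in items)
            continue
        # For seen writers, hold out a fraction of their samples.
        rng.shuffle(items)
        n_test = max(1, int(len(items) * seen_test_fraction))
        test_known.extend((w, s) for s in items[:n_test])
        train.extend((w, s) for s in items[n_test:])
    return train, test_known, test_unknown

def error_rate(predictions, labels):
    """Fraction of predictions that disagree with the labels."""
    return sum(p != y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical toy data: 10 writers with 5 samples each.
samples = [(f"writer{w}", f"img{w}_{k}") for w in range(10) for k in range(5)]
train, test_known, test_unknown = split_by_writer(samples)
```

Reporting `error_rate` separately on the known-writer and unknown-writer sets keeps the two error types apart, so an architecture that merely memorizes the training writers is not mistaken for one that generalizes.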
Bio: Graduate Research Assistant at Sentient Science