Machine Learning into Action V1 — The Journey of an Android Developer

Dec 20, 2017

The author of this article is Mayur Kanojia, an Adro-Geek (Android Developer) on our team “iView Crafters”, presenting his fine R&D on “MACHINE LEARNING”.

When I first heard the term “Machine Learning”, I imagined myself as a teacher with 50–60 machines coming to my class to learn.

Jokes apart, Machine Learning is defined as follows:

Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.

- Arthur Samuel (1959).

Machine Learning was never on my mind, but at iView you need to be extraordinarily skillful to push your performance curve ahead of the average programmer. I was going to opt for GATE after May 2017, but our CTO, Meena Shah, advised me to step into the world of Machine Learning. Challenges drive me! So I eagerly started my journey, wondering whether it would be as cool as I imagined in the very first line.

The biggest question was how to kick-start. So I did exactly what all techies love to do: I asked “Google Baba”. Google Baba has a lot to say about Machine Learning. Someone said it so truly: THE WORLD IS ROUND, and so is Google. Every explanation of Machine Learning I read eventually came back to one term: “Gradient Descent”. While browsing, I noticed that everybody referred to Prof. Andrew Ng’s examples on ML. Several online portals offer Machine Learning courses and tutorials, such as Coursera, Udemy, Udacity, Pyimagesearch and Pythonprogramming.net. I decided to take the Coursera ML course by Andrew Ng. Alongside it, Pyimagesearch and Pythonprogramming.net improved my knowledge of ML from a coding perspective.

Frankly, I was not interested at first, because ML brought back my VCLA and Maths-3 courses. In college I took these subjects lightly, thinking they would not help me in the future. But then Prof. Andrew Ng showed how to use math (vectorization) in code to optimise it and run algorithms faster. Wow!! That amazed me. Here is the sample code.
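As an illustration of the idea, here is a minimal sketch of vectorization, assuming NumPy. The data, the parameter values and the function names are made up for this example and are not taken from the course. It computes the same predictions h(x) = θ · x twice: once with explicit loops and once with a single matrix-vector product.

```python
import numpy as np

# Hypothetical data: 5 training examples with 3 features each.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [1.0, 0.0, 1.0],
              [2.0, 2.0, 2.0]])
theta = np.array([0.5, -1.0, 2.0])  # hypothetical parameters


def predict_loop(X, theta):
    """Compute theta . x for every row using explicit Python loops."""
    predictions = []
    for row in X:
        total = 0.0
        for x_j, theta_j in zip(row, theta):
            total += x_j * theta_j
        predictions.append(total)
    return np.array(predictions)


def predict_vectorized(X, theta):
    """Compute the same predictions with one matrix-vector product."""
    return X @ theta


loop_result = predict_loop(X, theta)
vectorized_result = predict_vectorized(X, theta)
```

Both functions return the same numbers. The vectorized version hands the whole computation to NumPy's compiled routines, which is where the speed-up on large datasets comes from.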

After learning the basics of Linear Regression and Logistic Regression, I came across Neural Networks, and I got thoroughly confused between Forward Propagation and Backward Propagation. It felt like I had to do FW BW FW FW BW BW BW FW FW FW BW.
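For readers stuck in the same FW/BW loop, here is a minimal sketch of one hidden layer trained with forward and backward propagation, assuming NumPy, sigmoid activations and a squared-error loss. The network size, the toy OR data and the learning rate are assumptions chosen for illustration; this is not the course's exercise code.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(X, W1, b1, W2, b2):
    """Forward propagation (FW): input -> hidden layer -> output."""
    a1 = sigmoid(X @ W1 + b1)   # hidden activations
    a2 = sigmoid(a1 @ W2 + b2)  # output activations (predictions)
    return a1, a2


def loss(a2, y):
    """Squared error averaged over the examples, halved for a clean derivative."""
    return 0.5 * np.mean(np.sum((a2 - y) ** 2, axis=1))


def backward(X, y, a1, a2, W2):
    """Backward propagation (BW): push the output error back to every weight."""
    m = X.shape[0]
    delta2 = (a2 - y) * a2 * (1 - a2) / m     # error at the output layer
    grad_W2 = a1.T @ delta2
    grad_b2 = delta2.sum(axis=0)
    delta1 = (delta2 @ W2.T) * a1 * (1 - a1)  # error at the hidden layer
    grad_W1 = X.T @ delta1
    grad_b1 = delta1.sum(axis=0)
    return grad_W1, grad_b1, grad_W2, grad_b2


# Toy data: the logical OR function (an assumption for this sketch).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [1]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(2, 3))
b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1))
b2 = np.zeros(1)
alpha = 0.5  # learning rate

losses = []
for step in range(200):
    a1, a2 = forward(X, W1, b1, W2, b2)          # FW
    losses.append(loss(a2, y))
    gW1, gb1, gW2, gb2 = backward(X, y, a1, a2, W2)  # BW
    W1 -= alpha * gW1
    b1 -= alpha * gb1
    W2 -= alpha * gW2
    b2 -= alpha * gb2
```

Each training step is one FW followed by one BW: the forward pass produces predictions, the backward pass turns the prediction error into gradients, and the weights move a small step against those gradients.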

I was tired of doing FW and BW, and then my life savior turned out to be YouTube, my entertainment buddy. One day I casually searched for “How Neural Network works” :P and YouTube took me into a deep ocean of amazing Neural Network examples, like a Mario game completed by ML O.o. It also took me back to all the childhood hours I had spent on the Mario game, which was completed within minutes in front of my eyes. In the future, we the “iView Crafters” are keen to turn this R&D into reality by training a Neural Network to drive a car in GTA V.

These are the links:

https://youtu.be/qv6UVOQ0F44

https://youtu.be/QVyu9oVyh9Q

I had assumed that a Neural Network could only be trained to play a game like this, which is outdated nowadays. But this video proved me wrong. This guy trained a neural network to drive a car in GTA V, and his material helped me a lot in understanding neural networks.

https://youtu.be/edWI4ZnWUGg

If you are a beginner like me, you will come across the following terms on your ML journey:

the cost function, gradient descent, linear regression, logistic regression, regularisation, KNN, neural network, training data, testing data, cross-validation, pipeline, loss, epoch and blah blah blah

At iView Labs, we don’t believe in theories; we believe in results.

Knowledge is of no value unless you put it into practice.

- Anton Chekhov

So, to convert my knowledge into a practical solution, I first started learning Python from pythonprogramming.net. Trust me, you will learn the basics of Python by watching just 15–16 videos if you know programming concepts well. After learning Python, I developed my first ML program by simply converting my Coursera exercise into Python code. (You can check this blog to see your Coursera basics translated into Python.)

This exercise is about Linear Regression, where I have to predict Boston house prices from the data given to me. I created one method which gives me the total value of the cost function.

J(θ) = (1/2m) * Σ (h(xi) − yi)^2

and one gradient descent algorithm which reduces my cost function.

θj = θj − α * (∂J(θ)/∂θj)

where α is the learning rate and the sum in J runs over the m training examples.
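The cost function and the update rule above can be sketched in NumPy as follows. This is a minimal illustration, assuming the averaged squared-error cost J(θ) = (1/2m) Σ (h(x) − y)^2 with h(x) = θ · x. The function names and the synthetic data (an exact line y = 4 + 3x, not the house-price data) are assumptions made for this example.

```python
import numpy as np


def compute_cost(X, y, theta):
    """Squared-error cost: J(theta) = (1 / 2m) * sum((X @ theta - y) ** 2)."""
    m = len(y)
    errors = X @ theta - y
    return (errors @ errors) / (2 * m)


def gradient_descent(X, y, theta, alpha, num_iters):
    """Repeat theta_j = theta_j - alpha * dJ/dtheta_j for all j at once."""
    m = len(y)
    cost_history = []
    for _ in range(num_iters):
        gradient = X.T @ (X @ theta - y) / m  # vector of partial derivatives
        theta = theta - alpha * gradient
        cost_history.append(compute_cost(X, y, theta))
    return theta, cost_history


# Synthetic stand-in data: y = 4 + 3x, with a column of ones for the intercept.
x = np.linspace(0, 2, 50)
X = np.column_stack([np.ones_like(x), x])
y = 4 + 3 * x

theta, cost_history = gradient_descent(X, y, np.zeros(2), alpha=0.1, num_iters=5000)
```

Plotting `cost_history` against the iteration number is a quick way to check the learning rate: the curve should fall steadily, and if it rises or oscillates, α is probably too high.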

In Linear Regression, you have to plot your data first to see what it actually looks like. From the curve you can judge whether you are suffering from high bias (underfitting) or high variance (overfitting), and from that you can work out whether your model needs more training data, a higher or lower learning rate, and so on. You can find my code for practice here.

This is how I created a basic Linear Regression Machine Learning program; I had also written the same code in Octave. The question that may strike you is: why did I choose Python? One of my teammates at iView told me that Python has some awesome libraries for machine learning, with which you don’t have to worry about these basics. You just have to remember the name of the algorithm you want to use in your program. The best of these ML libraries are TensorFlow, sklearn and Keras. So I coded a program in Python to predict stock prices using the sklearn library. The example is here.
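To give a feel for how little code the library route needs, here is a minimal sketch using sklearn's LinearRegression. It is not the stock-price program mentioned above: the price series is synthetic (a noisy upward trend generated in the code), and using the previous day's price as the only feature is an assumption made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for a price series: a noisy upward trend, not real market data.
days = np.arange(200)
prices = 100 + 0.5 * days + rng.normal(scale=2.0, size=days.size)

# Feature: the previous day's price. Target: the current day's price.
X = prices[:-1].reshape(-1, 1)
y = prices[1:]

# shuffle=False keeps the chronological order, so the model is tested on later days only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

model = LinearRegression()
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)  # R^2 on the held-out days
next_price = model.predict(prices[-1:].reshape(-1, 1))[0]  # forecast for the next day
```

The cost function and gradient descent are still there, but hidden inside `fit`. That is convenient, and it is also why working through the basics by hand first makes results like `r2` easier to interpret.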

Meanwhile, our CTO was super enthusiastic about converting my learning into real-world execution. So we decided to implement the Pipeline concept from my ML theory in one of our live projects. We buckled up and started R&D on “Bill Scanner”, an OCR that recognizes characters from bill images.

The catch is that in ML training, handwritten digit recognition with the MNIST dataset is said to be the “Hello World” program of Machine Learning. So we started the task super excited, but as the days went by I realized that my assumption was wrong. Before making predictions, we have to do lots of other tasks, such as image processing.

Pipeline:

Object Detection
Text Extraction
Character Segmentation
Character Prediction

1. Object Detection

To detect the object, we used the TensorFlow object_detection API. To use this API, we had to configure a model for our number of classes. For object detection we collected a set of images to train the model. After days of training we got 99% results.

2. Text Extraction

To extract text from the bill images, we used the computer vision API of OpenCV: we processed the image with lots of filters and extracted the text.

3. Character Segmentation

After extracting text data we got output like this for each text area.
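One common way to split a text area into characters is a vertical projection: count the ink in every pixel column and cut wherever a column is empty. The sketch below assumes NumPy and an already binarized image (1 = ink, 0 = background). It illustrates the general technique on a made-up toy image and is not necessarily the method used in Bill Scanner; the function name is hypothetical.

```python
import numpy as np


def segment_characters(binary_line):
    """Split a binarized text line (1 = ink, 0 = background) into character crops.

    Columns that contain no ink are treated as gaps between characters.
    """
    has_ink = binary_line.sum(axis=0) > 0
    segments = []
    start = None
    for col, ink in enumerate(has_ink):
        if ink and start is None:
            start = col                     # a character begins
        elif not ink and start is not None:
            segments.append((start, col))   # a character ends at the gap
            start = None
    if start is not None:                   # character touching the right edge
        segments.append((start, len(has_ink)))
    return [binary_line[:, s:e] for s, e in segments]


# Toy 5x12 "text line" with three separate blobs standing in for characters.
line = np.zeros((5, 12), dtype=np.uint8)
line[1:4, 1:3] = 1
line[0:5, 5:6] = 1
line[2:4, 8:11] = 1

characters = segment_characters(line)
widths = [c.shape[1] for c in characters]
```

This simple approach breaks down when characters touch or overlap, which is one reason real bills need the heavier image processing mentioned earlier.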

4. Character Prediction

After all the steps above, you get separate characters. Now you have to apply a neural network to predict the character in each image.

This is day 63 of our 120-day Machine Learning project; the details of how we implemented the Pipeline in our “Bill Scanner” project will be covered in Volume II.
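As a small-scale illustration of this last step, here is a minimal sketch that trains a neural network to recognize single characters, assuming sklearn's MLPClassifier and its bundled 8x8 handwritten digits dataset. The dataset covers digits only and stands in for the segmented bill characters; the layer size and other settings are assumptions, not the Bill Scanner model.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# 8x8 grayscale digit images, flattened to 64 features with pixel values 0..16.
digits = load_digits()
X = digits.data / 16.0  # scale pixel values to the range 0..1
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One hidden layer of 64 units; the size is an arbitrary choice for this sketch.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)    # fraction of test characters predicted correctly
predicted = clf.predict(X_test[:1])[0]  # predicted label for one segmented character
```

In a full pipeline, each crop from the segmentation step would be resized to the network's input size and passed to `predict` in the same way.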