Machine Learning – K-Means Clustering in Python

Clustering is an unsupervised learning technique that splits a dataset into groups based on similarity. It is mainly used for exploratory data mining, and it appears in many fields, such as machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

A very popular clustering algorithm is k-means clustering. In k-means clustering, we divide the data into a fixed number of clusters while trying to ensure that the items in each cluster are as similar as possible. The goal of the algorithm is to find groups (clusters) in the given data, with the number of groups represented by the variable K. It can be used to group text documents, images, videos, and much more.

Customer segmentation is one of the most common use cases of k-means.

Assuming we have inputs x1, x2, x3, …, xn and a value of K, the algorithm works as follows:

Step 1 – Pick K random points as cluster centers, called centroids
Step 2 – Assign each xi to the nearest cluster by calculating its distance to each centroid
Step 3 – Find the new cluster center by taking the average of the assigned points
Step 4 – Repeat Steps 2 and 3 until none of the cluster assignments change
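
The four steps above can be sketched in plain Python. This is a minimal illustration rather than a production implementation: the function name k_means, the max_iter and seed parameters, and the sample points are all assumptions made for this example. It also assumes the initial centroids are drawn from the input points and that distance means Euclidean distance.

```python
import math
import random


def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` (tuples of floats) into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1: pick K of the input points at random as the initial centroids.
    centroids = rng.sample(points, k)
    assignments = None

    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop once none of the cluster assignments change.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Step 3: move each centroid to the average of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))

    return centroids, assignments


# Hypothetical sample data: two visually separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

Here labels[i] is the index of the cluster that points[i] was assigned to. In practice a library implementation such as scikit-learn's KMeans would normally be used instead.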


Supervised Machine Learning – Linear Regression in Python

Linear regression analysis is a powerful technique for predicting the unknown value of one variable from the known value of another. More precisely, if X and Y are two related variables, linear regression analysis helps us predict the value of Y for a given value of X, or vice versa. The variable whose value is to be predicted is known as the dependent variable, and the one whose known value is used for prediction is known as the independent variable.
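
To make the idea concrete, here is a dependency-free sketch of simple linear regression with one independent variable, using the closed-form least-squares formulas. The function name fit_line and the data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum of co-deviations of x and y / sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: xs is the independent variable, ys the dependent one.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # generated from y = 2x + 1

slope, intercept = fit_line(xs, ys)
predicted = slope * 6.0 + intercept  # predict Y for a new value of X
```

Because the sample data lies exactly on a line, the fit recovers that line. With noisy real-world data, the same formulas give the line that minimizes the sum of squared errors.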

There are several ways to do linear regression in Python, including NumPy, SciPy, statsmodels, and scikit-learn. In this post I am going to use scikit-learn.

Scikit-learn is a powerful Python module for machine learning. It contains functions for regression, classification, clustering, model selection, and dimensionality reduction.

The first step is to import the required Python libraries:

import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt
from matplotlib import style
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np


Setup MongoDB Sharding

Sharding is the process of partitioning data across multiple servers using a shard key. It addresses data growth through horizontal scaling: as the size of the data increases, a single machine may not be sufficient to store it, so with sharding you add more machines to support the growth. If you are new to MongoDB, please read my previous post on getting started with MongoDB.

Sharding setup requires the following:

  • MongoDB configuration server – stores the cluster’s metadata
  • mongos instance – connects to the config server(s)
  • Individual MongoDB (mongod) instances – these act as the shards.
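
The idea of partitioning by a shard key can be sketched in a few lines of Python. This is a conceptual illustration only, not how MongoDB itself places data: MongoDB splits the shard key space into chunks and balances those chunks across shards. The shard names, the user_id shard key, and the shard_for helper are hypothetical.

```python
import hashlib

# Hypothetical shard names; a real cluster would have its own mongod instances.
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(shard_key_value, shards=SHARDS):
    """Pick a shard by hashing the shard key value (illustrative only)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# Hypothetical documents, using "user_id" as the shard key.
documents = [{"user_id": i, "name": f"user{i}"} for i in range(12)]

# Route every document to a shard based only on its shard key.
placement = {name: [] for name in SHARDS}
for doc in documents:
    placement[shard_for(doc["user_id"])].append(doc["user_id"])
```

The same shard key value always maps to the same shard, so a router can find a document later without searching every machine.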


Search Engine with PHP & Elasticsearch


In this tutorial, we’re going to take a look at Elasticsearch and how we can use it in PHP. Elasticsearch is an open-source, distributed, and highly scalable search engine. We can use it to perform fast full-text and other complex searches. It also includes a REST API that lets us easily issue requests to create, delete, update, and retrieve data. You can read about Elasticsearch at:

Installing Elasticsearch

Download Elasticsearch. This tutorial assumes you’re using a Windows environment. For Windows, the download is a Zip file, which you can extract into C:\elasticsearch-2.2.0\.

Getting Started with MongoDB

MongoDB is a cross-platform, document-oriented database that provides high performance, high availability, and easy scalability. MongoDB works on the concepts of collections and documents.

Install MongoDB On Windows
To install MongoDB on Windows, first download the latest release of MongoDB. Make sure you get the correct version of MongoDB for your version of Windows.

MongoDB needs a folder (the data directory) to store its data. By default, it uses “C:\data\db”. Create this folder manually, because MongoDB won’t create it for you. You can also specify an alternate data directory with the --dbpath option to mongod.exe, for example:

Creating an Android Application with PhoneGap

PhoneGap is an open-source mobile framework that enables you to create cross-platform apps that run on various mobile devices, including iOS and Android. You write your web app in HTML, JavaScript, and CSS, and PhoneGap helps you turn it into a native app, most commonly for Android or iOS.

There are two ways to turn your web app into an Android app using PhoneGap. One is the easy, no-pain way using PhoneGap Build, and the other is the manual way using the Cordova CLI with the Android SDK.

Option 1: Android Apps with PhoneGap Build

Using PhoneGap Build is painless. All you need to do is zip up your web app and upload it to the Adobe PhoneGap Build cloud, and the service takes care of everything for you. You don’t even need to download and set up an SDK or emulators.

  • Write a web app
  • Zip it up
  • Upload the zip file to PhoneGap Build cloud. Click Ready to Build. Edit the name of the app and upload an app icon too.
  • Download the apk and install it on your phone

Option 2: Full Development with PhoneGap Cordova CLI

You can install the Cordova command-line interface (CLI) tools and the Android SDK for full development. This lets you debug your app and take advantage of plugins that give you access to hardware APIs, push notifications, and more.


Node.js for Beginners

Node.js, in simple words, is server-side JavaScript. It has been getting a lot of buzz these days. I should make it clear that I’m not an expert on Node.js; I decided to learn it recently because of its increasing popularity. The programming industry moves incredibly fast, and it’s dangerous to fall behind. Learning new languages is important because if you don’t, you’re likely to get left behind and end up out of a job.

Node.js runs JavaScript on the server. That means all the cool things about JavaScript apply here. It also means that if you’re already familiar with JavaScript, you’ll have a nice advantage.

Creating Hello World

Let’s create a hello world. First, download and install Node.js. Once it’s installed and ready, create a new JavaScript file with the following: