In this tutorial we will implement a simple linear regression with TensorFlow. So what is linear regression? In a nutshell, it’s an efficient way to model the relationship between two variables by fitting a straight line through the data. The two variables are called the independent variable and the dependent variable: the first is used to predict the second. We all use regression on an intuitive level every day. For this tutorial we will use real estate data (house area and price). Although real estate prices depend on all sorts of variables, for this example we assume that house prices increase with house area.

For this, let’s create a sample data set with NumPy and use Matplotlib to visualize it. Let’s start by importing the packages we need.

Data Set
Now we will set a random seed. The “seed” is the starting point for the random sequence: if you start from the same seed, you get the same sequence of numbers every time you run the code. That makes the system easier to debug.
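As a small illustration of that guarantee, the sketch below seeds NumPy twice with the same value and compares the two sequences. The seed value 42 and the variable names are arbitrary choices for this example, not values taken from the tutorial.

```python
import numpy as np

# Seeding with the same value replays the same random sequence.
np.random.seed(42)  # 42 is an arbitrary seed chosen for illustration
first_run = np.random.randint(low=0, high=100, size=5)

np.random.seed(42)  # reset to the same starting point
second_run = np.random.randint(low=0, high=100, size=5)

# Both runs contain exactly the same numbers, in the same order.
same_sequence = np.array_equal(first_run, second_run)
print(same_sequence)
```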

Here we randomly generate 100 house sizes between 1000 and 3500, and compute each price as the size multiplied by 100 plus a small random offset.
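One way this data set could be generated is sketched below. The seed, the variable names, and the range of the random offset (20000 to 70000) are assumptions made for illustration; the text only says the offset is small.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, assumed for reproducibility

num_house = 100

# House sizes drawn uniformly from 1000 up to (but not including) 3500.
house_size = np.random.randint(low=1000, high=3500, size=num_house)

# Price = size * 100 plus a random offset.
# The 20000..70000 offset range is an assumption, not a value from the text.
price_offset = np.random.randint(low=20000, high=70000, size=num_house)
house_price = house_size * 100.0 + price_offset
```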
Now let’s plot the data and see.
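A scatter plot along these lines could look like the following sketch. It regenerates the data so that it runs on its own; the seed, the offset range, the marker style, and the axis labels are all assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(42)  # arbitrary seed
house_size = np.random.randint(low=1000, high=3500, size=100)
# The offset range is an assumption made for illustration.
house_price = house_size * 100.0 + np.random.randint(low=20000, high=70000, size=100)

fig, ax = plt.subplots()
ax.plot(house_size, house_price, "bx")  # one blue cross per house
ax.set_xlabel("Size")
ax.set_ylabel("Price")
```

In an interactive session, calling `plt.show()` afterwards displays the figure; `fig.savefig(...)` writes it to a file instead.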

From the graph we can see that the data points follow a linear trend with some slight variation.
Data preprocessing
First we need to split the data set in two: one part for training and one for testing. Let’s use 70% of the data for training and 30% for testing.
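A minimal sketch of such a split is shown below. Because the samples were generated in random order, it simply takes the first 70% for training and the rest for testing. The variable names and the data-generation details are assumptions.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed
num_house = 100
house_size = np.random.randint(low=1000, high=3500, size=num_house)
# The offset range is an assumption made for illustration.
house_price = house_size * 100.0 + np.random.randint(low=20000, high=70000, size=num_house)

# 70% of the samples go to training, the remaining 30% to testing.
num_train_samples = num_house * 70 // 100

train_house_size = house_size[:num_train_samples]
train_price = house_price[:num_train_samples]

test_house_size = house_size[num_train_samples:]
test_price = house_price[num_train_samples:]
```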

Normalizing the data improves the performance of the neural network, especially in multivariable linear regression. We can normalize the data by subtracting the mean from each value and dividing the result by the standard deviation of the values.
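The normalization described above can be sketched as a small helper. The function name `normalize` and the sample data are assumptions for this example.

```python
import numpy as np

def normalize(array):
    """Subtract the mean and divide by the standard deviation."""
    return (array - array.mean()) / array.std()

np.random.seed(42)  # arbitrary seed
train_house_size = np.random.randint(low=1000, high=3500, size=70)

# After normalization the values have zero mean and unit standard deviation.
train_house_size_norm = normalize(train_house_size)
```

Note that the test data should be scaled in a comparable way before it is fed to the trained model.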



Now let’s plot the training data set and see how it is normalized.

Now we can see all the data points have been shifted to the range of -1.5 to +1.5.
Define the placeholders and variables
Placeholders are used to feed the training data into the computation graph. The data we need to feed are the house size and the house price.

We should also define two variables, the weight and the bias needed for the affine transformation. Their values will change over the course of training.
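A sketch of these definitions follows, assuming the graph-mode API exposed as `tensorflow.compat.v1` (placeholders do not exist in TensorFlow 2.x eager mode). All tensor names are hypothetical.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # placeholders need graph mode under TensorFlow 2.x

# Placeholders: their values are supplied through feed_dict when the graph runs.
tf_house_size = tf.placeholder(tf.float32, name="house_size")
tf_price = tf.placeholder(tf.float32, name="price")

# Variables: the weight and bias, randomly initialized and updated during training.
tf_size_factor = tf.Variable(np.random.randn(), name="size_factor")
tf_price_offset = tf.Variable(np.random.randn(), name="price_offset")

# Affine transformation: predicted price = weight * size + bias.
tf_price_pred = tf_size_factor * tf_house_size + tf_price_offset
```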
Define the loss function
The loss function, or cost, describes how far the result your network produced is from the expected result; it indicates the magnitude of the error your model made in its prediction.
You can then take that error and ‘back propagate’ it through your model, adjusting its weights so that it gets closer to the truth the next time around.
Optimization, finally, is how you search the space of represented models to obtain better evaluations. It is the strategy for getting to where you want to go.
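To make these ideas concrete, here is a plain NumPy sketch of a loss and one optimization step. It assumes a squared-error cost and plain gradient descent, neither of which is named in the text; the function names and the toy data are hypothetical.

```python
import numpy as np

def mse_cost(weight, bias, x, y):
    """Half the mean squared error between weight * x + bias and the targets y."""
    errors = weight * x + bias - y
    return np.sum(errors ** 2) / (2 * len(x))

def gradient_step(weight, bias, x, y, learning_rate):
    """One gradient descent update of the weight and the bias."""
    errors = weight * x + bias - y
    grad_weight = np.mean(errors * x)  # derivative of the cost w.r.t. weight
    grad_bias = np.mean(errors)        # derivative of the cost w.r.t. bias
    return weight - learning_rate * grad_weight, bias - learning_rate * grad_bias

# Toy data generated by y = 2 * x.
x = np.array([-1.0, 0.0, 1.0])
y = np.array([-2.0, 0.0, 2.0])

weight, bias = 0.0, 0.0
cost_before = mse_cost(weight, bias, x, y)
weight, bias = gradient_step(weight, bias, x, y, learning_rate=0.1)
cost_after = mse_cost(weight, bias, x, y)  # lower than cost_before
```

Each step moves the weight and bias in the direction that reduces the cost, which is what a gradient descent optimizer repeats on every training iteration.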

Train
Now let’s initialize the variables, launch the computation graph in a session, and print the training status as it runs.
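A self-contained sketch of such a training run is shown below. It assumes the `tensorflow.compat.v1` graph-mode API, a squared-error cost, a gradient descent optimizer, and synthetic, already-normalized data. The learning rate, iteration count, and all names are illustrative choices, not the tutorial’s original values.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # sessions and placeholders need graph mode

# Synthetic normalized training data (assumed for illustration).
np.random.seed(42)
train_size_norm = np.random.randn(70).astype(np.float32)
train_price_norm = (0.9 * train_size_norm + 0.1 * np.random.randn(70)).astype(np.float32)

tf_house_size = tf.placeholder(tf.float32, name="house_size")
tf_price = tf.placeholder(tf.float32, name="price")
tf_size_factor = tf.Variable(0.0, name="size_factor")
tf_price_offset = tf.Variable(0.0, name="price_offset")

# Prediction, cost, and the optimizer step that minimizes the cost.
tf_price_pred = tf_size_factor * tf_house_size + tf_price_offset
tf_cost = tf.reduce_mean(tf.square(tf_price_pred - tf_price)) / 2.0
train_step = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(tf_cost)

num_training_iter = 50
display_every = 10
feed = {tf_house_size: train_size_norm, tf_price: train_price_norm}

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize the variables
    initial_cost = sess.run(tf_cost, feed_dict=feed)
    for iteration in range(num_training_iter):
        sess.run(train_step, feed_dict=feed)
        if (iteration + 1) % display_every == 0:
            cost = sess.run(tf_cost, feed_dict=feed)
            print("iteration", iteration + 1, "cost", cost)
    final_cost = sess.run(tf_cost, feed_dict=feed)
    learned_factor, learned_offset = sess.run([tf_size_factor, tf_price_offset])
```

The printed cost should shrink as training proceeds, and `learned_factor` should end up close to the slope used to generate the synthetic data.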

Plot the data
Now that training is over, let’s plot the data and see what our regression line looks like.

Now we have seen how to implement a linear regression with TensorFlow. But there is a whole bunch of low-level tasks we have to do to make it work, for example defining the cost function, the training iterations, and so on. Wouldn’t it be easier if we could reduce all of these lower-level tasks? The answer is yes: we can simplify it further with TensorFlow estimators, which provide a higher-level abstraction on top of TensorFlow and Keras. In the following tutorials we’ll explore how to implement something similar with Keras and TensorFlow’s estimators.