Linear Regression

Topic Overview

Linear Regression serves as an introductory example to “data science”. The algorithm is simple to use, and the problems (predictions, usually) that it solves are easy to understand.

Here we use weather history data [1] to demonstrate how

  • a model is built

  • a model is verified

  • a model is used to predict

To keep the demo simple, the set of input features is one-dimensional, containing only the minimum day temperature, and so is the set of output features, which contains only the maximum day temperature. Normally, at least the input set is multi-dimensional and carefully chosen. A large part of data science revolves around exactly this discipline: choosing the set of input features.

Also, you will agree that it is complete nonsense to choose “minimum day temperature” as the single input feature, and to predict the maximum day temperature from it. We do just that.
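The three steps (build, verify, predict) can be sketched in plain Python for the one-dimensional case. This is a minimal illustration only, not the code from the notebook: the temperatures are synthetic and made up for the example, and the helper names `fit_line`, `predict`, and `mean_absolute_error` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for a single input feature.

    Returns (slope, intercept) of the line minimizing the squared error.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to one input value."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Average absolute difference between predictions and true values."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Synthetic stand-in for the weather data (assumption: invented numbers,
# not taken from history_data.csv).
rng = random.Random(0)
min_temps = [rng.uniform(-5.0, 20.0) for _ in range(200)]
max_temps = [1.1 * t + 8.0 + rng.gauss(0.0, 2.0) for t in min_temps]

# Build the model on 80% of the data, keep 20% aside for verification.
split = int(0.8 * len(min_temps))
train_x, test_x = min_temps[:split], min_temps[split:]
train_y, test_y = max_temps[:split], max_temps[split:]

model = fit_line(train_x, train_y)                   # build
error = mean_absolute_error(model, test_x, test_y)   # verify
predicted_max = predict(model, 10.0)                 # predict

print(f"slope={model[0]:.2f} intercept={model[1]:.2f}")
print(f"mean absolute error on held-out data: {error:.2f}")
print(f"predicted maximum for a minimum of 10.0: {predicted_max:.2f}")
```

Holding back part of the data for verification is what makes the error figure meaningful: the model has never seen those rows, so the error estimates how it behaves on new days.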

Artifacts

Jupyter Notebook

A Jupyter Notebook (download: linear_regression.ipynb) is the heart of the topic. It has much of the code, together with explanations.

Running Code

While such notebooks are cute for research papers and for experimenting, one should not use them as an IDE or as a runtime environment; they are neither.

Here’s a running program that does the same, but does not show the fancy graphs that nobody needs.

Dataset

Both the notebook and the program use a test dataset (download: history_data.csv).
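Reading two temperature columns from such a CSV file might look like the following sketch. The column names `min_temp` and `max_temp` are assumptions made for illustration; the actual header of history_data.csv is not shown here and has to be checked first. The function name `load_temperatures` is likewise hypothetical.

```python
import csv
import io


def load_temperatures(csv_file, min_column="min_temp", max_column="max_temp"):
    """Read two numeric columns from an open CSV file.

    Rows with a missing or non-numeric value in either column are skipped.
    The default column names are placeholders, not the real header names.
    """
    min_temps, max_temps = [], []
    for row in csv.DictReader(csv_file):
        try:
            low = float(row[min_column])
            high = float(row[max_column])
        except (TypeError, ValueError):
            # Empty or malformed cell: ignore this row.
            continue
        min_temps.append(low)
        max_temps.append(high)
    return min_temps, max_temps


# Tiny in-memory sample standing in for the real file (invented values).
sample = io.StringIO(
    "date,min_temp,max_temp\n"
    "2020-01-01,-2.5,4.0\n"
    "2020-01-02,,5.5\n"
    "2020-01-03,1.0,7.5\n"
)
lows, highs = load_temperatures(sample)
print(lows, highs)
```

With the real dataset, the `io.StringIO` object would be replaced by `open("history_data.csv", newline="")`, and the two column arguments by whatever the file's header actually calls them.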

Footnotes

[1]

Although such data is always the first dataset every aspiring data scientist uses, it is not easy to find a free one. I spent the bigger part of the research for this set of topics searching for appropriate datasets.

Topic Dependencies

Dependency graph (Python Programming: From Absolute Beginner to Advanced Productivity):

  • Linear Regression depends on Virtual Environments and on Machine Learning: Concepts and Terminology

  • Virtual Environments depends on Python Package Index and on The import Statement (incomplete)

  • Python Package Index depends on The import Statement (incomplete)

  • The import Statement (incomplete) depends on Modules and Packages