Linear Regression

Topic Overview

Linear regression serves as an introductory example of “data science”. The algorithm is simple to use, and the problems it solves (usually predictions) are easy to understand.

Here we use weather history data (see footnote 1) to demonstrate how

  • a model is built

  • a model is verified

  • a model is used to predict
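The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses ordinary least squares written in plain Python, and synthetic stand-in data instead of the weather dataset. The helper names (fit_line, predict, mean_absolute_error) are made up for this sketch and are not taken from the notebook.

```python
import random


def fit_line(xs, ys):
    """Build the model: least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the model: predicted output for a single input value."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Verify the model: average absolute prediction error on data it has not seen."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Synthetic stand-in data (NOT the weather dataset):
# maximum = 1.1 * minimum + 8, plus some random noise.
rng = random.Random(42)
min_temps = [rng.uniform(-10.0, 25.0) for _ in range(200)]
max_temps = [1.1 * t + 8.0 + rng.gauss(0.0, 1.0) for t in min_temps]

# Build on the first 150 samples, verify on the remaining 50.
split = 150
model = fit_line(min_temps[:split], max_temps[:split])
error = mean_absolute_error(model, min_temps[split:], max_temps[split:])

# Predict the maximum temperature for a day with a minimum of 5 degrees.
forecast = predict(model, 5.0)
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} error={error:.2f} forecast={forecast:.2f}")
```

Holding back part of the data for verification is the important design choice here: an error measured on the samples the model was built from would look better than it deserves.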

To keep the demo simple, the set of input features is one-dimensional (it contains only the minimum day temperature), as is the set of output features (the maximum day temperature). Normally, at least the input set is multi-dimensional and carefully chosen. The bigger part of data science revolves around just this discipline: choosing the set of input features.
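For the one-dimensional case, the model is a straight line. Assuming ordinary least squares as the fitting method (the usual choice, though not spelled out here), with x the minimum and y the maximum day temperature, the line and its coefficients can be written as follows; the symbols a, b, and w are introduced only for this illustration.

```latex
\hat{y} = a x + b,
\qquad
a = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - a \bar{x}
```

With a multi-dimensional input set, the single factor a becomes a vector of weights, one per input feature:

```latex
\hat{y} = w_1 x_1 + w_2 x_2 + \dots + w_k x_k + b
```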

Also, you will agree that it is complete nonsense to choose “minimum day temperature” as the single input feature and to predict the maximum day temperature from it. We do just that.

Artifacts

Jupyter Notebook

A Jupyter Notebook (download: linear_regression.ipynb) is the heart of the topic. It has much of the code, together with explanations.

Running Code

While such notebooks are cute for research papers and for experimenting, one should not use them as an IDE or as a runtime environment; they are neither.

Here’s a running program that does the same, but does not show the fancy graphs that nobody needs.

Dataset

Both the notebook and the program use a test dataset (download: history_data.csv).
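Reading such a CSV file into pairs of minimum and maximum temperature could look roughly like the sketch below. The column names MinTemp and MaxTemp, the Date column, and the helper load_temperature_pairs are assumptions made for illustration; the actual header of history_data.csv may differ. The sample rows are invented and only show the mechanics.

```python
import csv
import io


def load_temperature_pairs(csv_file, min_column="MinTemp", max_column="MaxTemp"):
    """Read (minimum, maximum) pairs from an open CSV file.

    Rows where either value is missing or not a number are skipped.
    The default column names are hypothetical, not taken from the dataset.
    """
    pairs = []
    for row in csv.DictReader(csv_file):
        try:
            pairs.append((float(row[min_column]), float(row[max_column])))
        except (TypeError, ValueError):
            continue  # empty or malformed cell: ignore this row
    return pairs


# Invented sample rows standing in for the real file; the second row
# lacks a minimum temperature and is therefore dropped.
sample = io.StringIO(
    "Date,MinTemp,MaxTemp\n"
    "2020-01-01,-2.5,4.0\n"
    "2020-01-02,,5.5\n"
    "2020-01-03,1.0,7.5\n"
)
pairs = load_temperature_pairs(sample)
print(pairs)
```

For the real file, the same function would be called on a file opened with open("history_data.csv", newline=""), passing whatever column names the file actually uses. A wrong column name raises a KeyError on purpose, rather than silently producing an empty list.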

Topic Dependencies

[Dependency graph: Linear Regression depends on Virtual Environments and on Machine Learning: Concepts and Terminology. Virtual Environments in turn depends on The import Statement and Python Package Index, and through Modules and Packages on the Software Development and Basics topics.]

Footnotes

1

Although weather data is always the first dataset every aspiring data scientist uses, it is not easy to find a free one. I spent the bigger part of the research for this set of topics searching for appropriate datasets.