Machine Learning: Concepts and Terminology#

This topic is an introductory one. We will only scratch the surface; there is much terminology to learn, and this is what this topic is about. Data science, machine learning, and artificial intelligence are a very broad field, and there is much more to it.

See this video for a really good and thorough introduction. Beware though, it is a whopping 4 hours long.

Data scientists are not always known to be good (let alone diligent) programmers. I have seen people use Jupyter Notebooks as editor and runtime environment for their programs. While I like notebooks to play around, creating nicely looking web pages with plots and charts as a side effect (really cool), there is a point where one has to bite the bullet and start to program. The latter is the focus of these AI topics.

How Far Is Mankind from Creating God #

Artificial Narrow Intelligence (ANI, Weak AI)
- Stage that we are at now: can solve special problems
- Weather forecast
- Image recognition
- Autonomous driving
Artificial General Intelligence (AGI, Strong AI)
- By far not there: can do everything a human can
Artificial Super Intelligence (ASI)
- Terminator and such

Basic Terminology: Algorithm and Model #

Algorithm. For example …
- Linear regression
- Decision tree
- Random forest
- (many many more)
Model. Trained by using an algorithm.
- Uses the algorithm
- Takes the input and maps it to output
- Built through training.

Basic Terminology: Features and Data #

Input features or predictor variables
- Set of variables used as input to the model
Output features or response/target variables
- Set of variables calculated by the model, based on input features
Training Data.
- Used to create the model (the more the better)
- Divided (spliced) into two parts
  - Training data; used to actually create/train the model
  - Testing data; used to test the efficiency/accuracy of it

Types of Machine Learning #

Supervised Learning.
- Each input training datum has its known/desired output attached as a label.
- Used for regression and classification
Unsupervised Learning.
- Works on unlabeled data.
- Creates clusters on its own, identifying features.
- Used for association and clustering
Reinforcement Learning.
- Agent learns from actions by measuring rewards. Rather advanced. No training. Trial and error.

Problems Solved #

Regression
- Output: continuous quantity (usually a forecast of something)
- Solved by supervised learning algorithms like Linear Regression.
- See topic: Linear Regression
Classification
- Output: categorical quantity (“spam or not”)
- Solved by supervised learning algorithms like
  - Support Vector Machines
  - Naive Bayes
  - Logistic Regression
  - K Nearest Neighbor
Clustering
- Output: clusters of input data
- Solved by unsupervised learning algorithms like K-means
- See topic: K-Means

Machine Learning: Concepts and Terminology#

See Also#

How Far Is Mankind from Creating God#

Basic Terminology: Algorithm and Model#

Basic Terminology: Features and Data#

Types of Machine Learning#

Problems Solved#

See Also #

How Far Is Mankind from Creating God #

Basic Terminology: Algorithm and Model #

Basic Terminology: Features and Data #

Types of Machine Learning #

Problems Solved #