Supervised Learning Paradigms: Regression and Classification

Supervised learning is one of the most widely used approaches in machine learning: a model is trained on labeled data and then used to predict outputs for new inputs. This lesson covers its two main problem types, regression and classification.

What is Supervised Learning?

In supervised learning, the algorithm learns from a dataset that includes both input features and their corresponding output labels. The goal is to approximate the function that maps inputs to outputs well enough to predict the output for new, unseen data.
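To make the idea of approximating a mapping concrete, here is a minimal sketch using only NumPy. The "true" mapping y = 2x + 1, the noise level, and the 40/10 split are all assumptions chosen for illustration; in a real problem the true mapping is unknown and only the labeled samples are available.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_mapping(x):
    # Hypothetical ground-truth function, assumed for this illustration only
    return 2.0 * x + 1.0

# Labeled dataset: inputs x and noisy outputs y
x = rng.uniform(0, 10, size=50)
y = true_mapping(x) + rng.normal(0, 0.1, size=50)

# Hold out the last 10 samples to play the role of "new, unseen data"
x_train, y_train = x[:40], y[:40]
x_unseen, y_unseen = x[40:], y[40:]

# Approximate the mapping with a straight line fitted to the training data
slope, intercept = np.polyfit(x_train, y_train, deg=1)

# Measure how well the learned mapping predicts the held-out outputs
y_pred = slope * x_unseen + intercept
unseen_mse = np.mean((y_pred - y_unseen) ** 2)

print(f'Learned mapping: y = {slope:.2f} * x + {intercept:.2f}')
print(f'Mean squared error on unseen data: {unseen_mse:.4f}')
```

The learned slope and intercept should land close to 2 and 1, and the error on the held-out samples indicates how well the approximation generalizes beyond the data it was fitted on.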

Types of Supervised Learning Problems

Regression: Predicting Continuous Values

Regression tasks involve predicting outcomes that fall within a continuous range. Examples include predicting house prices, stock prices, or temperatures.

Example: Linear Regression with Scikit-learn

from sklearn.linear_model import LinearRegression
import numpy as np

# Sample data: Features (X) and target (y)
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 3.1, 4.8, 6.2, 7.9])

# Create and train the model
model = LinearRegression()
model.fit(X, y)

# Make a prediction
prediction = model.predict([[6]])
print(f'Predicted value: {prediction[0]:.2f}')

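In this example, linear regression fits a straight line to the five data points and uses it to predict a continuous value for an input of 6. For this data the fitted line has a slope of about 1.59 and an intercept of about -0.07, so the script prints a predicted value of 9.47.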
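To show what fitting a line means numerically, the following sketch reproduces the same fit by hand with the ordinary least squares formulas for a single feature. It uses NumPy only and is an illustration, not a description of how Scikit-learn computes the fit internally.

```python
import numpy as np

# Same toy data as the Scikit-learn example above
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([1.5, 3.1, 4.8, 6.2, 7.9])

# Ordinary least squares for one feature:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Predict the target for a new input, x = 6
prediction = slope * 6 + intercept
print(f'slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}')
# prints: slope=1.59, intercept=-0.07, prediction=9.47
```

The hand-computed line should agree with the coefficients Scikit-learn exposes as model.coef_ and model.intercept_ after fitting on the same data.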

Classification: Categorizing Data

Classification tasks assign inputs to discrete categories or classes. Examples include spam detection, image recognition, and sentiment analysis.

Example: Logistic Regression for Binary Classification

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample binary classification data
X = [[1], [2], [3], [4], [5]]
y = [0, 0, 1, 1, 1]

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Evaluate the model
predictions = model.predict(X_test)
print(f'Accuracy: {accuracy_score(y_test, predictions):.2f}')

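This snippet classifies inputs into two categories using logistic regression, which, despite its name, is a classification model. Because the dataset has only five samples, test_size=0.2 holds out a single sample, so the reported accuracy can only be 0.00 or 1.00; a meaningful evaluation needs far more data.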
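Logistic regression produces a probability before it produces a class label. The sketch below, which fits on the full toy dataset rather than a train/test split (an assumption made to keep the example small), shows how the linear score, the sigmoid function, and the 0.5 threshold relate to the output of predict.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same toy data as the example above, used in full (no train/test split)
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# Linear score for each input: coef * x + intercept
scores = model.decision_function(X)

# The sigmoid squashes each score into a probability of class 1
sigmoid_probs = 1.0 / (1.0 + np.exp(-scores))

# predict_proba returns one column per class, ordered as in model.classes_
probs = model.predict_proba(X)

# Thresholding the class-1 probability at 0.5 reproduces predict()
labels = (probs[:, 1] > 0.5).astype(int)

for x_val, p, label in zip(X.ravel(), probs[:, 1], labels):
    print(f'x={x_val}: P(class 1)={p:.2f} -> predicted class {label}')
```

Inspecting the probabilities rather than only the labels shows how confident the model is for each input, which is useful when the cost of a wrong prediction differs between classes.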

Key Takeaways

Supervised learning encompasses two major paradigms: regression, which predicts continuous values, and classification, which assigns inputs to discrete categories. Both follow the same workflow of fitting a model to labeled data and evaluating its predictions on data it has not seen. Experiment with libraries like Scikit-learn to build and evaluate your own models.