Python Linear Regression Tutorial: From Scratch to scikit-learn Implementation

Linear regression predicts a continuous dependent variable from one or more independent variables by fitting a linear function to the data. It’s one of the simplest and most widely used algorithms for predictive analysis.

The example below implements linear regression in Python with scikit-learn, a popular machine learning library. It creates a simple synthetic dataset, trains a LinearRegression model, makes predictions on held-out data, and evaluates them.

# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

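# Generate a simple synthetic dataset with a linear relationship:
# y = 4 + 3x plus Gaussian noise (true intercept 4, true slope 3)
np.random.seed(0)  # Fix the seed so the results are reproducible
X = 2 * np.random.rand(100, 1)  # Feature: 100 values drawn uniformly from [0, 2)
y = 4 + 3 * X + np.random.randn(100, 1)  # Target: linear signal plus standard normal noise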

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Linear Regression model
model = LinearRegression()

# Train the model using the training data
model.fit(X_train, y_train)

# Make predictions using the testing data
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("Mean Squared Error (MSE):", mse)
print("R-squared (R2) Score:", r2)

# Plot the results
plt.scatter(X, y, color='blue', label='Data Points')
plt.plot(X_test, y_pred, color='red', linewidth=2, label='Regression Line')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.title('Linear Regression Example')
plt.show()

Output of the above program:

Mean Squared Error (MSE): 0.9177532469714291
R-squared (R2) Score: 0.6521157503858556
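
To make these two numbers less opaque, here is a minimal sketch of how they can be computed by hand with NumPy. The helper names `mse_manual` and `r2_manual` are made up for this illustration and are not part of scikit-learn. They follow the standard definitions, so for single-output data like the above they should agree with `mean_squared_error` and `r2_score`.

```python
import numpy as np

def mse_manual(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

def r2_manual(y_true, y_pred):
    """1 - SS_res / SS_tot: the share of variance in y_true explained by y_pred."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return float(1.0 - ss_res / ss_tot)

# Tiny hand-checkable example (illustrative values, not taken from the dataset above)
y_true_demo = [1.0, 2.0, 3.0]
y_pred_demo = [1.0, 2.0, 4.0]
print("MSE:", mse_manual(y_true_demo, y_pred_demo))
print("R2:", r2_manual(y_true_demo, y_pred_demo))
```

In the demo, the squared errors are 0, 0 and 1, so the MSE is 1/3. The total variation around the mean of 2 is 2, so R2 is 1 - 1/2 = 0.5.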

Explanation:

  1. Import necessary libraries:
    • numpy for numerical operations.
    • matplotlib.pyplot for plotting.
    • sklearn.model_selection.train_test_split for splitting the dataset into training and testing sets.
    • sklearn.linear_model.LinearRegression for creating the Linear Regression model.
    • sklearn.metrics for evaluating the model’s performance.
  2. Generate a simple dataset:
    • Create a synthetic dataset with numpy in which y = 4 + 3X plus random Gaussian noise.
    • X represents the feature and y represents the target variable.
  3. Split the dataset:
    • Use train_test_split to split the dataset into training and testing sets. 80% of the data is used for training and 20% for testing.
  4. Create and train the model:
    • Instantiate the LinearRegression model.
    • Fit the model using the training data.
  5. Make predictions:
    • Use the trained model to make predictions on the testing data.
  6. Evaluate the model:
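    • Calculate the Mean Squared Error (MSE), the average squared difference between predicted and actual values (lower is better), and the R-squared (R2) score, the proportion of variance in y that the model explains (1.0 is a perfect fit).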
  7. Plot the results:
    • Plot the original data points and the regression line to visualize the relationship.
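
For the "from scratch" side of the picture, the fitting step can also be sketched without scikit-learn. The block below is a minimal illustration, assuming a single feature and an ordinary least-squares fit solved with `np.linalg.lstsq`. The function names `fit_line` and `predict_line` are hypothetical helpers invented for this sketch. Unlike the scikit-learn example, it fits on all 100 points rather than only the training split, so the numbers it prints will not match the scikit-learn run exactly.

```python
import numpy as np

def fit_line(X, y):
    """Least-squares fit of y = intercept + slope * x for a single feature."""
    x = np.asarray(X, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    # Design matrix with a leading column of ones so the intercept is learned too
    A = np.column_stack([np.ones_like(x), x])
    # lstsq returns the theta that minimizes ||A @ theta - y||^2
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    intercept, slope = theta
    return float(intercept), float(slope)

def predict_line(X, intercept, slope):
    """Apply the fitted line to new feature values."""
    return intercept + slope * np.asarray(X, dtype=float).ravel()

# Same synthetic data as in the tutorial code (true intercept 4, true slope 3)
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

intercept, slope = fit_line(X, y)
print("Intercept:", intercept)
print("Slope:", slope)
print("Prediction at x = 1.0:", predict_line([1.0], intercept, slope)[0])
```

Using `lstsq` instead of explicitly inverting a matrix is a design choice for numerical stability. The estimates should land near 4 and 3, but not exactly on them because of the added noise.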

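Running this code prints the MSE and R2 score on the test set and plots the regression line over the original data points. Because the noise added to y has a variance of 1, an MSE close to 1 is about as low as any model can be expected to reach on this data, so the R2 of roughly 0.65 mostly reflects the noise level rather than a poor fit.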
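
Beyond the two scores, it can be useful to look at what the model actually learned. The sketch below rebuilds the same data and split, then reads the fitted parameters from the `intercept_` and `coef_` attributes of `LinearRegression`. The variable names `learned_intercept` and `learned_slope` are introduced here for illustration. Passing `y_train.ravel()` is a small deviation from the code above, chosen so that the attributes come back as a scalar and a 1-D array.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Recreate the dataset and split used in the tutorial
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# ravel() turns the (n, 1) target into a 1-D array, so intercept_ is a
# scalar and coef_ has shape (1,) (one entry per feature)
model = LinearRegression()
model.fit(X_train, y_train.ravel())

learned_intercept = float(model.intercept_)
learned_slope = float(model.coef_[0])

print("Learned intercept:", learned_intercept, "(true value used to generate data: 4)")
print("Learned slope:", learned_slope, "(true value used to generate data: 3)")
```

If the learned values sit close to 4 and 3, the model has recovered the relationship that generated the data, and the remaining error comes from the noise term.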

Ishan Dev Shukl
With 13+ years in SDET leadership, I drive quality and innovation through Test Strategies and Automation. I lead Testing Center of Excellence, ensuring high-quality products across Frontend, Backend, and App Testing. "Quality is in the details" defines my approach—creating seamless, impactful user experiences. I embrace challenges, learn from failure, and take risks to drive success.
