Evaluation metrics, also known as performance measures, are quantitative measurements used to assess the performance and quality of a model or algorithm on a particular problem. They provide a standardized way to evaluate and compare different models and algorithms against specific criteria.
Error Metrics
Error metrics focus on measuring the accuracy and magnitude of errors in the forecasted values when compared to the actual values. Read more about them here.
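The snippets below assume two array-like sequences, actual_values and predicted_values, holding the observed values and the model's forecasts. They are not defined in the original examples, so here is a purely illustrative setup you can swap for your own data:
# Illustrative data only; replace with your own series (assumption: NumPy arrays)
import numpy as np
actual_values = np.array([112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0])
predicted_values = np.array([110.0, 120.0, 130.0, 131.0, 123.0, 133.0, 145.0, 150.0])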
Mean Absolute Error (MAE)
Use MAE when you want an evaluation metric that represents the average absolute difference between predicted and actual values, providing a simple and interpretable measure of overall model performance.
# Install library
!pip install scikit-learn
# Import function
from sklearn.metrics import mean_absolute_error
# Calculate MAE
mae = mean_absolute_error(actual_values, predicted_values)
print("Mean Absolute Error:", mae)
Mean Squared Error (MSE)
Consider using MSE when you want to penalize larger prediction errors more than smaller ones, making it suitable for applications where outliers or extreme values need to be emphasized in the evaluation.
# Install library
!pip install scikit-learn
# Import function
from sklearn.metrics import mean_squared_error
# Calculate MSE
mse = mean_squared_error(actual_values, predicted_values)
print("Mean Squared Error:", mse)
Root Mean Squared Error (RMSE)
Opt for RMSE when you want the advantages of MSE expressed in the same units as the target variable, which makes the magnitude of prediction errors easier to interpret on the original scale of the data.
# Install library
!pip install scikit-learn
# Import libraries
from sklearn.metrics import mean_squared_error
import numpy as np
# Calculate MSE
mse = mean_squared_error(actual_values, predicted_values)
# Calculate RMSE
rmse = np.sqrt(mse)
print("RMSE:", rmse)
Mean Absolute Percentage Error (MAPE)
Use MAPE when you need an evaluation metric that expresses prediction errors as a percentage of the actual values, which is helpful for understanding the relative performance across different datasets.
# Install library
!pip install scikit-learn
# Import function (requires scikit-learn >= 0.24)
from sklearn.metrics import mean_absolute_percentage_error
# Calculate MAPE (scikit-learn returns a fraction, so multiply by 100 for a percentage)
mape = mean_absolute_percentage_error(actual_values, predicted_values) * 100
print("MAPE:", mape, "%")
Symmetric Mean Absolute Percentage Error (SMAPE)
Consider using SMAPE when you want a percentage-based evaluation metric that mitigates division-by-zero issues (the denominator is zero only when both the actual and the predicted value are zero) and provides a symmetric measure of prediction accuracy.
# Import library
import numpy as np
# Define function
def sym_mean_absolute_percentage_error(actual, predicted):
    """
    Calculate SMAPE (Symmetric Mean Absolute Percentage Error).
    """
    # Work on NumPy arrays so the element-wise operations below are valid
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return 2 * np.mean(np.abs(actual - predicted) / (np.abs(actual) + np.abs(predicted))) * 100
# Calculate SMAPE
smape = sym_mean_absolute_percentage_error(actual_values, predicted_values)
print("SMAPE:", smape, "%")
Mean Absolute Scaled Error (MASE)
Use MASE when you have time series with different scales or seasonal patterns and want to compare the accuracy of forecasting models: it scales the model's mean absolute error by the mean absolute error of a naive, non-seasonal forecast, which makes forecast accuracy comparable across datasets.
# Install library
!pip install scikit-learn
# Import libraries
from sklearn.metrics import mean_absolute_error
import numpy as np
# Define function
def mean_absolute_scaled_error(actual, predicted):
    """
    Calculate MASE (Mean Absolute Scaled Error).
    """
    actual = np.asarray(actual)
    # MAE of the model's forecasts
    mae = mean_absolute_error(actual, predicted)
    # In-sample MAE of a naive forecast that repeats the previous observation
    naive_error = np.mean(np.abs(actual[1:] - actual[:-1]))
    return mae / naive_error
# Calculate MASE
mase = mean_absolute_scaled_error(actual_values, predicted_values)
print("MASE:", mase)
The following table summarizes these error metrics:
|  | MAE | MSE | RMSE | MAPE | SMAPE | MASE |
|---|---|---|---|---|---|---|
| Simplicity | High | Medium | Medium | High | Medium | Medium |
| Interpretability | High | Low | High | High | High | Low |
| Scale-invariant | No | No | No | No | No | Yes |
| Penalizes large errors | No | Yes | Yes | No | No | No |
| Outlier sensitivity | Medium | High | High | Medium | Medium | Medium |
| Symmetric | Yes | Yes | Yes | No | Yes | Yes |
| Original scale | Yes | No | Yes | No | No | No |
| Sensitive to division by 0 | No | No | No | Yes | Yes | Yes |
Performance Metrics
Performance metrics go beyond conventional error-magnitude measurements and assess other aspects of forecasting models, making them a natural complement to the error metrics. Read more about them here.
Forecast Bias
Use this metric when you want to assess the systematic overestimation or underestimation of your forecasted values compared to the actual data, helping you identify consistent errors in your predictions.
# Calculate forecast differences
differences = [predicted - actual for predicted, actual in zip(predicted_values, actual_values)]
# Calculate forecast bias
bias = sum(differences) / len(differences)
print("Forecast Bias:", bias)
Forecast Interval Coverage (FIC)
Utilize this metric when you need to evaluate the accuracy of your prediction intervals around the forecasted values, i.e. how well they capture the true variability and uncertainty in future observations.
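The snippet below expects a variable forecasted_intervals containing one (lower, upper) pair per forecast. It is not defined in the original example, so here is a minimal, hypothetical way to construct such intervals, assuming roughly Gaussian forecast errors with a known standard deviation sigma:
# Hypothetical 95% prediction intervals around each point forecast (assumption: Gaussian errors, known sigma)
sigma = 2.5  # hypothetical residual standard deviation; in practice, estimate it from your model's residuals
forecasted_intervals = [(p - 1.96 * sigma, p + 1.96 * sigma) for p in predicted_values]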
def calculate_coverage(forecasted_intervals, actual_values):
    # Count the actual values that fall within their forecast intervals
    num_within_interval = sum((lower <= actual <= upper) for actual, (lower, upper) in zip(actual_values, forecasted_intervals))
    # Calculate the total number of observations
    total_observations = len(actual_values)
    # Calculate the coverage (FIC) as a percentage
    fic = num_within_interval / total_observations * 100
    return fic
# Calculate FIC
fic = calculate_coverage(forecasted_intervals, actual_values)
print("Forecast Interval Coverage:", fic, '%')
Prediction Direction Accuracy (PDA)
Consider using this metric when your focus is on the correct direction of predictions (e.g., up or down) rather than the specific magnitude, which is particularly relevant in financial and binary forecasting applications.
def calculate_pda(predicted_values, actual_values):
    # Initialize the count of correctly predicted directions
    correct_directions = 0
    # Iterate over consecutive pairs of values
    for i in range(1, len(predicted_values)):
        # Calculate the predicted and actual changes
        pred_change = predicted_values[i] - predicted_values[i - 1]
        actual_change = actual_values[i] - actual_values[i - 1]
        # Check whether the predicted direction matches the actual direction
        if (pred_change > 0 and actual_change > 0) or (pred_change < 0 and actual_change < 0):
            correct_directions += 1
    # Calculate PDA as a percentage of the compared steps
    pda = (correct_directions / (len(predicted_values) - 1)) * 100
    return pda
# Calculate PDA
pda = calculate_pda(predicted_values, actual_values)
print("Prediction Direction Accuracy:", pda, '%')