## Date Manipulation in Python for Time Series II

The previous article introduced the importance of correctly handling dates when working with time series data. Python offers many tools for this, covering a range of use cases. This article continues from the previous one and explores more advanced techniques and tools for manipulating dates Read more…

## Date Manipulation in Python for Time Series

Times and dates are a key component of time series data, and Python offers robust tools for manipulating them. This article provides a basic exploration of the tools available for that purpose, such as indexing, frequency adjustments, date parsing, and more. Essential Libraries Let’s import the Read more…

## Feature Importance in Machine Learning

Machine learning models often operate in complex data environments where understanding the contribution of each feature to the model’s predictions is crucial. Determining feature importance is a key aspect of model interpretation, enabling us to grasp which factors significantly influence the model’s output. Let’s now explore different methods to determine Read more…

## Stationarity in Time Series and how to check it

In time series analysis, a series is said to be stationary if its statistical properties, such as mean, variance, and autocorrelation, remain constant over time. This means that the properties are the same no matter when you observe the series. There are Read more…
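As a rough illustration of the definition above, the sketch below splits a series into consecutive chunks and compares the mean and variance of each chunk. This is only an informal check, not a formal statistical test and not necessarily the approach the article takes; the helper name `split_stats` and the synthetic series are assumptions made for this example.

```python
import random
import statistics


def split_stats(series, n_chunks=4):
    """Return (mean, variance) for each of n_chunks consecutive chunks.

    Hypothetical helper for illustration: if the series is stationary,
    the per-chunk values should be roughly similar.
    """
    size = len(series) // n_chunks
    chunks = [series[i * size:(i + 1) * size] for i in range(n_chunks)]
    return [(statistics.fmean(c), statistics.pvariance(c)) for c in chunks]


random.seed(0)

# White noise: mean and variance do not depend on time.
noise = [random.gauss(0, 1) for _ in range(400)]

# Same noise plus a linear trend: the mean drifts upward over time.
trended = [0.05 * t + e for t, e in enumerate(noise)]

noise_means = [mean for mean, _ in split_stats(noise)]
trend_means = [mean for mean, _ in split_stats(trended)]

print("noise chunk means:  ", [round(m, 2) for m in noise_means])
print("trended chunk means:", [round(m, 2) for m in trend_means])
```

The chunk means of the noise series stay close to each other, while those of the trended series grow steadily, which is the kind of behaviour that rules out stationarity.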

## Time Series Forecasting with STL

STL stands for “Seasonal and Trend decomposition using LOESS”. It is a versatile and robust method for decomposing time series. This method decomposes a time series into its three main components: Classical decomposition methods vs. STL Before carrying on with STL, let’s introduce traditional methods. Here you have an example Read more…

## Normal distribution: identifying and handling outliers

Outliers are data points that significantly differ from the rest of the data in a dataset. They are observations that lie at an abnormal distance from other values in a random sample from a population. Identifying and handling outliers is crucial because they can skew results and impact the performance Read more…
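One common way to flag such points when the data is roughly normal is the z-score rule: mark values that lie more than a chosen number of standard deviations from the mean. The sketch below is a minimal version of that idea; the function name `zscore_outliers`, the threshold of 3, and the sample data are assumptions for illustration, not taken from the article.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold.

    Hypothetical helper: uses the population standard deviation and
    assumes the data is approximately normally distributed.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mean) / std > threshold]


# Made-up measurements clustered around 10, with one extreme value.
data = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.4, 9.6, 10.0, 10.1, 9.9, 25.0]

outliers = zscore_outliers(data)
print(outliers)
```

Because the extreme value itself inflates the mean and standard deviation, this rule can miss outliers in very small samples, so the threshold is a judgment call rather than a fixed constant.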

## Forecast the popularity of YouTube searches with SARIMA

Numerous countries across the globe are gearing up for Christmas celebrations, and what better way to celebrate than with a festive Data Science project? Let’s forecast the popularity of YouTube searches for Mariah Carey’s “All I Want for Christmas” in the upcoming weeks. We can get the data from Google Trends. Read more…

## Normal distribution: scaling and missing values

A normal distribution, also known as a Gaussian distribution, is a continuous probability distribution with a symmetric, bell-shaped curve. The following set of statistical properties characterizes it: The normal distribution is a fundamental concept in statistics and probability theory, and it is widely used in various fields Read more…

## Introduction to Decision Trees

Decision Trees are a fundamental model in machine learning used for both classification and regression tasks. They are structured like a tree, with each internal node representing a test on an attribute (decision nodes), branches representing outcomes of the test, and leaf nodes indicating class labels or continuous values. Decision Read more…
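The structure described above can be sketched with two small classes: decision nodes that test one attribute against a threshold, and leaf nodes that hold the prediction. The tree below is built by hand purely for illustration; the class names, the features (`temperature`, `humidity`), and the thresholds are all made up, and no learning algorithm is involved.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Leaf node: holds a class label (or a continuous value)."""
    prediction: str


@dataclass
class DecisionNode:
    """Internal node: tests one attribute against a threshold."""
    feature: str
    threshold: float
    left: Union["DecisionNode", Leaf]   # branch taken when value <= threshold
    right: Union["DecisionNode", Leaf]  # branch taken when value > threshold


def predict(node, sample):
    """Follow the branches from the root until a leaf is reached."""
    while isinstance(node, DecisionNode):
        if sample[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.prediction


# Hand-built toy tree with hypothetical features and thresholds.
tree = DecisionNode(
    feature="temperature",
    threshold=20.0,
    left=Leaf("stay in"),
    right=DecisionNode(
        feature="humidity",
        threshold=70.0,
        left=Leaf("go out"),
        right=Leaf("stay in"),
    ),
)

print(predict(tree, {"temperature": 25.0, "humidity": 50.0}))
```

In practice a library learns the features and thresholds from data; the point here is only how a prediction walks from the root through decision nodes to a leaf.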

## Extend the possibilities of AWS Lambda with layers

Imagine we want to use pandas or numpy in our AWS Lambda function. However, they are not available by default, and we can’t run pip install there to install them… What do we do? The solution is to define a layer. But what exactly is a layer? In AWS Lambda, Read more…