# Time series analysis in machine learning

While machine learning has become more popular in recent years, there remains one significant stumbling block for the most common algorithms: time series analysis. Most types of machine learning specialize in finding an outcome or attribute associated with a fixed variable — with logic derived from other similar attribute-variable relationships. Machine translation is a good example of this, finding the equivalent for a term in another language by using information garnered from other similar translations.

When working with trend and time-based information, a machine learning algorithm has to contend with a distinct set of variables and the new problems associated with them. Overcoming these difficulties requires different types of algorithms. In this blog, we discuss time series analysis and the ways that machine learning is beginning to overcome its challenges.

## What is a time series?

A time series is a sequence of data in chronological order, with each datapoint attributed to a specific point in time. The simplest example would be temperature over time, which varies with the seasons. Predicting future values in these datasets, known as time series forecasting, is an important objective that machine learning aims to fulfill. This has several real-world applications, including weather forecasting, earthquake prediction, statistics, and, perhaps most useful for business, market forecasting.

When it comes to time series analysis, there are two main types of datasets: univariate and multivariate.

- A **univariate time series** involves time and one other variable, for example temperature. An analysis of the temperature at each time point can give a clue as to what the result will be in future.

- A **multivariate time series** accounts for other factors that may influence that variable, for instance wind speed, pressure, and rainfall.

In practice, multivariate time series are used more often than univariate. This is because time is rarely the only factor influencing a variable.
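To make the distinction concrete, here is a minimal sketch of how the two kinds of dataset might be laid out in Python. The readings, dates, and variable names are invented for illustration and do not come from a real dataset.

```python
from datetime import date, timedelta

# Hypothetical daily readings; the numbers are made up for illustration.
start = date(2024, 1, 1)
temperature = [3.1, 2.8, 4.0, 5.2, 4.7]
wind_speed = [12.0, 15.5, 9.0, 7.5, 10.0]
pressure = [1012.0, 1009.5, 1015.0, 1018.0, 1016.5]

# Univariate: each time point maps to a single observed value.
univariate = {start + timedelta(days=i): t for i, t in enumerate(temperature)}

# Multivariate: each time point maps to several variables observed together.
multivariate = {
    start + timedelta(days=i): {"temperature": t, "wind_speed": w, "pressure": p}
    for i, (t, w, p) in enumerate(zip(temperature, wind_speed, pressure))
}

for day in sorted(multivariate):
    print(day, univariate[day], multivariate[day])
```

In both cases the time stamp is the key, and only the number of values attached to each time point changes.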

## The different types of series

The results gained from both types of time series analysis can be grouped into four categories, based on the trends that they demonstrate over time.

### Trend

This occurs when data generally moves in one direction over a given period, either increasing or decreasing, such as inflation.

### Seasonal

Seasonal data occurs when fluctuations and patterns in data tend to correspond to seasons, such as temperature in summer or consumer demand for gifts during the holidays.

### Cyclical

A cyclical time series occurs when rises and falls recur, but with no fixed period. House buying trends are an example of this, since they tend to ebb and flow over long periods without following any particular season or calendar interval.

### Irregular

An irregular time series demonstrates no clear pattern over any time period. These datasets are the most difficult to analyze and predict.

By understanding the different types of time series datasets, we can understand more about how machine learning seeks to analyze this information and predict future values.
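In practice a single dataset often mixes several of these behaviors. The sketch below is a rough illustration under assumed values: it builds a synthetic series from a trend, a seasonal swing, and irregular noise, then uses a moving average over one seasonal period to recover an estimate of the trend. The period of 12, the slope, and the noise level are all assumptions chosen for the demo, and the cyclical component is left out for simplicity.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

PERIOD = 12       # assumed seasonal period, e.g. 12 monthly readings per year
N = 10 * PERIOD   # ten "years" of data

# Steady upward drift.
trend = [0.05 * t for t in range(N)]
# Pattern that repeats every PERIOD steps.
seasonal = [3.0 * math.sin(2 * math.pi * t / PERIOD) for t in range(N)]
# Unpredictable noise with no pattern.
irregular = [random.gauss(0.0, 0.5) for _ in range(N)]

series = [a + b + c for a, b, c in zip(trend, seasonal, irregular)]


def moving_average(values, window):
    """Average over each full window of consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Averaging over exactly one seasonal period cancels the seasonal swing,
# leaving an estimate of the trend plus some smoothed noise.
trend_estimate = moving_average(series, PERIOD)
```

The window length matters here: averaging over anything other than a whole number of seasonal periods would leave part of the seasonal pattern in the result.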

## The challenge of time series analysis

On the surface, it may seem as if time is just another type of variable that machine learning algorithms can adapt to. In reality, several features of time series analysis make traditional machine learning algorithms difficult to apply. Let’s continue with the temperature example. Difficulties could include:

- A limited data set. If you create a datapoint for each day, week, or month, you’d have to have several years’ worth of data before you can create the thousands of datapoints that traditional machine learning needs to be most effective. Much of the time, this amount of data isn’t available, or becomes irrelevant after a certain point.

- Shifting influences. Some variables, such as wind speed or pressure, might contribute more to temperature than others, and their relative importance can switch over time, but we don’t know to what extent each is a contributing factor.

- In a time series, there’s a much higher likelihood of “black swan” events occurring: unpredictable outliers that have a significant impact on results. Time series forecasting tends to perform badly when navigating these events.

- Predicting the future requires an element of probability, which many traditional machine learning algorithms are not designed to include. They typically return a single best guess rather than a range of likely outcomes.
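To illustrate the last point, the sketch below attaches a rough range to a very simple forecast instead of returning a single number. It is only one possible approach and rests on stated assumptions: the temperatures are made up, the forecast is the naive "tomorrow looks like today" rule, and the errors are assumed to be roughly normal and to keep behaving as they did in the past.

```python
import statistics

# Hypothetical daily temperatures in degrees Celsius (made-up values).
history = [14.2, 15.1, 13.8, 16.0, 15.4, 14.9, 16.3, 15.8, 14.5, 15.0, 16.1, 15.6]

# Naive forecast: tomorrow looks like today.
point_forecast = history[-1]

# How wrong would that rule have been on each past day?
errors = [today - yesterday for yesterday, today in zip(history, history[1:])]
spread = statistics.stdev(errors)

# Rough 95% interval, assuming roughly normal errors (an assumption, not a given).
lower = point_forecast - 1.96 * spread
upper = point_forecast + 1.96 * spread

print(f"forecast {point_forecast:.1f}, likely range {lower:.1f} to {upper:.1f}")
```

With only a dozen datapoints the interval is itself uncertain, which echoes the limited-data problem from the first point in the list.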

These challenges in time series analysis are not problems that can be overcome with a simple tweak to an algorithm. To an extent, forecasting will always involve guesswork; no algorithm can predict the future with 100% accuracy. However, the technology is improving with the advent of more sophisticated algorithms designed to negotiate these challenges.

Vector autoregression (VAR), for instance, can be useful when analyzing the effect of variables that influence each other, such as wind speed and pressure, both of which then affect temperature. Machine learning powered by Bayesian statistical inference can also learn to navigate the difficulties of outlier information and create probability-based outcomes. These are just two of the ways that time series analysis and forecasting are becoming easier and more reliable.
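As a rough sketch of the idea behind VAR, the example below simulates two variables that each depend on the previous value of both, then estimates that dependence by least squares. It is a toy first-order model with two variables and no intercept. The coefficient matrix `A_TRUE` and the helper names `simulate` and `fit_var1` are assumptions made up for this demo, and a real project would more likely rely on a dedicated statistics library than hand-written formulas.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Assumed "true" coefficients, chosen only for this demo.
# Row i describes how variable i depends on the previous value of both variables.
A_TRUE = [[0.6, 0.2],
          [-0.3, 0.5]]


def simulate(n):
    """Generate n steps of x_t = A_TRUE * x_(t-1) + noise."""
    x = [0.0, 0.0]
    out = []
    for _ in range(n):
        # The right-hand side is built from the old x before x is reassigned.
        x = [A_TRUE[i][0] * x[0] + A_TRUE[i][1] * x[1] + random.gauss(0, 1)
             for i in range(2)]
        out.append(x)
    return out


def fit_var1(data):
    """Least-squares estimate of the 2x2 coefficient matrix."""
    prev, curr = data[:-1], data[1:]
    s11 = sum(p[0] * p[0] for p in prev)
    s12 = sum(p[0] * p[1] for p in prev)
    s22 = sum(p[1] * p[1] for p in prev)
    det = s11 * s22 - s12 * s12
    rows = []
    for i in range(2):
        s1y = sum(p[0] * c[i] for p, c in zip(prev, curr))
        s2y = sum(p[1] * c[i] for p, c in zip(prev, curr))
        # Solve the 2x2 normal equations with Cramer's rule.
        rows.append([(s1y * s22 - s2y * s12) / det,
                     (s11 * s2y - s12 * s1y) / det])
    return rows


data = simulate(5000)
A_hat = fit_var1(data)

# One-step-ahead forecast from the latest observation.
last = data[-1]
forecast = [A_hat[i][0] * last[0] + A_hat[i][1] * last[1] for i in range(2)]
```

The off-diagonal entries of the estimated matrix are what capture the cross-influence: they describe how much one variable's past value feeds into the other's next value.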

## The rise of time series analysis

Due to these improvements, time series analysis is steadily growing more widespread. Financial analysis, weather forecasting, yield projections, customer behavior predictions, and much more are improving because of the ability to better understand the relationship between variables that contribute to a particular outcome.

As time series forecasting technology continues to improve, its potential to improve life for businesses and individuals everywhere grows rapidly. From checking the weather in the morning to better predicting your yearly sales, what is still seen as a fledgling technology is fast becoming a vital part of our everyday lives.