New Series of Time Series: Part 1


In this new series I am going to look at time series models and how we can use them to forecast data.

The first requirement for time series analysis is a series of observations recorded at regular intervals, for example every year, quarter, or month.

Imagine that we have time series data on the number of births per month in New York, from January 1946 to December 1959 (file http://robjhyndman.com/tsdldata/data/nybirths.dat).

1- Read Data

This data holds the number of births in each month; however, it does not include the dates.

We are going to read the data from a URL. Because the data is in plain-text “dat” format, we can read it with the “scan” function, passing the URL as its argument.
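As a minimal sketch of how scan behaves, the snippet below reads a few whitespace-separated numbers from an inline string. The sample values and the variable names (sample_text, births_sample, births) are my own illustration, not the real file contents; the commented line shows how the same call would point at the URL above.

```r
# Illustrative values only, standing in for the contents of nybirths.dat.
sample_text <- "26.1 23.5 26.9
24.7 25.8 24.3"

# scan() reads whitespace-separated numbers into a numeric vector.
births_sample <- scan(text = sample_text)
length(births_sample)

# With network access, the same function can read the real file directly:
# births <- scan("http://robjhyndman.com/tsdldata/data/nybirths.dat")
```

The result is a plain numeric vector with no dates attached, which is why the next step is needed.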

Running this code in RStudio prints “Read 168 items”, so 168 values have been read.

2- Convert to Time Series Object

To work with time series data, we should first convert it into a time series object using the “ts” function in R (note the lower case, as R is case sensitive). The ts function takes a numeric vector and converts it into an R time series object. In this example, the “births” variable holds only the values, without any dates, so we also need to supply date information such as the start and end of the series.

The ts function also takes inputs such as “frequency”, “start”, and “end”: ts(data, start, end, frequency)

Frequency: as its name suggests, this is the number of observations per unit of time. For monthly data within a year we set the value to 12; for quarterly data we set it to 4.

For instance, for the number of births in New York, we convert the data into a time series by calling ts with births as the data, a frequency of 12, and a start of c(1946, 1).

Here births is the data that we collected, and the frequency is 12 because each year has 12 months, from January to December. The data starts in January 1946, so we pass c(1946, 1) as the vector for the start date.
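The conversion described above might look like the sketch below. To keep it self-contained, births is filled with 168 made-up values rather than the real file, and the name birthstimeseries is just my choice for the result.

```r
# Placeholder data: 168 made-up monthly values standing in for the real births vector.
set.seed(1)
births <- round(rnorm(168, mean = 25, sd = 2), 3)

# Monthly series (frequency = 12) starting in January 1946.
birthstimeseries <- ts(births, frequency = 12, start = c(1946, 1))

start(birthstimeseries)      # first period: 1946, month 1
end(birthstimeseries)        # last period: 1959, month 12
frequency(birthstimeseries)  # 12
```

Because 168 monthly values cover exactly 14 years, we do not need to pass an end date here; R works it out from the start and the frequency.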

Now I use the “plot” function to draw the time series data.
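A small sketch of the plotting step, again with made-up placeholder values instead of the real births data. The axis labels and title are my own additions.

```r
# Placeholder data standing in for the real births vector.
set.seed(1)
births <- rnorm(168, mean = 25, sd = 2)
birthstimeseries <- ts(births, frequency = 12, start = c(1946, 1))

# plot() on a ts object draws a line chart with the time index on the x-axis.
plot(birthstimeseries, xlab = "Year", ylab = "Births", main = "Monthly births")
```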

The chart looks like this:

[Figure: time series plot of the births data]

 

Before analysing the chart, let's look at another example.

I am going to look at monthly milk production data from 1962 to 1975 and draw a time series plot for it.

First, I import the data into RStudio.
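I do not reproduce the exact import here, so the following is only a sketch, assuming the data arrives as a CSV with one row per month. The column names Month and Milk and the three sample rows are hypothetical, and the inline text stands in for a real file.

```r
# Hypothetical CSV layout with illustrative values; a real file would be
# read with read.csv("path/to/file.csv") instead of the text argument.
csv_text <- "Month,Milk
1962-01,589
1962-02,561
1962-03,640"

milk_data <- read.csv(text = csv_text)

# Keep only the numeric column for the time series.
milk <- milk_data$Milk
str(milk_data)
```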

Then we convert the values into a time series object and plot it.
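A sketch of this step, assuming the milk values are already in a numeric vector. Here milk is a synthetic series (upward trend plus a yearly cycle) that I made up so the example runs on its own.

```r
# Synthetic stand-in for the milk data: 168 monthly values, made up.
set.seed(42)
months <- 1:168
milk <- 600 + 1.5 * months + 80 * sin(2 * pi * months / 12) + rnorm(168, sd = 10)

# Without frequency or start, ts() just numbers the observations 1, 2, 3, ...
milk_ts_plain <- ts(milk)
frequency(milk_ts_plain)  # 1
plot(milk_ts_plain)
```

With the defaults, the x-axis runs from 1 to 168 instead of showing years.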

The result is as below:

[Figure: time series plot of the milk data without dates]

 

However, the time axis is not correct, so I add the frequency and the start and end times to the data.
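Adding the time information might look like this sketch. The milk vector is again synthetic, and milk_ts is my own name for the result.

```r
# Synthetic stand-in for the milk data: 168 monthly values, made up.
set.seed(42)
months <- 1:168
milk <- 600 + 1.5 * months + 80 * sin(2 * pi * months / 12) + rnorm(168, sd = 10)

# Monthly data from January 1962 to December 1975 (14 years x 12 months = 168).
milk_ts <- ts(milk, frequency = 12, start = c(1962, 1), end = c(1975, 12))

start(milk_ts)
end(milk_ts)
plot(milk_ts, xlab = "Year", ylab = "Milk production")
```

The start and end together must span the same number of periods as there are values; otherwise ts will truncate or recycle the data.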

So the chart now looks like this:

[Figure: time series plot of the milk data with dates on the time axis]

3- Time Series Components

As you can see in the charts above, a single time series plot shows several different things at once:

1- Trend

2- Seasonality

3- Irregular component

Trend

A trend is a “long-term increase or decrease in the data”. For instance, in the milk production chart we can see an increasing trend in production.

Seasonality

A seasonal pattern occurs when the data is influenced by seasonal factors, such as the time of year, that repeat with a fixed period. For instance, in the picture above, milk production is high in the second quarter of every year and lowest in the third quarter (see below), and this pattern is the same in all years.

[Figure: quarter]

[Figure: milk2]
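One way to check a quarterly pattern like this numerically is to average the series by quarter. This is only a sketch on synthetic data that I made up, so the numbers it prints will not match the real milk series.

```r
# Synthetic monthly series with an upward trend and a yearly cycle (made up).
set.seed(42)
months <- 1:168
milk <- 600 + 1.5 * months + 80 * sin(2 * pi * months / 12) + rnorm(168, sd = 10)
milk_ts <- ts(milk, frequency = 12, start = c(1962, 1))

# Average the three monthly values in each quarter (frequency 12 -> 4).
milk_quarterly <- aggregate(milk_ts, nfrequency = 4, FUN = mean)

# Mean of each quarter position (1 to 4) across all years.
quarter_means <- tapply(as.numeric(milk_quarterly),
                        as.integer(cycle(milk_quarterly)), mean)
quarter_means
```

If the same quarters come out highest and lowest year after year, that is a sign of seasonality.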

Irregular component

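The irregular (random) component is what remains in the data once trend and seasonality are accounted for; it shows neither a trend nor a seasonal pattern.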

Combinations

In some time series we can see both seasonality and trend (see the picture below).

 

[Figure: trendseasonal]

We can decompose a series into these components:

Non-seasonal data: trend component + irregular component

Seasonal data: trend component + seasonal component + irregular component

The picture above has both trend and seasonality (the chart shows an increasing level and also a seasonal pattern).

So we can decompose it using a function named “decompose”.
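A minimal sketch of the decomposition, using a synthetic monthly series that I made up in place of the real milk data. decompose uses an additive model by default, so the observed series is split as trend + seasonal + random.

```r
# Synthetic monthly series with an upward trend and a yearly cycle (made up).
set.seed(42)
months <- 1:168
milk <- 600 + 1.5 * months + 80 * sin(2 * pi * months / 12) + rnorm(168, sd = 10)
milk_ts <- ts(milk, frequency = 12, start = c(1962, 1))

# Split the series into trend, seasonal and random parts (additive by default).
milk_components <- decompose(milk_ts)
names(milk_components)

# One panel each for observed, trend, seasonal and random.
plot(milk_components)
```

The trend is estimated with a moving average, so its first and last few values are NA. The series also needs a frequency greater than 1, which is why the ts conversion with frequency = 12 matters.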

We get the charts below:

[Figure: decompose]

As you can see in the picture above, the milk production series has three main parts: trend, seasonal, and random.

In the next part I will talk about time series models in more depth.

 

[1] Book: http://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html

[2] Data about the births: http://robjhyndman.com/tsdldata/data/nybirths.dat

[3] Data about the milk production: https://datamarket.com/data/set/22ox/monthly-milk-production-pounds-per-cow-jan-62-dec-75#!ds=22ox&display=line

[4] https://onlinecourses.science.psu.edu/stat510/?q=node/70

Leila Etaati

Dr. Leila Etaati is a Principal Data Scientist, BI Consultant, and Speaker. She has over 10 years’ experience working with databases and software systems, and has been involved in many large-scale projects for large companies. Leila has a PhD from the Information Systems department at the University of Auckland, and an MS and BS in computer science. Leila is a Microsoft Data Platform MVP.


2 Comments

  • Leila,
    I am really interested in playing with this, but I am not sure where to start. Where are you when you start the first step above? Are you just creating an R script? If so, where? Outside of Power BI? Warning: I know only a smidge of R
    PS – half your pictures in this post are not visible.

    • Sure, here are the steps.
      The first step is to visualize your data by converting it into a time series object, then check whether it has trend or seasonality. Also check the ACF and PACF charts to decide between exponential smoothing and ARIMA. You should definitely start with R scripts.
