Matlab Statistics and Machine Learning Toolbox Exploration
This past week I explored the Statistics and Machine Learning Toolbox in Matlab, applying basic statistics concepts I am learning at AACC. For example, I wrote a program that performs linear regression, calculating Pearson's product-moment correlation coefficient and the least-squares line. I spent a lot of time looking at correlations in data sets. There is a strong positive correlation between the price of the OPEC crude oil basket and the exchange rate of the Venezuelan bolívar. There was also a strong positive correlation between the exchange rates of the Singapore dollar and the Danish krone from 2017 to April 2019. In addition, I wrote a program that performs hypothesis testing. I compared two sets of hourly sea level measurements from Venice, Italy, covering roughly 1983-1998 and 1999-2015. Matlab processed over 300,000 data points in seconds. The conclusion was that there is a statistically significant difference between the two data sets. This is the expected result, and I would have been worried if the test had found no statistically significant difference. The benefit of confirming a conclusion we already expect is that it gives us confidence the program is working properly.
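The correlation and least-squares calculation described above can be sketched in a few lines. This is only an illustration, not the program from this post: the data values are made up, and the variable names are hypothetical. It uses `corrcoef`, `polyfit`, and `polyval`, which are part of base Matlab.

```matlab
% Synthetic paired data, invented for illustration (not the oil or currency series)
x = [1 2 3 4 5 6 7 8]';
y = [2.1 3.9 6.2 7.8 10.1 12.2 13.8 16.1]';

% Pearson's product-moment correlation coefficient
R = corrcoef(x, y);          % 2-by-2 correlation matrix
r = R(1, 2);                 % off-diagonal entry is the correlation of x and y

% Least-squares line y = slope*x + intercept
coeffs = polyfit(x, y, 1);   % degree-1 fit returns [slope, intercept]
slope = coeffs(1);
intercept = coeffs(2);

% Fitted values and coefficient of determination
yFit = polyval(coeffs, x);
r2 = 1 - sum((y - yFit).^2) / sum((y - mean(y)).^2);

fprintf('r = %.4f, slope = %.4f, intercept = %.4f, R^2 = %.4f\n', ...
    r, slope, intercept, r2);
```

For a straight-line fit with one predictor, `r2` should equal `r^2`, which makes a handy sanity check on the calculation.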
Matlab offers many different types of hypothesis tests.[1]
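As one example of such a test, here is a minimal sketch of a two-sample comparison using `ttest2` from the Statistics and Machine Learning Toolbox. The choice of test is an assumption made for illustration, since the post does not say which test was applied to the Venice data. The two samples are synthetic stand-ins generated with `randn`, and the names `early` and `late` are hypothetical.

```matlab
rng(0);  % fix the seed so the synthetic data is reproducible

% Synthetic stand-ins for two periods of measurements (not the Venice data)
early = 20 + 10 * randn(5000, 1);
late  = 25 + 10 * randn(5000, 1);   % same spread, shifted mean

% Two-sample t-test without assuming equal variances (Welch's test).
% Null hypothesis: the two samples come from populations with equal means.
[h, p, ci] = ttest2(early, late, 'Vartype', 'unequal', 'Alpha', 0.05);

% h = 1 means the null hypothesis is rejected at the 5% level
fprintf('h = %d, p = %.3g\n', h, p);
fprintf('95%% CI for the difference in means: [%.3f, %.3f]\n', ci(1), ci(2));
```

Note that `ttest2` assumes independent observations. Hourly sea level readings are likely autocorrelated, so with real data of that kind the p-value from this sketch should be read with caution.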
The documentation for the Statistics and Machine Learning Toolbox is robust.[2] I spent a lot of time this past week looking through data sets on Kaggle[3] and Quandl.[4] Moving forward, I would like to explore the nonlinear regression functions.[5]