Wavelet
Wavelet analysis.
13 Data Analysis and Statistics
Column-Oriented Data Sets
Univariate statistical data is typically stored in individual vectors. The vectors
can be either 1-by-n or n-by-1. For multivariate data, a matrix is the natural
representation, but there are, in principle, two possibilities for orientation. By
MATLAB convention, the different variables are put into columns, so that
observations vary down the rows. A data set consisting of twenty-four samples
of three variables is therefore stored in a matrix of size 24-by-3.
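As a minimal sketch of this convention, the following builds a small column-oriented data set from three column vectors. The variable names loc1, loc2, loc3, and data are illustrative assumptions, and only four observations are used to keep the example short.

```matlab
% Three illustrative variables, each observed four times (column vectors).
loc1 = [11; 7; 14; 11];
loc2 = [11; 13; 17; 13];
loc3 = [9; 11; 20; 9];

% Concatenate side by side: one column per variable, one row per observation.
data = [loc1 loc2 loc3];

% n is the number of observations (rows), p the number of variables (columns).
[n,p] = size(data)
```

Here size reports n = 4 and p = 3, so the matrix is 4-by-3.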
Vehicle Traffic Sample Data Set
Consider a sample data set comprising vehicle traffic count observations at
three locations over a 24-hour period.
Vehicle Traffic Sample Data Set
Time     Location 1   Location 2   Location 3
01h00        11           11            9
02h00         7           13           11
03h00        14           17           20
04h00        11           13            9
05h00        43           51           69
06h00        38           46           76
07h00        61          132          186
08h00        75          135          180
09h00        38           88          115
10h00        28           36           55
11h00        12           12           14
12h00        18           27           30
13h00        18           19           29
14h00        17           15           18
15h00        19           36           48
16h00        32           47           10
17h00        42           65           92
18h00        57           66          151
19h00        44           55           90
20h00       114          145          257
21h00        35           58           68
22h00        11           12           15
23h00        13            9           15
24h00        10            9            7
Loading and Plotting the Data
The raw data is stored in the file count.dat:
11 11 9
7 13 11
14 17 20
11 13 9
43 51 69
38 46 76
61 132 186
75 135 180
38 88 115
28 36 55
12 12 14
18 27 30
18 19 29
17 15 18
19 36 48
32 47 10
42 65 92
57 66 151
44 55 90
114 145 257
35 58 68
11 12 15
13 9 15
10 9 7
Use the load command to import the data.
load count.dat
This creates the matrix count in the workspace.
For this example, there are 24 observations of three variables. This is
confirmed by
[n,p] = size(count)
n =
24
p =
3
Create a time vector, t, of integers from 1 to n.
t = 1:n;
Now plot the counts versus time and annotate the plot.
set(0,'defaultaxeslinestyleorder','-|--|-.')
set(0,'defaultaxescolororder',[0 0 0])
plot(t,count), legend('Location 1','Location 2','Location 3',2)
xlabel('Time'), ylabel('Vehicle Count'), grid on
The plot shows the vehicle counts at three locations over a 24-hour period.
[Figure: Vehicle Count (0 to 300) plotted against Time (0 to 25), with one line each for Location 1, Location 2, and Location 3]
Basic Data Analysis Functions
This section introduces functions for:
• Basic column-oriented data analysis
• Computation of correlation coefficients and covariance
• Calculating finite differences
Function Summary
A collection of functions provides basic column-oriented data analysis
capabilities. These functions are located in the MATLAB datafun directory.
This section also gives you some hints about using row and column data, and
provides some basic examples. This table lists the functions.
Basic Data Analysis Function Summary
Function    Description
cumprod     Cumulative product of elements.
cumsum      Cumulative sum of elements.
cumtrapz    Cumulative trapezoidal numerical integration.
diff        Difference function and approximate derivative.
max         Largest component.
mean        Average or mean value.
median      Median value.
min         Smallest component.
prod        Product of elements.
sort        Sort in ascending order.
sortrows    Sort rows in ascending order.
std         Standard deviation.
sum         Sum of elements.
trapz       Trapezoidal numerical integration.
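As a brief illustration of a few of the functions in the table, the sketch below applies sum, cumsum, and diff to a small array. The array A is an illustrative assumption and is not taken from count.dat.

```matlab
A = [1 2; 3 4; 5 6];   % illustrative 3-by-2 array

s  = sum(A)      % sum of each column
cs = cumsum(A)   % running sum down each column
d  = diff(A)     % differences between successive rows of each column
```

Each result is computed column by column: s is 1-by-2, cs is 3-by-2, and d has one row fewer than A.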
For information about calculating the maximum, minimum, mean, median,
range, and standard deviation on plotted data, and creating plots of these
statistics, see “Adding Plots of Data Statistics to a Graph” in the MATLAB
graphics documentation.
Working with Row and Column Data
For vector input arguments to these functions, it does not matter whether the
vectors are oriented in row or column direction. For array arguments, however,
the functions operate column by column on the data in the array. This means,
for example, that if you apply max to an array, the result is a row vector
containing the maximum values over each column.
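The difference between vector and array arguments can be seen in a short sketch. The values of v and A are illustrative assumptions.

```matlab
v = [3 9 4];          % row vector
A = [3 9 4; 8 2 7];   % 2-by-3 array

m1 = max(v)    % scalar: orientation of a vector does not matter
m2 = max(v')   % same scalar for the column-vector form
m3 = max(A)    % row vector holding the maximum of each column
```

Both m1 and m2 are scalars, while m3 has one element per column of A.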
Note: You can add more functions to this list using M-files, but when doing so,
you must take care to handle the row-vector case. If you are writing your
own column-oriented M-files, see how existing M-files handle it; for example,
mean.m and diff.m.
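One possible way to handle the row-vector case is sketched below. The function name colrange is hypothetical, and this is only one approach under that assumption, not necessarily how mean.m or diff.m are written: a row vector is reshaped into a column so that it is treated as a single data set.

```matlab
function r = colrange(x)
% COLRANGE Hypothetical example: range (max minus min) of each column.
% A row vector is treated as a single data set rather than as
% several one-element columns.
if size(x,1) == 1
    x = x(:);           % reshape a row vector into a column vector
end
r = max(x) - min(x);    % column-by-column range
```

With this check, colrange([1 5 3]) returns a single value, while colrange([1 2; 4 8]) returns one value per column.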
Basic Examples
Continuing with the vehicle traffic count example, the statements
mx = max(count)
mu = mean(count)
sigma = std(count)
result in
mx =
114 145 257
mu =
32.0000 46.5417 65.5833
sigma =
25.3703 41.4057 68.0281
To locate the index at which the minimum or maximum occurs, specify a second
output argument. For example,
[mn,indx] = min(count)
mn =
7 9 7
indx =
2 23 24
shows that the lowest vehicle count is recorded at 02h00 for the first
observation point (column one), and at 23h00 and 24h00 for the second and
third observation points, respectively.
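When the smallest value occurs more than once in a column, min returns the index of its first occurrence. The sketch below shows this on a small array; the array A holds only the first four rows of the traffic data and is used here as an illustrative assumption.

```matlab
% First four observations of the traffic data (illustrative subset).
A = [11 11  9
      7 13 11
     14 17 20
     11 13  9];

% lo holds the column minima, idx the row where each first occurs.
[lo,idx] = min(A)
```

In the third column the value 9 appears in rows 1 and 4, and idx reports row 1.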
You can subtract the mean from each column of the data using an outer product
involving a vector of n ones.
[n,p] = size(count)
e = ones(n,1)
x = count - e*mu
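The outer product e*mu is an n-by-p matrix in which every row is a copy of mu, so subtracting it removes the column mean from every observation. A minimal sketch on a small array follows; A is an illustrative assumption, chosen so that the arithmetic is exact.

```matlab
A = [1 10; 2 20; 3 60];   % illustrative 3-by-2 data set

mu = mean(A);             % row vector of column means
e = ones(size(A,1),1);    % column vector of n ones
x = A - e*mu;             % e*mu repeats mu in every row

check = mean(x)           % column means of the centered data
```

For this array the centered columns have mean exactly zero; for general data the result is zero only up to roundoff.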
Rearranging the data may help you evaluate a vector function over an entire
data set. For example, to find the smallest value in the entire data set, use
min(count(:))
which produces
ans =
7
The syntax count(:) rearranges the 24-by-3 matrix into a 72-by-1 column
vector.
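The same rearrangement can be used to find where an extreme value sits in the original matrix. The sketch below is one way to do this, assuming the standard ind2sub function to convert the linear index back to row and column subscripts; the array A is an illustrative subset of the traffic data.

```matlab
% First four observations of the traffic data (illustrative subset).
A = [11 11  9
      7 13 11
     14 17 20
     11 13  9];

[m,k] = max(A(:));             % largest value and its linear index
[r,c] = ind2sub(size(A),k)     % convert to row and column subscripts
```

Because A(:) stacks the columns one after another, the linear index k counts down the first column, then the second, and so on.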
Covariance and Correlation Coefficients
MATLAB’s statistical capabilities include two functions for the computation of
correlation coefficients and covariance.
Covariance and Correlation Coefficient Function Summary
Function    Description
cov         Variance of vector – measure of spread or dispersion of sample variable.