3.1.2 Missing Data

It is possible to represent missing data explicitly in Octave using NA (short for “Not Available”). This is helpful in distinguishing between a property of the data (i.e., some of it was not recorded) and calculations on the data which generated an error (i.e., created NaN values). In short, if you do not get the result you expect, NA helps you decide whether the cause is your data or your algorithm.

The missing data marker is a special case of the representation of NaN. Because of that, it can only be used with data represented by floating point numbers—no integer, logical, or char values.
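As a small sketch of this restriction (the variable names are illustrative only), NA is a double by default, and isna reports no missing entries for integer data because integer types have no value reserved for NA:

```octave
# NA is a floating point value; by default it is a double.
x = NA;
x_class = class (x);               # "double"

# Integer arrays cannot hold NA, so isna is false for every element.
counts = int32 ([4, 0, 7]);
any_missing = any (isna (counts)); # false
```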

In general, use NA and the test isna to describe the dataset or to reduce the dataset to only valid entries. Numerical calculations with NA will generally "poison" the results and conclude with an output of NA. However, this cannot be guaranteed on all platforms, and NA may be replaced by NaN.
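The following sketch illustrates one way to check a result defensively given this platform caveat. It relies on the fact that isnan is true for both NA and NaN; the variable names are illustrative.

```octave
# Arithmetic involving NA never yields an ordinary number.
data = [1, NA, 3];
total = sum (data);

# isnan is true for both NA and NaN, so this check does not depend on
# whether the platform preserved the NA marker.
is_unusable = isnan (total);

# Whether the NA marker itself survives the calculation is platform
# dependent, so treat this value as informational only.
still_marked_na = isna (total);
```

For this reason it is usually safer to test the inputs with isna before a calculation than to test the outputs afterwards.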

Example 1: Describing the dataset

data = [1, NA, 3];
percent_missing = 100 * sum (isna (data(:))) / numel (data);
printf ('%2.0f%% of the dataset is missing\n', percent_missing);
-| 33% of the dataset is missing

Example 2: Restrict calculations to valid data

raw_data = [1, NA, 3];
printf ('mean of raw data is %.1f\n', mean (raw_data));
-| mean of raw data is NA
valid_data = raw_data (! isna (raw_data));
printf ('mean of valid data is %.1f\n', mean (valid_data));
-| mean of valid data is 2.0
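The same idea can be extended to a matrix, where simply indexing with the mask would flatten the data. The sketch below, using hypothetical sensor readings, computes a mean per column while ignoring missing entries:

```octave
# Hypothetical measurements: rows are samples, columns are sensors.
readings = [1, 10; NA, 20; 3, NA];

valid = ! isna (readings);   # logical mask of recorded entries
filled = readings;
filled(! valid) = 0;         # zero the missing entries so they do not affect the sum

# Divide each column sum by the number of recorded entries in that column.
col_means = sum (filled, 1) ./ sum (valid, 1);   # [2, 15]
```

A column with no recorded entries yields 0/0, that is NaN, which correctly signals that no mean could be computed.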
 
x = NA
x = NA (n)
x = NA (m, n, …)
x = NA ([m, n, …])
x = NA (…, class)
x = NA (…, "like", var)

Return a scalar, matrix, or N-dimensional array whose elements are all equal to the special constant NA (Not Available) used to designate missing values.

Note that NA always compares not equal to NA (NA != NA). To find NA values, use the isna function.
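A brief illustration of why equality tests cannot locate missing values (variable names are illustrative):

```octave
data = [1, NA, 3];

# Equality against NA is always false, so this mask finds nothing.
wrong_mask = (data == NA);   # [0, 0, 0]

# isna is the reliable test.
right_mask = isna (data);    # [0, 1, 0]
```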

If called with no arguments, return the scalar value NA.

If invoked with a single scalar integer argument n, return a square NxN matrix.

If invoked with two or more scalar integer arguments, or a vector of integer values, return an array with the given dimensions.

The optional argument class specifies the class of the return array. The only valid options are "double" (default) or "single".
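The calling forms described above can be sketched as follows. The "like" form is omitted here, and the variable names are illustrative.

```octave
a = NA (3);                # 3x3 matrix of NA
b = NA (2, 4);             # 2x4 matrix of NA
c = NA ([2, 3, 4]);        # 2x3x4 array of NA
d = NA (2, 2, "single");   # 2x2 single precision matrix of NA

all_missing = all (isna (c(:)));   # every element is NA
```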

Programming Note: The missing data marker NA is a special case of the representation of NaN. Numerical calculations with NA will generally "poison" the results and conclude with an output of NA. However, this cannot be guaranteed on all platforms, and NA may be replaced by NaN. See Missing Data.

See also: isna.

 
tf = isna (x)

Return a logical array which is true where the elements of x are NA (missing) values and false where they are not.

For example:

isna ([13, Inf, NA, NaN])
     ⇒  [ 0, 0, 1, 0 ]

See also: isnan, isinf, isfinite.
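Because NA is a special case of NaN, isnan is also true for NA values. The following sketch combines the two tests to separate missing data from NaN values produced by a calculation (variable names are illustrative):

```octave
values = [13, Inf, NA, NaN];

missing = isna (values);                          # [0, 0, 1, 0]
any_nan = isnan (values);                         # [0, 0, 1, 1]

# NaN values that are not missing-data markers.
computed_nan = isnan (values) & ! isna (values);  # [0, 0, 0, 1]
```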