changeset 31872:cfeda68b01ad

mean: Process single as double to avoid precision limits (bug #63848). * mean.m: Change sum calls not associated with large int handling codepaths to process as "double" to avoid single precision limits from producing erroneous output. Add BISTs to test calls to sum that could lose precision. Add xfail BIST for showing the same problem with large doubles. * NEWS.9.md: Update description of changes to mean.m.
author Nicholas R. Jankowski <jankowski.nicholas@gmail.com>
date Thu, 02 Mar 2023 10:57:03 -0500
parents 9546e993e9ee
children 5eb1f1a77614
files etc/NEWS.9.md scripts/statistics/mean.m
diffstat 2 files changed, 20 insertions(+), 7 deletions(-) [+]
line wrap: on
line diff
--- a/etc/NEWS.9.md	Wed Mar 01 22:16:34 2023 -0800
+++ b/etc/NEWS.9.md	Thu Mar 02 10:57:03 2023 -0500
@@ -34,8 +34,9 @@
 'all' (bug #58116), and 'vecdim' (bug #58089) options, preserve output class,
 and match expected output behavior for empty (bug #50583) and sparse inputs
 (bug #63291).  Both `median` and `mean` can handle large int values without
-overflow or precision concerns (bug #54567).  `median` also adopts the
-'outtype' option from `mean`.
+overflow or precision concerns (bug #54567), and `mean` avoids errors due to
+limits of single precision by processing as doubles (bug #63848).  `median`
+has also adopted the 'outtype' option from `mean`.
 
 ### Alphabetical list of new functions added in Octave 9
 
--- a/scripts/statistics/mean.m	Wed Mar 01 22:16:34 2023 -0800
+++ b/scripts/statistics/mean.m	Thu Mar 02 10:57:03 2023 -0500
@@ -194,7 +194,7 @@
       if (any (isa (x, {"int64", "uint64"})))
         m = int64_mean (x, 1, numel (x), outtype);
       else
-        m = sum (x) ./ numel (x);
+        m = sum (x, "double") ./ numel (x);
       endif
 
     else
@@ -210,7 +210,7 @@
       if (any (isa (x, {"int64", "uint64"})))
         m = int64_mean (x, dim, n, outtype);
       else
-        m = sum (x, dim) ./ n;
+        m = sum (x, dim, "double") ./ n;
       endif
 
     endif
@@ -246,7 +246,7 @@
           if (any (isa (x, {"int64", "uint64"})))
             m = int64_mean (x, vecdim, n, outtype);
           else
-            m = sum (x, vecdim) ./ n;
+            m = sum (x, vecdim, "double") ./ n;
           endif
 
         endif
@@ -278,7 +278,7 @@
             if (any (isa (x, {"int64", "uint64"})))
               m = int64_mean (x, 1, numel (x), outtype);
             else
-              m = sum (x) ./ numel (x);
+              m = sum (x, "double") ./ numel (x);
             endif
 
           else
@@ -303,7 +303,7 @@
             if (any (isa (x, {"int64", "uint64"})))
               m = int64_mean (x, dim, n, outtype);
             else
-              m = sum (x, dim) ./ n;
+              m = sum (x, dim, "double") ./ n;
             endif
 
             ## Inverse permute back to correct dimensions
@@ -605,6 +605,18 @@
 %!assert (mean ([1 2 3], "OmitNan"), 2)
 %!assert (mean ([1 2 3], "DOUBle"), 2)
 
+## Test limits of single precision summation limits on each code path
+%!assert <*63848> (mean (ones (80e6, 1, "single")), 1, eps)
+%!assert <*63848> (mean (ones (80e6, 1, "single"), "all"), 1, eps)
+%!assert <*63848> (mean (ones (80e6, 1, "single"), 1), 1, eps)
+%!assert <*63848> (mean (ones (80e6, 1, "single"), [1 2]), 1, eps)
+%!assert <*63848> (mean (ones (80e6, 1, "single"), [1 3]), 1, eps)
+%!assert <*63848> (mean (ones (80e6, 1, "single")), 1, eps)
+
+## Test limits of double precision summation
+%!assert <63848> (mean ([flintmax("double"), ones(1, 2^8-1, "double")]), ...
+%!                               35184372088833-1/(2^8), eps(35184372088833))
+
 ## Test input validation
 %!error <Invalid call to mean.  Correct usage is> mean ()
 %!error <Invalid call to mean.  Correct usage is> mean (1, 2, 3)