June spec review #155

waldemarhorwat · 2024-06-11T08:57:46Z

I took another look at the Decimal spec for the June meeting. The spec forms a good foundation, but I found some pervasive issues that make it self-contradictory or produce incorrect results and should be addressed before seeking stage 2.

My review follows. Some of the items are nontrivial and will require some discussion and agreement on how to fix. I'm offering to provide whatever help is needed to understand and fix these.

Confusion

Throughout the spec there are errors where a subroutine is called with different types of arguments or different data than what it expects. Document what the arguments should be and check all call sites to make sure that they match. One example is EnsureDecimal128Value, where part of the spec assumes that the first argument is the full mathematical value (which I believe to be the correct interpretation), while another part of the spec assumes that it's a mantissa. In some places multiple bugs overlap: The Decimal128 constructor confuses parse nodes, which causes Decimal128("3.14") to return 33.14_𝔻; if the parse node confusion is fixed, then Decimal128("3.14") will return 314_𝔻 due to EnsureDecimal128Value argument type confusion.
Handling of zeroes is incorrect in nearly all arithmetic operations due to the above type confusion. For example, Decimal128.prototype.add tries to do real number arithmetic on +0_𝔻 and -0_𝔻, which are not real numbers. IsIntegralDecimal128 returns false on +0_𝔻 and -0_𝔻.
All operations that produce new zeroes via arithmetic either throw or encounter the above type confusion.
Decimal128ToDecimalString confuses a value with its length and contains buffer overruns.

Rounding

None of the rounding modes is implemented.
The spec throws whenever an inexact operation happens. Dividing 1_𝔻 by 3_𝔻 will throw, as will adding 10⁵⁰_𝔻 to 1_𝔻. This is a major change to the spec that I hadn't been expecting. Is this intentional?

Quantum Handling

Arithmetic on the quantum is incorrect in many places. If we're going to distinguish cohort members, we should have arithmetic operations set the quantum in the same way as the IEEE spec does it.
The spec neglects to set the quantum on zero values in methods such as Decimal128.prototype.round.
Per the IEEE spec, the quantum should not affect mathematical results. Interrogating the quantum in methods such as IsIntegralDecimal128 is incorrect.
The quantum appears in other places where it shouldn't, I'm guessing due to the confusion above. toPrecision with a user-provided precision shouldn't interrogate the quantum. It's not the exponent.

Canonicalization

The spec confuses string canonicalization with Decimal128 value canonicalization. Those are quite different concepts:
What we want is for toString to produce strings independent of the input quantum unless one explicitly sets canonicalize to false: All Decimal values with the same v should produce the same string.
The spec attempts define a CanonicalizeDecimal128 helper method that tries to pick a specific "canonical" Decimal128 value to represent a cohort, but that doesn't work:
- The choices are to pick either the one with the highest possible q, or the one with the lowest possible q. That's what's needed in some arithmetic contexts.
- Neither choice works correctly for both toString and toExponential.
To fix this, restructure the spec to delete CanonicalizeDecimal128 and anything like it and just make the conversions not depend on the quantum unless the user requests it.

Comparisons

The compare method should compare mathematical values and return one of -1, 0, 1, or NaN. The current implementation also compares the quantum, which turns it from a generally useful method to an extremely obscure one that should be omitted in the first version.
A method for the < comparison is provided, but >, ≤, ≥ are missing, and the latter two are nontrivial for users to implement correctly. Naively turning a ≤ b into !(b < a) is incorrect due to NaN's.

Conversions

Invoking toString on 10¹⁰⁰⁰_𝔻 produces 1 followed by 1000 zeroes. This is fine if printing a BigInt, but is undesirable for floating-point numbers because it provides a false sense of precision. Users will expect that they can add 1 to it, and, being a Decimal, get 100…001, which obviously won't work. Instead, toString should follow existing Number precedent and switch to scientific notation to not produce significantly more digits than is possible to represent in a Decimal128 value. If someone really wants fixed-point notation, they can use a different method such as toFixed.
One would expect calling toString with canonicalize set to false to not canonicalize the output, but that's not what it does. It will sometimes canonicalize and sometimes not, depending on what Decimal128 value it sees. So canonicalize set to false actually means "maybe". It should mean "don't canonicalize".
None of the conversions respect the quantum of zero values. But they should: if canonicalize is set to false, then we do want to distinguish 0 from 0.0000 and 0e75.
The quantum of large Numbers converted to Decimal128 is inconsistent. Converting 1e20 yields a quantum of 0, but converting 1e21 yields a quantum of 21. As a fix, I recommend doing the logical equivalent of always converting Numbers to Decimal128 via scientific notation.

ECMAScript Mechanics

ECMAScript object identity is inconsistent. Some methods such as Decimal128.prototype.round sometimes reuse input values and sometimes produce fresh values equal to the input values. We should be consistent here.
Number(x) now calls toString on every x, of any type. This is an undesirable change in existing behavior and not necessary for correct Decimal128 semantics.
The spec uses the wrong grammar for converting strings to Decimal128. As a result, Decimal128("07") throws.

Missing Operations

We should provide scale and some form of exponent, mantissa methods.
We should provide square root.

Structure

The introduction section uses concepts such as "zero" and "finite" before defining them.
For readability I'd prefer to use spec methods such as cohort and quantum on Decimal128 values rather than [0] and [1].

The text was updated successfully, but these errors were encountered:

jessealama · 2024-06-11T10:19:11Z

Thanks for taking a look! I'll get to work on these issues. Stay tuned.

waldemarhorwat · 2024-06-11T10:22:21Z

I added a sketch of how to fix things.

jessealama · 2024-06-11T13:09:30Z

(This was unintentionally closed in #156 )

jessealama · 2024-06-12T06:55:31Z

Thanks so much for this second round of detailed review! We plan to make changes to address almost all of your points below. Specifically:

Type errors

Thanks, this is a good guide. We'll annotate the specification text with more clear expectations about the types of various values, which we hope will help avoid this issue in the future.
The changes for quantum handling will address the type errors.

Rounding

It was not intentional to make rounding throw. Throwing an error there was a placeholder to come back and describe the operation.
We'll write this description with the same semantics as IEEE-754 and Intl (they already coincide via that table).

Quantum handling

Your suggestion to replace EnsureDecimal128Value with PickQuantum is an excellent one, and we'll rewrite our algorithms in that form.
The current data model does not handle different quanta for zero values, so of course algorithms don't handle them. We will have to add this.
To distinguish multiple quanta for +-0, we need to treat them as cohorts, rather than full decimal values. To stick with the convention where the 𝔻 subscript means that the whole value is a Decimal128 value, we'll name these cohorts PositiveDecimal128Zero and NegativeDecimal128Zero, and treat these cohorts specially in all arithmetic operations.
It was not intentional to differ from IEEE 754 in determining the quantum. We'll match the spec.
We'll rephrase the arithmetic operations to be based on the mathematical value of the operation applied to the operand cohorts, which is then passed to an algorithm to round it to the nearest cohort.

Canonicalization

We'll delete the canonicalize algorithm–as you say, we just need to get the cohort when we need the mathematical value, which toString and friends will use.

Comparisons

We'll remove the .compare function (i.e., a function that can distinguish mathematically equal decimals). We don't have it for Number or BigInt, and people get by. You can use the coefficient or quantum accessors (which we will add) if you want to distinguish mathematically equal values.
We're not sure whether greaterThan, lessThanOrEquals, etc. are necessary to add, but they are easy to define; let's discuss the tradeoff in plenary. Our current thinking is to resurrect an earlier idea, of having a single, differentcompare method, working with mathematical value, from which <, ≤, >, ≥ are trivially defined in user code.

Conversions

We'll change the default toString algorithm to overflow to exponential notation using the same cutoffs Number uses.
We'd like to replace the "canonicalize" option for toString with a separate operation that can be used just to convert a decimal value to an exact string that cleanly represents the decimal in a natural way, e.g., using exponential notation or trailing zero decimal points based on the quantum (e.g., could be "300", "3e2", or "300.00" depending on the member of the cohort, and may have 6000 zeroes after it).
Then, there will be no canonicalize flag for the other string operations, simplifying them.
For conversion from Number, we'll apply your fix of always using the quantum as if it were written in exponential notation.

ECMAScript mechanics

We'll consistently return new object identities
We'll delete that stray line of spec that does that conversion, which is not useful or a good idea
We'll define a new grammar for this particular case, which permits leading zeroes

Missing operations

We'll define methods to get the coefficient (as a BigInt) and quantum (as a Number)
Sqrt: This is easy to define, but let's discuss with the committee whether it's desired. We don't understand the use case

Structure

Good suggestions; we've applied them

waldemarhorwat · 2024-06-13T04:44:05Z

Thank you!

Some comments:

"The current data model does not handle different quanta for zero values, so of course algorithms don't handle them. We will have to add this.": I don't understand this response. The current data model does support different quanta for zero values. That's what the q in «+0_𝔻, q»_𝔻 and «-0_𝔻, q»_𝔻 is.
Renaming the +0_𝔻 and -0_𝔻 symbols sounds good to reduce confusion, but PositiveDecimal128Zero and NegativeDecimal128Zero are a bit of a mouthful. Can we use something shorter such as PlusZero and MinusZero? There's no need to include "Decimal128" in the name — note that all the other finite cohorts such as 1.5 and 42 are represented by real numbers that don't have a Decimal128 tag.
I think it's essential to provide the five comparison methods that do the =, <, >, ≤, and ≥ comparisons. Not having those would make decimal code and algorithms awkward to translate into Decimal128 methods.
A compare method that compares mathematical values and returns one of {-1, 0, 1, NaN} is a very-nice-to-have.
The Decimal128 toString threshold for switching to scientific notation should be higher than for Number due to Decimal128's higher precision. For Number we intentionally picked a threshold greater than its 17 digits of precision. Decimal128 offers 34 digits of precision, so it's reasonable to switch to scientific notation for large numbers when the magnitude exceeds 10³⁴. When adding leading zeroes for small numbers, the same threshold as Number (10^-6) seems fine.
There are existing grammars in the language for converting user-provided strings to numbers. See StringNumericLiteral. Use that.

waldemarhorwat · 2024-06-14T21:16:44Z

I added a proposed design of the methods for querying and working with exponents, significands, and the quantum, along with sample code for many use cases.

waldemarhorwat mentioned this issue Jun 11, 2024

Details of specifying Decimal numbers and algorithms #122

Open

jessealama mentioned this issue Jun 11, 2024

Use "cohort of" and "quantum of" rather than array indices #156

Merged

jessealama linked a pull request Jun 11, 2024 that will close this issue

Use "cohort of" and "quantum of" rather than array indices #156

Merged

jessealama closed this as completed in #156 Jun 11, 2024

jessealama reopened this Jun 11, 2024

jessealama mentioned this issue Jun 12, 2024

Respond to June feedback #158

Merged

jessealama mentioned this issue Jun 28, 2024

Match current spec jessealama/decimal128#111

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

June spec review #155

June spec review #155

June spec review #155

June spec review #155

Comments

Confusion

Rounding

Quantum Handling

Canonicalization

Comparisons

Conversions

ECMAScript Mechanics

Missing Operations

Structure

Type errors

Rounding

Quantum handling

Canonicalization

Comparisons

Conversions

ECMAScript mechanics

Missing operations

Structure