| Summary: | [evaluator] Inadequate support for large integers | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Modeling] OCL | Reporter: | Oliver Wong <owong> | ||||||
| Component: | Core | Assignee: | OCL Inbox <mdt-ocl-inbox> | ||||||
| Status: | CLOSED FIXED | QA Contact: | |||||||
| Severity: | normal | ||||||||
| Priority: | P3 | CC: | achim.demelt, alexander.igdalov, dvorak.radek, ed, owong | ||||||
| Version: | unspecified | Keywords: | plan | ||||||
| Target Milestone: | 3.1.0 | Flags: | ed:
indigo+
|
||||||
| Hardware: | PC | ||||||||
| OS: | All | ||||||||
| Whiteboard: | Compliance | ||||||||
| Bug Depends on: | 318048 | ||||||||
| Bug Blocks: | 156363 | ||||||||
| Attachments: |
|
||||||||
|
Description
Oliver Wong
There is support for BigDecimal and BigInteger values that originate in models, but as you observe this is missing in the concrete syntax. I think the correct solution is to check the string length and parse for a BigInteger when too many digits or too large a value for Integer. Internally the generic java.lang.Number inheritance should sort out the alternatives. *** Bug 298205 has been marked as a duplicate of this bug. *** I'm trying to support BigInteger and BigDecimal in the revised evaluator in the ReflectiveLibrary branch. The observable problem is that the parser method createIntegerLiteralExpCS uses Integer.valueOf to parse the lexed string. This is easily fixed but then a) IntegerLiteralExpCS.integerSymbol is an Integer b) IntegerLiteralExp.integerSymbol is an Integer (ditto for UnlimitedNatural, Real) Therefore we either introduce BigIntegerLiteralExp etc and surprise clients with new classes or change IntegerLiteralExp.integerSymbol to be a Number (Short,Integer,Long,BigInteger or Float,Double,BigDecimal in practice). Introducing new classes will give tightly-coupled clients run-time failures since the requisite handlers will be missing. Changing the type to Number will give tightly-coupled clients compile-time failures. I prefer the change to Number since variant IntegerLiteralExp or RealLiteralExp classes are not supported by the OCL specification, and since this exposes rather than hides the API change. *** Bug 246895 has been marked as a duplicate of this bug. *** Created attachment 155901 [details]
AST/CST change for Big numbers
The attached patch changing from Integer/Double to Number for {Integer,Real,UnlimitedNatural}LiteralExp{,CS} surprisingly does not break any OCL or QVTd test. Extra JUnit tests validate the parsing, serialisation and deserialisation of Big numbers.
The patch causes no compilation problems for QVTo. I am unable to tell whether QVTo run-time is affected; 790/815 tests appear to fail in my workspace.
NB. Previous behaviour was to NFE on big numbers.
The new behaviour parses, serialises and deserialises big numbers.
Maths on Big numbers is poorly supported. The rewrite of operations in the ReflectiveLibrary branch will fix this.
Temporarily some high precision Double numbers will be parsed as BigDecimal causing failure when used in mathematical operations.
(Patch seems to have a few missing @since's.) If we're changing the type of integerSymbol and realSymbol anyway, we could introduce new wrapper types org.eclipse.ocl.types.Integer and org.eclipse.ocl.types.Real that encapsulate the choice of representation type. This would localize some of the Double or BigDecimal logic. [I'm undecided whether this is a good idea, since it creates an extra object for every numeric literal and is not backwards compatible with the serialised AST, so I raise this possibility inviting positive/negative comments.] We probably need some xxxOption to allow users to control the threshold at which arithmetic promotes from Double to BigDecimal (Integer promotion is easy; grow on overflow). The patch chooses the precision of a litertal based on whether the string representation has 15 or more letters so -1.0 is Double and -1.000000000e00 is BigDecimal. When to promote Double / Double to BigDecimal is very subjective. BigDecimal has a precision, so choosing an appropriate precision is also subjective. The test_supportForELongAttributes_198451 provides some insight into the existing behaviour. Integer, Long, BigInteger, Double and BigDecimal are preferred representations. Byte, Short, Float are tolerated representations that are converted to Integer, Double. The test assertEquals(1, evaluate(helper, thing, "numeros->at(1).asLong() - numeros->at(2).asLong()")); passes because the Long - Long maths returns Long but the result which is 1 is passed through NumberUtil.coerceNumber which tries to find a simpler representation and converts long 1 to integer 1 which is equal to the asserted integer 1. This behaviour is a little expensive but well-defined. The patch extends this support for arbitrary intermediate results to arbitrary persisted results. We just want to refine the new createRealNumber to use BigDecimal for anything with any fractional digits or more than 15 integer digits and to coerceNumber the result back to Double where possible. No change to createIntegerNumber which is already 'optimal'. [0.1 cannot be precisely represented by Double, whereas it is trivial with BigDecimal.] Further:
assertEquals((int) maxIntMinusOne, evaluate(helper, thing,
String.format("(%d + %d).div(2) - 1", maxInt, maxInt)));
passes when adding an Integer overflow because all numeric arguments are promoted to a higher precision before any numeric operation.
Ow! we are incurring a promotion cost on every numeric argument, and a demotion cost on every numeric result.
Ow! we are losing precision on decimals; bug 299477.
Ow! we are not detecting long overflows; Java does not report arithmetic issues.
Given the costs we are incurring through conversions, we might as well just use BigInteger and BigDecimal throughout. The AST once again has narrow types.
Created attachment 156100 [details]
Use of BigDecimal and BigInteger internally
[Attached patch applies to latest CVS, in which generated code has been regenerated to avoid reviewing 100 files of gratuitous differences on this patch.]
Changing the internal and persisted representations of Real and Integer to BigDecimal and BigInteger has a bit more impact than changing to Number, but leads to more predictable simpler behaviour and solves most of the problems other than extensibility and legibility. API Compatibility is enhanced for scalar values by invoking NumberUtil.coerceNumber only on the final result rather than every intermediate calculation.
[Since UnlimitedNaturalLiteralExp.intergerSymbol is changing type, we might as well track Issue 14196 and change its spelling at the saem time.]
Impact on tests: non-scalar result checks must use BigDecimal or BigInteger.
Impact on QVTd: two one-line changes are required to resolve compilation errors.
Impact on QVTo: no changes needed for compilation problems.
Visible change:
RealLiteralExp(CS).realSymbol is BigDecimal rather than Double.
IntegerLiteralExp(CS).integerSymbol is BigInteger rather than Integer.
UnlimitedNaturalLiteralExp(CS).integerSymbol is BigInteger rather than Integer and name unlimitedNaturalSymbol.
setRealSymbol, setIntegerSymbol provide some compatibibility for original signatures.
Scalar return from OCL.evaluate unchanged because coerceNumber invoked on result.
Collection/Tuple query results contain BigDecimal or BigInteger rather than Double, Integer.
New: OCL.rawEvaluate supports return of BigDecimal or BigInteger bypassing coerceNumber.
Hidden changes:
oclAsType now works and is tested for Integer/Real conversions.
BigInteger and BigDecimal internal results should not throw exceptions
-* now works and is tested
Ed, When trying apply the patch with in the ReflectiveLibrary branch I have a lot of problems. Fewer problems (just some merge problems in in a tests plugin) while applying against HEAD, so I guess that "latest CVS" = HEAD. Looking into it. Cheers, Adolfo. Yes. The patch is for HEAD. I have just committed the change to the ReflectiveLibrary Branch which has now caught up with LPG2 etc. The Ecore tests run with about 55 failures; to be investigated. The UML tests do not yet run. Switching to BigDecimal/BigInteger makes the revised approach much easier too. Take a peek at org.eclipse.ocl.evaluator.operations to see how modular it now is. It would be good to have a preliminary approval that we want to change to BigInteger/BigDecimal always. Then perhaps the nitty gritty review can be a little slower. The patch probably fixes 20 to 50 obscure problems, but may well creates nearly as many similarly obscure problems. It is only with the ReflectiveLibrary that we can really clear these problems. I expect to clear at least 15 bugzillas in one go. Ed, To be honest, I'm learning rather than doing a nice reviewing with this bugzilla, so my contribution on this is being simply reduced to catch some noise. Let's see if I can help ;). About the approach. In principle, (and you know my position about MDT-OCL 3.0.0), I'm not worried about any inconvenience related to breaking API, or any backwards compatibility. Actually, we must take advantage to improve the implementation, and it seems that you are fixing a lot of numeric conversions related nuisances. So, 1. I'm happy with the solution (my +1 to the approach) 2. I wouldn't include "setRealSymbol, setIntegerSymbol methods to provide some compatibibility for original signatures", since - it partially provides compatibility. - I don't like the not generated code. So if the API need to be broken for the expression symbol, let's learn clients to use the new one instead of including partial compatibility solutions. Apart from that, have a look to the following comments - In the OCLCST.ecore model, the references to the BigInteger/Double are established via platform:/plugin/.... URI instead of http://www.eclipse.org/... . This unnecessary change leads to, for instance: - Load the same Ecore package twice (you can check the Sample core editor, which provides you two EBigIntegers - A new usedGenPackages entry in the genmodel. I feel this may be fixed, and the code regenerated again. - The same for OCL.ecore - Number still remains in OCL.uml. - What happened with the "user_doc" in the AST generated code?, the "This value may change when the model code is regenerated. It is subject to change without notice" comment, which I think has recently added, is not longer present. - In ObjectUtil.equals method, why are we considering that Util method won't receive any Integer, Long or Double object (They have been removed), if you feel that they shouldn't be used along this util class, why not removing any consideration in other methods such as hashCode or isPrimitive ? - I find a missing @since in ExpressionsPackageImpl.getUnlimitedNaturalLiteralExp_UnlimitedNaturalSymbol() method. Have a nice weekend, Adolfo. Thanks for the review/observations: The plugin references seem toi be a limitation of the UML codegen. I can't see any difference between EDouble and EBigDecimal in the UML or Ecore profile so ... ? This code has evolved slightly to provide better compatibility in the ReflectiveBranch. I'm no longer inclined to commit this for 3.0.0 if the ReflectiveLibrary doesn't also go in 3.0.0. (In reply to comment #13) > I'm no longer inclined to commit this for 3.0.0 if the ReflectiveLibrary > doesn't also go in 3.0.0. I agree. Let's include it into 4.0.0. ----------- Unlike Java (and other multipurpose languages) there is no variety of different-precision types for numbers in OCL. I have no clear opinion on whether we should we submit a request to OMG to add types representing Java's int, long and double to OCL. Perhaps, it is ok for OCL to be slower but less bloated? The OCL evaluation exhibits significant control-code bloat so that the arithmetic overhead is not that bad. I judged, without measurement, that the costs of regular precision conversions in the current code were comparable and perhaps worse than the cost of always using BigXXX. When a code Synthesis tool is used to produce efficient Java/C++/VHDL/... it will benefit from a numeric analysis to determine the numeric bounds of intermediate results and so allow synthesis of short or whatever. OCL should perhaps allow OCL definition rather than augmentation of classes so that I can specify def: MyShort conformsTo Integer inv: -32768 <= self and self <= 32767 The new evaluator use BigInteger and BigDecimal throughout behind IntegerValue and RealValue. Resolved for Indigo is 3.1.0 not 3.2.0. Bug 366368 imptoves the legacy support so that Long selectively working is improved to probably fully working. In particular numeric and tuple literals may be Long. Arithmetic corners for Integer/Long are tested. Closing all bugs resolved in Indigo. |