VecTcl

Other implementation issues

Release .zip Release .tar.gz Other versions

Other implementation issues

Parsing nonfinite and complex numbers

The first version of VecTcl used the official API Tcl_GetDoubleFromObj() to (attempt to) extract a double value from a string. This has proven to be insufficient, because of the handling of the IEEE not-a-number entity NaN. Therefore, the code for Tcl_GetDoubleFromObj() had to be copied from the Tcl core and modified to pass an IEEE NaN back to the caller.

The current handling of IEEE non-finite floating point entities is inconsistent in Tcl. While +/-Inf is allowed both as the result of a computation and as a string format, NaN is not. The string “NaN” is correctly translated into a quiet NaN in IEEE format upon shimmering to a double value, but taking a NaN out of a double using Tcl_GetDoubleFromObj() results in an error. This behaviour is mostly undesirable in a vector computation. There are different use cases of NaN values. One is to indicate an arithmetically undefined value, such as dividing by zero or taking the real logarithm of a negative value. Another case is to deliberately insert NaNs into a sequence of numbers to indicate missing values. The third is to use it as a debug tool when writing the software.

In the first two cases, an exception is not desired as soon as a NaN appears, because all other components of a vector might still contain valid computation results. A fast function plotter, for instance, could be implemented by an elementwise vector computation. If we tried to plot the sinc function,

vexpr {
	x= linspace(-10,10,1001)
	y= sin(x)./x
}

and a single invalid value would throw an exception instead of producing NaN, the function plotter would be unable to display the correctly computed 1000 values because of the single NaN which appears at x=0. Worse, hitting the exception depends on the exact (bit-for-bit) value of the boundaries of the plotting interval and number of datapoints.

The second use case of NaN arises when data are statistically analysed. Sometimes it is convenient to just insert a dummy value, if no value can be provided, similar to NULL values in relational databases. NaN is an excellent choice, because it does not silently convert back to a valid value, which might happen with ad-hoc dummy values such as 0, 1e10 or similar. For the backend implementation, it is also convenient because no additional bits have to be reserved and the execution speed is not impacted by many checks for NULL values, because NaNs are handled in hardware on IEEE conforming machines.

Signalling NaNs are useful as a debug tool to locate the point at which a program goes wrong, but they make it harder to do write stable software, because the programmer is forced to wrap an exception catching block around every single expression, instead of deferring error handling to a later stage. Therefore VecTcl chooses to not signal, when NaN is encountered. For similar reasons, hardware exceptions on NaN are rarely enabled in compiled programs, since an end-user program should never crash when an illegal arithmetic value operation is encountered. In order to debug a program which produces NaN for no obvious reasons, a global switch can be imagined which enables NaNs to signal an exception. For production use, this switch will likely be turned off in order to enable exception handling at higher levels, and for each component of a computation individually.

Implementing this choice as an extension is challenging. Tcl’s internal number parsing function TclParseNumber() is not exposed, but has a few nice properties that make it very suitable for a vector extension. Therefore, VecTcl uses the Tcl_ConvertToType() function to request the value to shimmer to double, and then extracts the value using the same code copied from Tcl_GetDoubleFromObj(), modified to pass back NaN without throwing an error. This is quite fragile, it will break as soon as the NaN handling in the core changes, e.g. if throwing the error migrates from Tcl_GetDoubleFromObj() into the setFromAnyProc of the double.