VecTcl is implemented as a two-layered system. At the bottom lies a new Tcl_ObjType, NumArray,
implemented in C together with a number of commands to manipulate it. The commands live in the
namespace numarray
and comprise elementary operations on numerical data like array shaping and
slicing, elementwise binary operators, logical comparisons, matrix product operations, linear
equation solving, elementary transcendental functions. On top of that, an expression compiler
written in Tcl transforms a sequence of mathematical expressions from infix notation into nested
function calls, which are then executed by Tcl.
NumArray is a polymorphic datatype which represents an N-rank tensor of integer, floating point or
complex floating point type. The internal representation consists of two data structures, pointed to
by the twoPtrValue-fields in Tcl_Obj. The first structure NumArraySharedBuffer
stores the data of
the NumArray in a contiguous memory buffer in native machine representation, refcounted and shared
between a NumArray and derived slices. The second structure NumArrayInfo
contains the describing
metadata
typedef struct {
NumArrayType type;
int nDim;
int bufsize;
int offset;
int canonical;
int *dims;
int *pitches;
} NumArrayInfo;
The important fields in this structure are the number of dimensions nDim
,
number of elements in each dimension dims
, the offset of the
first element into the buffer offset
, the byte increments for advancing along each dimension
pitches
, and the data type of the stored data type
.
This representation, a contiguous memory buffer together with byte increments, is the most suitable
to do fast computations and to interface with existing libraries like BLAS
and LAPACK. In the
implemented form, it allows for a wide range of array manipulations without touching the data
itself; for example, to select a subset of an array along one axis, merely the corresponding value
in dims
needs to be adjusted and the offset be recomputed, if the slice does not start from the
front. Similarly, it is possible to select every second or third element in a dimension by increasing
the pitch value, and negative pitches can be used to reverse an array. Transposition can likewise
be achieved by swapping the metadata of two axes. In accordance with the value semantics of Tcl,
VecTcl implements copy-on-write on these slices and creates shared buffers for all of the above
transformations.
The other face of NumArray is the string representation. NumArray was designed to reconcile the list
representation and the need of different dimensionality and numerical data types with EIAS. Contrary
to a common misbelief, EIAS does not exclude the existence of data types; it merely requires that an
unambiguous deserialization exists, such that a round trip from serialization/deserialization leads
to a value that behaves the same as the original, possibly having the same internal representation.
As an example, consider the standard expr
. expr
’s operators treat a string that parses as an
integer as an integer value, else interpret it as a double, and if that fails, treat it as a
non-numeric string error. In this way, every string can be unambiguously interpreted as being an
integer, a floating point value or a general string, even though every integer parses correctly as a
floating point value, too. NumArray enhances these data types with N-rank tensors and complex values. A
NumArray can formally be described with reference to Tcl lists as follows:
A NumArray is one of the following, which are tried in this order:
Using this grammar, the degree (i.e. number of dimensions), data type and number of elements of a NumArray can be unambiguously derived. Examples for the interpretation are given below
set x {1 2 3} ;# an integer vector of length 3
set y {{1.0 3.0} {3.0 5.0}} ;# a 2x2 floating-point matrix
set y {{1.0 3.0 5.0}} ;# a 1x3 floating-point matrix, i.e., a row vector
set z {0+1i 2+3.5i 3.0+0i} ;# a complex vector of length 3
set u {{1 2 3}} ;# a 1x3 integer matrix (a row vector)
set v {{{1 2} {3 4}} {{5 6} {7 8}}} ;# a 2x2x2 integer tensor
set e {1.0 2 3} ;# a floating point vector of length 3
set e1 {1.0 2 3a} ;# error: 3a can't be parsed as a number
set e2 {{1 2} 3 4} ;# error: Dimensions don't match
One peculiarity, which comes from the Tcl list representation, needs further consideration: a value with no spaces in it can alternatively interpreted as a list with a single value. And due to EIAS, a string containing a space is indistinguishable from a list consisting of two elements. This has two consequences, first single-element values (or scalars) may not contain spaces in their string representation. Otherwise, the following two NumArrays could’nt be disambiguated:
set c1 {3.0+4.0i} ;# a complex number
# with real part 3.0 and imaginary part 4.0
set c2 {3.0 +4.0i} ;# a complex vector of length 2,
# equal to {3.0+0.0i 0.0+4.0i}
Second, trailing singleton dimensions must be insignificant, i.e. a Nx1 matrix is identical to a vector of length N, a scalar is identical to a vector of length 1 and a 1x1 matrix. Fortunately, this coincides with the usual linear algebra interpretation of these entities. For two-dimensional objects, a human-understable representation can be printed using
puts [join $thing \n]
Thus, a vector is interpreted as a column vector, while a row vector must be represented as a 1xN matrix.
Following this
encoding, element retrieval from NumArrays can be done using lindex
as well as with the indexing
operator of vexpr
:
set A {{1 2} {3 4} {5 6}}
set a_11 [lindex $A 2 0] ;# 5
vexpr { a_11=A[2,0] } ;# 5
For incomplete indices, a slice is returned
set A {{1 2} {3 4} {5 6}}
set a_1 [lindex $A 2] ;# {5 6}
vexpr { a_1=A[2] } ;# should also be {5 6}
# doesn't work currently, BUG
Of course, lindex
causes shimmering if the Tcl_Obj has an internal NumArray representation, which is
particularly inefficient if the list patch is not applied (see the section on performance).