Why Unions didn't work

The Cell Store; Expressions and their Representation

Every expression in Borg is represented by a 4-byte integer. This integer contains everything needed to access the data behind the value.

typedef union { 
	_HDR_TYPE_ hdr;
	_PTR_TYPE_ ptr;
	_CNT_TYPE_ cnt;
	_UNS_TYPE_ uns;
	_NBR_TYPE_ nbr;
	_NBU_TYPE_ nbu;
	_SWI_TYPE_ swi; } _EXP_TYPE_;

typedef struct { 
	_SGN_TYPE_ nbr: 31;
	_UNS_TYPE_ ptr: 1; } _NBR_TYPE_;
typedef struct { 	
	_UNS_TYPE_ nbr: 31;
	_UNS_TYPE_ ptr: 1; } _NBU_TYPE_;

The most significant bit of a value (the ‘ptr’ field in the records) is set whenever the value represents a pointer or a header. If this bit is 0, the expression itself contains a number in its remaining 31 bits. In Borg, this number can be treated as signed or unsigned; no signed/unsigned type information is stored in an expression. So if one reads the number with _AG_GET_NBR_, one gets a signed number, and if one reads it through _AG_GET_NBU_, one gets an unsigned number. Normally all expressions created by the parser and used throughout the running program are signed integers. Unsigned numbers are only used to give the sizes of tables and to index tables.
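The difference between the two readings can be sketched with plain integer arithmetic. This is a minimal illustration, not Borg's code: the names exp_is_number, exp_get_unsigned and exp_get_signed are hypothetical, and the sketch assumes that the ptr flag is bit 31 and the number occupies bits 0 to 30.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names. Assumption: ptr flag in bit 31, number in bits 0..30. */
#define EXP_PTR_FLAG  UINT32_C(0x80000000)
#define EXP_NBR_MASK  UINT32_C(0x7FFFFFFF)
#define EXP_SIGN_BIT  UINT32_C(0x40000000)

/* Nonzero when the expression holds a number (ptr flag is 0). */
static int exp_is_number(uint32_t exp)
{
    return (exp & EXP_PTR_FLAG) == 0;
}

/* Read the 31-bit payload as an unsigned number (the _AG_GET_NBU_ view). */
static uint32_t exp_get_unsigned(uint32_t exp)
{
    assert(exp_is_number(exp));
    return exp & EXP_NBR_MASK;
}

/* Read the same 31 bits as a signed number (the _AG_GET_NBR_ view).
   Bit 30 acts as the sign bit, so it is extended by hand. */
static int32_t exp_get_signed(uint32_t exp)
{
    uint32_t payload = exp_get_unsigned(exp);
    if (payload & EXP_SIGN_BIT)
        return (int32_t)payload - INT32_C(0x7FFFFFFF) - 1; /* payload - 2^31 */
    return (int32_t)payload;
}
```

The same stored bits give two answers: a payload with all 31 bits set reads as 2147483647 through the unsigned accessor and as -1 through the signed one, which is why the caller has to know which view it wants.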

typedef struct { 
	_UNS_TYPE_ pfx: 29;
	_UNS_TYPE_ sta: 3; } _SWI_TYPE_;

The _SWI_TYPE_ is used by the garbage collector. The ‘sta’ field contains the status of the current cell. The first bit tells whether the expression is a header of a table or another structure. The second bit tells whether the expression is busy (marked), and the last bit tells whether the expression is a value or not.

If an expression is marked as a pointer, it can be a pointer to a table, or it can be a pointer to a raw data structure (which is incomprehensible to the garbage collector).

typedef struct { 
	_UNS_TYPE_ siz: 24;
	_UNS_TYPE_ tag: 5;
	_UNS_TYPE_ hdr: 1;
	_UNS_TYPE_ bsy: 1;
	_UNS_TYPE_ ptr: 1; } _HDR_TYPE_;

typedef struct { 
	_UNS_TYPE_ ofs: 29;
	_UNS_TYPE_ hdr: 1;
	_UNS_TYPE_ bsy: 1;
	_UNS_TYPE_ ptr: 1; } _PTR_TYPE_;

exp.ptr.ofs points to the position in the memory store where the contents of the expression reside. So, if we want to access element 12 in the table ‘tbl_exp’, we can write mem_STORE[tbl_exp.ptr.ofs+12].
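The lookup can be sketched without bitfields as follows. Only mem_STORE comes from the text; STORE_SIZE, PTR_OFS_MASK, ptr_offset and table_get are hypothetical, and the sketch assumes that the offset sits in the low 29 bits of the expression.

```c
#include <assert.h>
#include <stdint.h>

#define STORE_SIZE    64u                     /* hypothetical store size */
#define PTR_OFS_MASK  UINT32_C(0x1FFFFFFF)    /* assumption: ofs in bits 0..28 */

/* The memory store: one 4-byte expression per cell. */
static uint32_t mem_STORE[STORE_SIZE];

/* Extract the offset from a pointer expression. */
static uint32_t ptr_offset(uint32_t exp)
{
    return exp & PTR_OFS_MASK;
}

/* Equivalent of mem_STORE[tbl_exp.ptr.ofs + index], with a bounds check. */
static uint32_t table_get(uint32_t tbl_exp, uint32_t index)
{
    uint32_t pos = ptr_offset(tbl_exp) + index;
    assert(pos < STORE_SIZE);
    return mem_STORE[pos];
}
```

With a pointer expression whose offset is 4, table_get(tbl_exp, 12) reads cell 16 of the store.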

To differentiate between a pointer to a table and a pointer to a raw expression, we look at the least significant bit of the tag (exp.hdr.tag). If this bit is 1, the pointer points to raw data. If this bit is 0, the pointer points to a table.

What about Different Architectures?

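Every expression in Borg is a union of different structures, such as the two repeated below. The tables that follow show, per compiler, which bits of the unsigned view (Uns) each bitfield ends up in: every column is a range of Uns bits, and every row lists the bits of one view that share that range.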

typedef struct { 
	_UNS_TYPE_ siz: 24;
	_UNS_TYPE_ tag: 5;
	_UNS_TYPE_ hdr: 1;
	_UNS_TYPE_ bsy: 1;
	_UNS_TYPE_ ptr: 1; } _HDR_TYPE_;

typedef struct { 
	_UNS_TYPE_ ofs: 29;
	_UNS_TYPE_ hdr: 1;
	_UNS_TYPE_ bsy: 1;
	_UNS_TYPE_ ptr: 1; } _PTR_TYPE_;

Macintosh, CodeWarrior release 4 bitfields

Uns#31-8	Uns#7-3 	Uns#2 		Uns#1 		Uns#0
Nbr.nbr#30-7 	Nbr.nbr#6-2 	Nbr.nbr#1 	Nbr.nbr#0 	Nbr.ptr
Ptr.ofs#28-5 	Ptr.ofs#4-0 	Ptr.hdr 	Ptr.bsy 	Ptr.ptr
Hdr.siz#23-0 	Hdr.tag#4-0 	Hdr.hdr 	Hdr.bsy 	Hdr.ptr
Swi.pfx#28-5 	Swi.pfx#4-0 	Swi.sta#2 	Swi.sta#1 	Swi.sta#0

Visual C, Windows bitfields

Uns#31 		Uns#30 		Uns#29 		Uns#28-24 	Uns#23-0
Nbr.ptr 	Nbr.nbr#30 	Nbr.nbr#29 	Nbr.nbr#28-24 	Nbr.nbr#23-0
Ptr.ptr 	Ptr.bsy 	Ptr.hdr 	Ptr.ofs#28-24 	Ptr.ofs#23-0
Hdr.ptr 	Hdr.bsy 	Hdr.hdr 	Hdr.tag#4-0 	Hdr.siz#23-0
Swi.sta#2 	Swi.sta#1 	Swi.sta#0 	Swi.pfx#28-24 	Swi.pfx#23-0

Solaris bitfields

Uns#31-8     	Uns#7-3    	Uns#2      	Uns#1      	Uns#0
Nbr.nbr#30-7 	Nbr.nbr#6-2 	Nbr.nbr#1  	Nbr.nbr#0  	Nbr.ptr
Ptr.ofs#28-5 	Ptr.ofs#4-0 	Ptr.hdr    	Ptr.bsy    	Ptr.ptr
Hdr.siz#23-0 	Hdr.tag#4-0 	Hdr.hdr    	Hdr.bsy    	Hdr.ptr
Swi.pfx#28-5 	Swi.pfx#4-0 	Swi.sta#2  	Swi.sta#1  	Swi.sta#0
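A compiler's choice can be checked with a small probe. The sketch below is only an illustration: hdr_fields, exp_probe and probe_ptr_bit are hypothetical names, the struct copies the field order of _HDR_TYPE_, and the code assumes that unsigned int is 32 bits wide. It sets only the ptr field and then looks for the bit that changed in the unsigned view.

```c
#include <assert.h>
#include <stdint.h>

/* Same field order as _HDR_TYPE_; assumes a 32-bit unsigned int. */
typedef struct {
    unsigned int siz : 24;
    unsigned int tag : 5;
    unsigned int hdr : 1;
    unsigned int bsy : 1;
    unsigned int ptr : 1;
} hdr_fields;

typedef union {
    hdr_fields hdr;
    uint32_t   uns;
} exp_probe;

/* Returns the index (0 = least significant) of the bit in the
   unsigned view that is set when only hdr.ptr is 1. */
static int probe_ptr_bit(void)
{
    exp_probe e;
    int bit = 0;

    assert(sizeof(exp_probe) == sizeof(uint32_t));
    e.uns = 0;
    e.hdr.ptr = 1;
    assert(e.uns != 0);
    while (((e.uns >> bit) & 1u) == 0)
        bit++;
    return bit;
}
```

Going by the tables, such a probe would report bit 31 for the Visual C layout and bit 0 for the CodeWarrior and Solaris layouts. The C standard leaves the allocation order of bitfields to the implementation, so both answers are legal.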

What goes wrong?

Now, the problem arises when two things meet: 1) this nice ''platform independent'' conversion thingy (unions) and 2) the different bitfield representations on different platforms. As soon as we use a union to convert one value to another, the result depends on the platform.

E.g.: suppose we have a tag field on bits 1 to 4 and a raw bit on bit 1. We will then see different conversions on different architectures. On a Macintosh, we will probably see the value 1 in the raw bit if we write an odd number to the tag field. On other architectures, by contrast, we will only see the raw bit set when we write a tag larger than 7.
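The effect can be tried out with two overlapping views. This is a reduced sketch with hypothetical names (tag_view, raw_view, overlap, raw_after_tag), not the Borg declarations: both views start with the field of interest, so a 4-bit tag and a 1-bit raw flag share the first allocated bit, whichever end of the word that turns out to be.

```c
#include <assert.h>

/* Two views of the same word; assumes a 32-bit unsigned int. */
typedef struct { unsigned int tag : 4; unsigned int rest : 28; } tag_view;
typedef struct { unsigned int raw : 1; unsigned int rest : 31; } raw_view;

typedef union {
    tag_view t;
    raw_view r;
} overlap;

/* Write a tag through one view, read the raw bit through the other. */
static unsigned int raw_after_tag(unsigned int tag)
{
    overlap o;

    assert(tag < 16u);
    o.r.raw  = 0;      /* clear the whole word through the raw view */
    o.r.rest = 0;
    o.t.tag  = tag;    /* write through the tag view */
    return o.r.raw;    /* read back through the raw view */
}
```

A compiler that allocates bitfields from the least significant bit makes the raw bit follow the lowest tag bit, so odd tags set it. A compiler that allocates from the most significant bit makes it follow the highest tag bit, so only tags larger than 7 set it.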

This problem is unbelievably hard to address because it depends on 1) the representation you think the compiler should assign, 2) the compiler's choice of where to place the bitfields and 3) the hardware, which is sometimes unable (or too slow, as on e.g. Solaris) to access bitfields and reverse them.

So, we decided to stop using this ugly C-feature: unions.
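The text does not say what replaced the unions. One common alternative, sketched here under stated assumptions, is to fix the layout by hand with shifts and masks on a plain 32-bit integer, so that the compiler has no say in where a field lands. All names are hypothetical; the bit positions follow the Visual C table above (ptr in bit 31, bsy in bit 30, hdr in bit 29, tag in bits 28 to 24, siz in bits 23 to 0), and a header is assumed to have both the ptr and hdr bits set.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed layout: ptr=31, bsy=30, hdr=29, tag=28..24, siz=23..0. */
#define HDR_PTR_BIT    UINT32_C(0x80000000)
#define HDR_BSY_BIT    UINT32_C(0x40000000)
#define HDR_HDR_BIT    UINT32_C(0x20000000)
#define HDR_TAG_SHIFT  24
#define HDR_TAG_MASK   UINT32_C(0x1F)
#define HDR_SIZ_MASK   UINT32_C(0x00FFFFFF)

/* Build a header cell from a size and a tag (busy bit left clear). */
static uint32_t hdr_make(uint32_t siz, uint32_t tag)
{
    assert(siz <= HDR_SIZ_MASK);
    assert(tag <= HDR_TAG_MASK);
    return HDR_PTR_BIT | HDR_HDR_BIT | (tag << HDR_TAG_SHIFT) | siz;
}

static uint32_t hdr_siz(uint32_t h) { return h & HDR_SIZ_MASK; }
static uint32_t hdr_tag(uint32_t h) { return (h >> HDR_TAG_SHIFT) & HDR_TAG_MASK; }
static int      hdr_is_busy(uint32_t h) { return (h & HDR_BSY_BIT) != 0; }

/* Raw data is signalled by the least significant bit of the tag. */
static int hdr_is_raw(uint32_t h) { return (int)(hdr_tag(h) & 1u); }
```

Because every position is written out explicitly, the same source gives the same bit pattern on every compiler, at the cost of accessor functions or macros instead of field syntax.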