RfD: Second attempt at Structures
Stephen Pelc
2006-08-22 15:11:28 UTC
I've had another go at structures, making the assumption that the
"name-first" and "name-last" camps can't agree and must coexist
somehow.

Please note that standard compliance is satisfied by provision of
source code. Forth200x standards proposals come with reference
implementations. Surely compliance can nowadays be provided by
a URL?

Regards, Stephen


RfD: Second attempt at Structures
22 August 2006, Stephen Pelc

20060822 Rewrite after criticism
20060821 First draft


Rationale
=========

Problem
-------
Virtually all serious Forth systems provide a means of defining
data structures. The notation is different for nearly all of them.

Current practice
----------------
Too varied to merge or agree on one set, either of names or
of stack effects. In particular there are two camps, one which
defines the name of the structure at the start, and one which
defines it at the end. Examples are:

STRUCT POINT \ -- len
  1 CELLS FIELD P.X
  1 CELLS FIELD P.Y
END-STRUCT

STRUCT{
  1 CELLS FIELD P.X
  1 CELLS FIELD P.Y
}STRUCT POINT \ -- len

Discussion
----------
The name first implementation of STRUCT and END-STRUCT is not
difficult.

: STRUCT \ -- addr 0 ; -- size
  CREATE HERE 0 0 , DOES> @ ;
: END-STRUCT \ addr n --
  SWAP ! ; \ set size
: FIELD \ n1 n2 "name" -- n3 ; addr -- addr+n1
  CREATE OVER , + DOES> @ + ;

The name last implementations of STRUCT{ and }STRUCT are trivial
even for native code compilers:

: STRUCT{ \ -- 0
  0 ;
: }STRUCT \ size "name" --
  CONSTANT ;
: FIELD \ n1 n2 "name" -- n3 ; addr -- addr+n1
  CREATE OVER , + DOES> @ + ;

In both cases the definition of FIELD is the same. Further words
that operate on types can all use FIELD and so can be common to
both camps.
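
As a small sketch of that claim, the definitions above can be loaded together and the two styles compared. The names POINT1, POINT2, P1.X and so on are chosen here only so that both structures can coexist in one dictionary; they are not part of the proposal.

```forth
\ FIELD as given above, shared by both styles
: FIELD  ( n1 n2 "name" -- n3 ; addr -- addr+n1 )
  CREATE OVER , +  DOES> @ + ;

\ name-first style
: STRUCT  ( "name" -- addr 0 ; -- size )
  CREATE HERE 0 0 ,  DOES> @ ;
: END-STRUCT  ( addr n -- )
  SWAP ! ;

\ name-last style
: STRUCT{  ( -- 0 )  0 ;
: }STRUCT  ( size "name" -- )  CONSTANT ;

STRUCT POINT1
  1 CELLS FIELD P1.X
  1 CELLS FIELD P1.Y
END-STRUCT

STRUCT{
  1 CELLS FIELD P2.X
  1 CELLS FIELD P2.Y
}STRUCT POINT2

POINT1 POINT2 = .       \ prints -1: both report the same size
0 P1.Y  0 P2.Y  = .     \ prints -1: both fields have the same offset
```

On a system that already defines STRUCT or FIELD, loading this will redefine them and may produce a redefinition warning.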

For many modern compilers, implementing FIELD to provide efficient
execution requires carnal knowledge of the system.

Solution
--------
The intention of this proposal is to find a portable notation
for defining data structures using word names that do not
conflict with existing systems, and to provide easy portability
of code between systems.

System implementors can then define these words in terms of
their existing structure definition words to enhance performance.

Authors of portable code will have a common point of
reference.

The approach is to define FIELD so as to enable both camps to
co-exist. Since the name last approach is the simplest and only
requires synonyms to define the start and end words, we simply
recommend that these users write portable code in the form:

0
  a FIELD b
  c FIELD d
CONSTANT somestruct
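
A concrete instance of this form might look like the following sketch. It assumes the simple CREATE ... DOES> definition of FIELD from the discussion section; ORIGIN is a hypothetical name used only to show storage being reserved and accessed.

```forth
: FIELD  ( n1 n2 "name" -- n3 ; addr -- addr+n1 )
  CREATE OVER , +  DOES> @ + ;

0                          \ initial offset
  1 CELLS FIELD P.X
  1 CELLS FIELD P.Y
CONSTANT POINT             \ final offset becomes the structure size

CREATE ORIGIN  POINT ALLOT \ reserve storage for one point
3 ORIGIN P.X !
4 ORIGIN P.Y !
ORIGIN P.X @  ORIGIN P.Y @  + .   \ prints 7
```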

We now turn to considering name first implementations, with the
objective of providing a simple means of porting to name last
implementations. Using the usual point and rectangle example,
we can define structures:

BEGIN-STRUCTURE point \ -- a-addr 0 ; -- lenp
  1 CELLS FIELD p.x \ -- a-addr cell
  1 CELLS FIELD p.y \ -- a-addr cell*2
END-STRUCTURE

BEGIN-STRUCTURE rect \ -- a-addr 0 ; -- lenr
  point FIELD r.tlhc \ -- a-addr cell*2
  point FIELD r.brhc \ -- a-addr cell*4
END-STRUCTURE

The proposal text below does not require FIELD to align any
item. This is deliberate, and allows the construction of groups of
bytes. Because the current size of the structure is available on
the top of the stack, words such as ALIGNED (6.1.0706) can be used.
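
For illustration only, a group of bytes followed by an explicitly aligned cell could be written as below. The structure and field names are hypothetical, FIELD is the simple definition from the discussion section, and the literal 3 assumes one address unit per byte.

```forth
: FIELD  ( n1 n2 "name" -- n3 ; addr -- addr+n1 )
  CREATE OVER , +  DOES> @ + ;

0
  3 FIELD REC.RGB            \ three address units, no alignment applied
  ALIGNED                    \ round the running offset up to a cell boundary
  1 CELLS FIELD REC.COUNT
CONSTANT REC

0 REC.COUNT DUP ALIGNED = .  \ prints -1: the offset of REC.COUNT is aligned
```

The offset is only aligned relative to the start of the structure, so the structure itself must still be placed at an aligned address.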

Compatibility between the name first and name last camps is achieved
by defining the stack item struct-sys, which is implementation
dependent and can be 0 or more cells.

Although this is a sufficient set, most systems provide a means of
defining new types in terms of their size. I couldn't resist trying
to wordsmith a second order defining word!

The recommended types are:
  CFIELD   1 character
  IFIELD   native integer (single cell)
  FFIELD   native float
  SFFIELD  32 bit float
  DFFIELD  64 bit float
The following cannot be done until the required addressing has
been defined. The names should be considered reserved until then.
  BFIELD   1 byte (8 bit) field
  WFIELD   16 bit field
  LFIELD   32 bit field
  XFIELD   64 bit field

Name first minimalists are reminded that it is adequate to provide
the source code (even by reference) in this proposal.

Proposal
========
10.6.2.aaaa FIELD
field FACILITY EXT

( struct-sys n1 n2 "<spaces>name" -- struct-sys n3 )
Skip leading space delimiters. Parse name delimited by a space.
Create a definition for name with the execution semantics defined
below. Return struct-sys unchanged and n3=n1+n2. n1 is the offset in
the data structure before FIELD executes, and n2 is the size of
the data to be added. n1 and n2 are in address units.

name Execution:
( addr -- addr+n1 )
Add n1 from the execution of FIELD above to addr.


10.6.2.bbbb FIELD-TYPE
field-type FACILITY EXT

( n "<spaces>name" -- )
Skip leading space delimiters. Parse name delimited by a space.
Create a definition for name with the execution semantics defined
below. n is the size (in address units) of the field to be created
by execution of name.

name Execution:
( struct-sys n1 "<spaces>name2" -- struct-sys n2 )
Skip leading space delimiters. Parse name delimited by a space.
Create a definition for name2 with the execution semantics defined
below. Return struct-sys unchanged and n2=n1+n. n1 is the offset in
the data structure before name executes, and n2 is the sum of n1
and n from the execution of FIELD-TYPE. n1 and n2 are address
units.

name2 Execution:
( addr -- addr+n1 )
Add n1 from the execution of name above to addr.


10.6.2.cccc BEGIN-STRUCTURE
struct FACILITY EXT

( "<spaces>name" -- struct-sys 0 )
Skip leading space delimiters. Parse name delimited by a space.
Create a definition for name with the execution semantics defined
below. Return an aligned address that will be used by END-STRUCTURE
(10.6.2.dddd) and an initial offset of 0.

name Execution: ( -- +n )
+n is the size in memory expressed in address units of the data
structure.


10.6.2.dddd END-STRUCTURE
end-structure FACILITY EXT
( struct-sys +n -- )
Terminate definition of a structure started by BEGIN-STRUCTURE
(10.6.2.cccc).


10.6.2.eeee CFIELD
c-field FACILITY EXT
The execution semantics of CFIELD are identical to the execution
semantics of the phrase
1 CHARS FIELD


10.6.2.ffff IFIELD
i-field FACILITY EXT
The execution semantics of IFIELD are identical to the execution
semantics of the phrase
1 CELLS FIELD


10.6.2.gggg FFIELD
f-field FACILITY EXT
The execution semantics of FFIELD are identical to the execution
semantics of the phrase
1 FLOATS FIELD


10.6.2.hhhh SFFIELD
s-f-field FACILITY EXT
The execution semantics of SFFIELD are identical to the execution
semantics of the phrase
1 SFLOATS FIELD


10.6.2.iiii DFFIELD
d-f-field FACILITY EXT
The execution semantics of DFFIELD are identical to the execution
semantics of the phrase
1 DFLOATS FIELD

Labelling
=========
ENVIRONMENT? impact - table 3.5 in Basis1
name stack condition

THROW/ior impact - table 9.2 in Basis1
value text


Reference Implementation
========================
Modified from VFX Forth for Windows

: begin-structure \ -- addr 0 ; -- size
\ *G Begin definition of a new structure. Use in the form
\ ** *\fo{BEGIN-STRUCTURE <name>}. At run time *\fo{<name>}
\ ** returns the size of the structure.
  create
    here 0 0 , \ mark stack, lay dummy
  does> @ ; \ -- rec-len

: end-structure \ addr n --
\ *G Terminate definition of a structure.
  swap ! ; \ set len

: field \ n1 n2 <"name"> -- n3 ; Exec: addr -- 'addr
\ *G Create a new field of size n2 bytes at offset n1 within a
\ ** structure definition, returning the new offset n3=n1+n2.
  create
    over , +
  does>
    @ +
;

: field-type \ +n --
\ *G Define a new field type of size +n bytes. Use in the form
\ ** *\fo{<size> FIELD-TYPE <name>}. When *\fo{<name>} executes
\ ** in the form *\fo{<name> <name2>}, a field *\fo{<name2>}
\ ** of size +n bytes is created.
  create
    ,
  does>
    @ field
;

1 chars field-type cfield
1 cells field-type ifield
1 floats field-type ffield
1 sfloats field-type sffield
1 dfloats field-type dffield
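
A short usage sketch, assuming definitions equivalent to the reference implementation above (repeated here so that the fragment loads on its own). MYRECT is a hypothetical name, and only IFIELD is defined since the example needs no other field type.

```forth
: BEGIN-STRUCTURE  ( "name" -- addr 0 ; -- size )
  CREATE HERE 0 0 ,  DOES> @ ;
: END-STRUCTURE  ( addr n -- )
  SWAP ! ;
: FIELD  ( n1 n2 "name" -- n3 ; addr -- addr+n1 )
  CREATE OVER , +  DOES> @ + ;
: FIELD-TYPE  ( n "name" -- )
  CREATE ,  DOES> @ FIELD ;

1 CELLS FIELD-TYPE IFIELD

BEGIN-STRUCTURE POINT
  IFIELD P.X
  IFIELD P.Y
END-STRUCTURE

BEGIN-STRUCTURE RECT
  POINT FIELD R.TLHC        \ POINT now returns its size
  POINT FIELD R.BRHC
END-STRUCTURE

CREATE MYRECT  RECT ALLOT   \ reserve storage for one rectangle
10 MYRECT R.BRHC P.X !      \ field words compose for nested access
MYRECT R.BRHC P.X @ .       \ prints 10
RECT POINT 2* = .           \ prints -1: a rect is two points
```

Systems that already provide BEGIN-STRUCTURE or FIELD will see these as redefinitions.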


Test Cases
==========
--
Stephen Pelc, ***@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
Anton Ertl
2006-08-26 12:40:55 UTC
Post by Stephen Pelc
I've had another go at structures, making the assumption that the
"name-first" and "name-last" camps can't agree and must coexist
somehow.
I am in the name-last camp, but even though that offers the advantage
of extensibility, this is not such a big issue that I could not live
with a name-first-only proposal. Of course, since you do name-last
for free, that's also ok.
Post by Stephen Pelc
The proposal text below does not require FIELD to align any
item. This is deliberate, and allows the construction of groups of
bytes.
You can do arbitrarily aligned bytes even in a system that supports
alignment, like this:

struct
cell% field foo
char% field bar
byte% 5 * field flip \ <-- group of bytes
float% field flop
...
Post by Stephen Pelc
Because the current size of the structure is available on
the top of the stack, words such as ALIGNED (6.1.0706) can be used.
But will they be used? Chances are that many users will try to write the
equivalent of the above like this:

0
1 cells field foo
1 chars field bar \ if they use chars at all
5 field flip \ assuming au=byte
1 floats field flop \ misaligned!
...

As a consequence, accesses to FLOP will be non-standard; they will be
slow on many machines (without the programmer noticing the reason for
the slowness), and fail on others.
Post by Stephen Pelc
Proposal
========
10.6.2.aaaa FIELD
field FACILITY EXT
( struct-sys n1 n2 "<spaces>name" -- struct-sys n3 )
Unfortunately, this has a conflict with Gforth, which has a word FIELD
with the stack effect ( n1 n2 n3 n4 -- n5 n6 ). Please choose a
different name (but not X_FIELD:-). Does FIELD: have a conflict?
Post by Stephen Pelc
Return struct-sys unchanged
No need to mention struct-sys in the stack effect, then, and no need
for this sentence. Stuff deeper in the stack than covered by the
stack effect is never changed.

Actually, your requirement for struct-sys would prevent the name-last
use of FIELD that you outlined.
Post by Stephen Pelc
10.6.2.cccc BEGIN-STRUCTURE
struct FACILITY EXT
( "<spaces>name" -- struct-sys 0 )
Skip leading space delimiters. Parse name delimited by a space.
Create a definition for name with the execution semantics defined
below. Return an aligned address that will be used by END-STRUCTURE
^^^^^^^^^^^^^^^ struct-sys
Post by Stephen Pelc
(10.6.2.dddd) and an initial offset of 0.
name Execution: ( -- +n )
+n is the size in memory expressed in address units of the data
structure.
I see a timing problem here: This creates name with its execution
semantics right at the start, but how does name know at the start how
big the structure is going to be? You should probably specify that
executing name before its struct-sys has been processed by
END-STRUCTURE is an ambiguous condition (or something like this).
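
To make the timing issue concrete, here is a sketch using definitions equivalent to the reference implementation in the proposal. NODE and its fields are hypothetical names, and the behaviour shown is that of this particular implementation, not something the proposal text guarantees.

```forth
: BEGIN-STRUCTURE  ( "name" -- addr 0 ; -- size )
  CREATE HERE 0 0 ,  DOES> @ ;
: END-STRUCTURE  ( addr n -- )
  SWAP ! ;
: FIELD  ( n1 n2 "name" -- n3 ; addr -- addr+n1 )
  CREATE OVER , +  DOES> @ + ;

BEGIN-STRUCTURE NODE
  1 CELLS FIELD N.NEXT
  NODE .                  \ prints 0: the size cell still holds the dummy
  1 CELLS FIELD N.VALUE
END-STRUCTURE

NODE 2 CELLS = .          \ prints -1: the size is valid only from here on
```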
Post by Stephen Pelc
10.6.2.eeee CFIELD
...

IMO these don't save enough typing to be worth having in the standard,
especially since you don't even include the alignment.
Post by Stephen Pelc
10.6.2.ffff IFIELD
i-field FACILITY EXT
The execution semantics of IFIELD are identical to the execution
semantics of the phrase
1 CELLS FIELD
If we have this, it should at least be equivalent to

ALIGNED 1 CELLS FIELD

and likewise for FFIELD etc.
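
A minimal sketch of such aligning field words, assuming the simple CREATE ... DOES> definition of FIELD from the proposal and a system with the floating-point word set. SAMPLE and its field names are hypothetical.

```forth
: FIELD  ( n1 n2 "name" -- n3 ; addr -- addr+n1 )
  CREATE OVER , +  DOES> @ + ;

\ Each typed field word aligns the running offset before adding the field
: IFIELD  ( n1 "name" -- n2 )  ALIGNED   1 CELLS  FIELD ;
: FFIELD  ( n1 "name" -- n2 )  FALIGNED  1 FLOATS FIELD ;

0
  1 CHARS FIELD S.TAG
  IFIELD S.COUNT            \ offset rounded up to a cell boundary
  1 CHARS FIELD S.FLAG
  FFIELD S.VALUE            \ offset rounded up to a float boundary
CONSTANT SAMPLE

0 S.COUNT DUP ALIGNED  = .  \ prints -1
0 S.VALUE DUP FALIGNED = .  \ prints -1
```

The offsets are aligned relative to the structure start, so an instance must itself begin at a suitably aligned address for the accesses to be aligned.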

Fup2 c.l.f.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2006: http://www.complang.tuwien.ac.at/anton/euroforth2006/