A5.1 Reduction rules
Boolean expressions in SPAN are manipulated primarily by transposition and by expansion
as described below. To carry out these operations, basic set algebraic rules for
reducing regular Boolean expressions are invoked. For example, using XX
to mean XÇX:
XXW = XW and (XW)ÈX
= X
for any attribute X and non-null set W (which may be a combination
of attributes e.g W = X2X3X4).
Repeated application of these rules is generally sufficient to simplify regular
Boolean expressions generated in a search. They may not, however, simplify very
complex Boolean expressions involving attributes and their complements or attributes
that are somehow nested. These can generally be accomplished with the
Full Boolean reduction option (see A5.4).
A5.2 Transposing partitions
In order that subgroups, as defined in reference 2, are determined, it is necessary
to obtain both A and A- in normal disjunctive form. For example,
consider the p1,...,pq = 3,3,2 partition
A = (X1X2X3)È(X1X3X4)È(X2X5)
Then
A- = (X1-
ÈX2- ÈX3-)Ç(X1-
ÈX3- ÈX4-)Ç(X2-
ÈX5-)
but we need to re-express A- in disjunctive form and to do this it is it
is necessary to "transpose" the above expression by "cross-multiplying" the expression
and carrying out simplifications. If you cross multiply this expression, there will
be 3×3×2 = 18 terms, each with the intersection of three Xi--'s.
For example, the first is X1-X1-X2-, the next X1-X1-X5-, and so on. The first term
simplifies to X1-X2- and later terms such as X1-X3- X2- become absorbed by it. The
18, three element, terms finally simplify to five terms, each with two elements:
A-
= (X1- X2-)È(X1-
X5-)È(X2- X3-)È(X2-
X4-)È(X3- X5-)
This process is fairly simple to computerise and can be made efficient by iterative
"partial transpositions" followed by simplification. Details are omitted.
The process of transposition is done for every partition that is generated by a
search, so that, as mentioned above (A4.1), the Boolean manipulations may slow the
search somewhat when p1,...,pq becomes large.
A5.3 Expanding out partitions
Another operation that is required by SPAN is to expand out partitions in which
there is an added attribute. For example suppose
B = (X1X2)È(X3X4)
is an added attribute and that a partition is generated with
A = (BX3)È(X4)
Then, substituting for B gives
A = (X1X2X3)È(X3X4X3)È(X4) = (X1X2X3)È(X4)
Again, computerising this process is fairly straightforward, with the above reduction
rules for simplifying Boolean expressions.
Note that if an attribute is created from the Y menu option
Transform SPAN does
not attempt to expand out the attribute or perform any simplifications of it. For
example, suppose an attribute is lbw=0
is formed by dichotomising a birth weight variable
BWT at 2.5 kg. Then the partition A =(lbw=0
& BWT <=2.5) is not simplified at all. Once
lbw is formed it becomes a primitive attribute, in the sense
described in 10.2.
A5.4 Full Boolean reductions
One facility of the Option Menu is Full Boolean
reduction (see 17.3). This procedure utilises information about the
data from which attributes are derived. The procedure will only reduce Boolean expressions,
beyond those reductions done automatically by applying the two rules in A5.1, when
attributes in expressions are created from the same underlying variable.
For this type of Boolean reduction, additional reduction rules are invoked which
take account of whether attributes are subsetted or are complements of each other.
For example, the reduction rule
X È(X- W)
= XÈW
is invoked. Also the rule
(X W Z)È(X- Y Z)È(W Y Z V) = (X W Z)È(X- Y
Z )
The option also takes account of attributes formed from different cuts of a variable.
For example, if X1
= { x £ 1 } and X2 = { x
£ 2 } are attributes derived from x
by cutpoints 1 and 2, then X1 is a subset of X2
and X1ÇX2
= X1.
Multiple attributes derived from nominal variables are more difficult to deal with
since the intersection or union of attributes so formed depends on how the attributes
are defined. Consider, for example, a nominal variable x with possible categories
1,2,3 and 4. If you include the control file attribute creating lines:
X n1234 12
X n1234 34
you will effectively create attributes X1 = { x = 1,2 } and X2 = { x = 3,4 } . Here X1X2
is null, whatever the state space of x, while X1ÈX2 is the universal
set since the state space is defined by 1234. If the universe 1234 is omitted, SPAN
assumes it to be the set of integers 0123456789 (see 9.3.1) in which case X1ÈX2 is not the universal
set.
A5.5 Null partitions
As mentioned in 9.9, a null attribute
is automatically created by SPAN. Its purpose is to recognise Boolean expressions
that are null. For example, referring to the above, the expression A =
X1ÇX2 is null, the empty set,
and would be represented in SPAN by A=(.=null)
with complement A-=(^=null).
[Back to table of contents]