The FLIP Group: Ferri-Ramírez, Cèsar; Hernandez-Orallo, José; Ramirez-Quintana, M. José
(a subgroup of the Extensions of Logic Programming Group).
The following definitions are only suggestions. They highlight the
utility of the current RuleML definition in some unexpected ways and
may trigger discussion about future additions to the RuleML standard.
In no way do they constitute an alternative proposal to RuleML or part of
any other project.
This page just includes some rough ideas iteratively refined from an
email correspondence between Harold
Boley and us.
However, we need the notion of a fact and a definition of a set of facts
that acts as evidence, i.e., factual rulebases.
First, we tried defining an element 'fact' in the obvious
way:
<!ELEMENT fact (%conc;)>
but later we required that facts could include 'ands', so we redefined it as follows:
<!ELEMENT fact (%prem;)>
<!ATTLIST fact
label ID #IMPLIED
instantiation (ground | nonground) #IMPLIED
assertion (negative) #IMPLIED
>
We have included the optional attributes instantiation and assertion to distinguish between ground and non-ground facts and between positive and negative facts.
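As an illustration of the assertion attribute, the following sketch marks a single false equation as a negative fact. It assumes that 'eq', 'nano', 'fun' and 'ind' nest as in the rule examples on this page; the label "neg1" and the equation itself are hypothetical and are not taken from the accompanying files.

```xml
<!-- hypothetical negative example: fac(2) = 3 does not hold -->
<fact label="neg1" assertion="negative">
  <eq>
    <nano>
      <fun>fac</fun>
      <ind>2</ind>
    </nano>
    <ind>3</ind>
  </eq>
</fact>
```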
From here, it's easy to define new elements for positive evidence, negative evidence and both positive and negative evidence:
<!ELEMENT factbase (fact*)>
<!ELEMENT neg-factbase (fact*)>
<!ELEMENT evidence (factbase, neg-factbase?)>
<!ATTLIST evidence
label ID #IMPLIED
>
<!ELEMENT knowledgebase (rulebase | evidence | factbase)*>
We have now redefined labels as ID instead of CDATA. Typing labels as ID lets a set of rules refer to an evidence, which is what we need in order to express metaknowledge, as discussed below.
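To make these content models concrete, here is a minimal sketch of a 'knowledgebase' that bundles one rule with an evidence holding one positive and one negative example. The equations and the label "ev1" are illustrative only. We also assume that 'rulebase' accepts 'if' rules as in the rulebase example on this page, and that a fact placed inside 'neg-factbase' needs no assertion attribute.

```xml
<knowledgebase>
  <rulebase>
    <!-- fac(0) = 1, stated as a rule with an empty premise -->
    <if label="rule1">
      <eq><nano><fun>fac</fun><ind>0</ind></nano><ind>1</ind></eq>
      <and/>
    </if>
  </rulebase>
  <evidence label="ev1">
    <factbase>
      <!-- positive example: fac(1) = 1 -->
      <fact>
        <eq><nano><fun>fac</fun><ind>1</ind></nano><ind>1</ind></eq>
      </fact>
    </factbase>
    <neg-factbase>
      <!-- negative example: fac(1) = 2 does not hold -->
      <fact>
        <eq><nano><fun>fac</fun><ind>1</ind></nano><ind>2</ind></eq>
      </fact>
    </neg-factbase>
  </evidence>
</knowledgebase>
```

Because labels are IDs, "rule1" and "ev1" must be unique within the document.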
A labeled rule looks like this:
<if label="rule1">
<eq>
<nano>
<fun>fac</fun>
<ind>0</ind>
</nano>
<ind>1</ind>
</eq>
<and/>
</if>
An example of evidence is written in a similar way:
<evidence label="fact4examples">
<!-- an example of evidence -->
<factbase>
<fact>
<eq>
<nano>
<fun>fac</fun>
<ind>0</ind>
</nano>
<ind>1</ind>
</eq>
</fact>
...
This makes it possible to express metaknowledge with equalog rules, instead of constructing new definitions for each kind of metainformation as PMML does:
<rulebase>
<!-- an example of metaknowledge about a set of rules and an evidence -->
<if>
<eq>
<nano>
<fun>accuracy</fun>
<ur> factorial.ruleml </ur>
<ur> factorial-evidence.ruleml </ur>
</nano>
<ind>0.5</ind>
</eq>
<and/>
</if>
</rulebase>
which states that the accuracy of the program found at the URL "factorial.ruleml" with respect to the evidence found at the URL "factorial-evidence.ruleml" is 0.5.
We do not know whether it would be more convenient to use 'ids' and
'idrefs' instead of URLs (maybe using XPointer/XLink), so that a
specifically identified rule in a RuleML document can be referenced.
The problem in this case would be a type mismatch, because 'evidences'
and 'rulesets' cannot be used as expressions inside a <nano>.
Please find our "ruleml-urinduction-standalone.dtd",
the rulebase "factorial.ruleml" (slightly
modified from the RuleML webpage), an evidence "factorial-evidence.ruleml"
and a piece of metaknowledge "factorial-accuracy.ruleml"
including a fact about the accuracy of the rulebase with respect to the evidence.
The first and easiest thing to do is to express the evidence for a classical decision tree, e.g. Quinlan's playtennis example. The evidence is expressed using RuleML in the file "playtennis-evidence.ruleml".
To express the tree, which reads as follows in a forward notation:
SKY=overcast (4, 4yes, 0no) then yes (acc=1.0)
SKY=rain (5, 3yes, 2no)
    WIND=weak (3, 3yes, 0no) then yes (acc=1.0)
    WIND=strong (2, 0yes, 2no) then no (acc=1.0)
SKY=sunny (5, 2yes, 3no)
    HUMIDITY=normal (2, 2yes, 0no) then yes (acc=1.0)
    HUMIDITY=high (3, 0yes, 3no) then no (acc=1.0)
we have implemented four versions, with different degrees of metaknowledge
with respect to the evidence "playtennis-evidence.ruleml".
The first one "playtennis-tree1.ruleml"
just splits the tree into separate, independent rules. The rules are
unconditional, and we are able to express the accuracy of the rules (leaves
of the tree) and their support.
The second one "playtennis-tree2.ruleml"
is much like the first one, but rules are conditional.
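As a rough sketch of what a conditional rule for one leaf could look like, the branch "SKY=rain, WIND=weak then yes" is written below as an equalog rule. The function names 'playtennis', 'sky' and 'wind', the variable 'day', and the assumption that 'and' may contain 'eq' elements are ours for illustration; this does not reproduce the contents of "playtennis-tree2.ruleml".

```xml
<rulebase>
  <!-- hypothetical encoding of: SKY=rain, WIND=weak then yes -->
  <if>
    <eq>
      <nano>
        <fun>playtennis</fun>
        <var>day</var>
      </nano>
      <ind>yes</ind>
    </eq>
    <and>
      <eq>
        <nano><fun>sky</fun><var>day</var></nano>
        <ind>rain</ind>
      </eq>
      <eq>
        <nano><fun>wind</fun><var>day</var></nano>
        <ind>weak</ind>
      </eq>
    </and>
  </if>
</rulebase>
```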
The third one "playtennis-tree3.ruleml"
is the one that motivates the changes that appear in the DTD. It separates
the conditions (premises) from the conclusions, in order to be able to
express metaknowledge about the internal nodes of the tree. It also makes
extensive use of ids and idrefs in order to express conditions just
once. For instance, in a decision tree it is usual to have rules of the
form "f=true :- q, r, s", "f=false :- q, r, t" that can be expressed more
concisely by using
references.
Another option is the use of rules inside rules, but this gives
more or less the same result as creating an auxiliary predicate,
i.e., defining "u :- q, r" along with "f=true :- u, s", "f=false :- u, t".
This yields our fourth version "playtennis-tree4.ruleml",
which is the one that requires the fewest modifications to the RuleML 0.7 DTD.
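The auxiliary-predicate variant can be sketched as follows. Encoding the propositions q, r, s, t and u as boolean-valued functions of a variable 'x' is our assumption for illustration, as is letting 'and' contain 'eq' elements; the sketch does not reproduce "playtennis-tree4.ruleml".

```xml
<rulebase>
  <!-- u :- q, r  (the shared conditions, stated once) -->
  <if>
    <eq><nano><fun>u</fun><var>x</var></nano><ind>true</ind></eq>
    <and>
      <eq><nano><fun>q</fun><var>x</var></nano><ind>true</ind></eq>
      <eq><nano><fun>r</fun><var>x</var></nano><ind>true</ind></eq>
    </and>
  </if>
  <!-- f=true :- u, s -->
  <if>
    <eq><nano><fun>f</fun><var>x</var></nano><ind>true</ind></eq>
    <and>
      <eq><nano><fun>u</fun><var>x</var></nano><ind>true</ind></eq>
      <eq><nano><fun>s</fun><var>x</var></nano><ind>true</ind></eq>
    </and>
  </if>
  <!-- f=false :- u, t -->
  <if>
    <eq><nano><fun>f</fun><var>x</var></nano><ind>false</ind></eq>
    <and>
      <eq><nano><fun>u</fun><var>x</var></nano><ind>true</ind></eq>
      <eq><nano><fun>t</fun><var>x</var></nano><ind>true</ind></eq>
    </and>
  </if>
</rulebase>
```

The shared conditions q and r appear only in the rule for u, so metaknowledge about that internal node can be attached to a single rule.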
We also thought about expressing the tree as a single rule, using 'or'
as well as 'and', but then we had the problem of expressing knowledge
about parts of a rule. Consequently, we dropped the idea. However, we think
that 'or' could be useful for some expressions. At least it has fewer problems
than negation.