To give an intuitive understanding of the classification model, we
explain it using the tables below. The tables record whether a
certain person played golf or did not play golf under various weather
conditions. The left table shows the 9 cases in which golf was
played, together with the weather data. The middle table shows the 5
cases in which golf was not played, together with the weather data.
Now, here is the problem. Using the past cases in these two tables
as a reference, will this person play golf tomorrow, when the forecast
is weather "sunny", temperature "60", humidity "90", and wind "none"?
We often encounter this kind of problem in our daily lives. Usually,
a person judges on the basis of "something" obtained from past
experience. It is difficult to state that "something" completely
and explicitly, but a model is that "something" stated explicitly in a
restricted form. In the golf example, we build a classification
model from the 14 past cases, using only the four weather conditions
as items, that describes how the decision whether to play golf is
made. On the basis of that model, we then classify (estimate) data
for which it is unknown whether golf is played.
Source: J.R. Quinlan, "Data Analysis by AI", 1995, p. 18, Figure 2-1,
revised and corrected.
Before building a model from the 14 cases above, let us consider under
what weather conditions this person plays golf (or, conversely, under
what weather conditions this person does not play golf). Please try
to find the rules. As you have probably noticed, the first thing
that catches the eye is that when the weather is "cloudy", only cases
in which golf was played are observed. In other words, we can
presume that this person tends to play golf if the weather is cloudy.
Then what about "rainy" days? The rainy-day cases include both
days on which golf was played and days on which it was not, so rain by
itself does not seem to be closely related to whether golf is played.
If we look at "wind" on rainy days, however, there was no wind in all
of the cases in which golf was played (3 cases), and there was wind in
all of the cases in which golf was not played (2 cases). In other
words, on rainy days, whether or not there is wind seems to affect
whether golf is played.
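The observations about cloudy and rainy days can be checked mechanically. The following is a minimal sketch that counts outcomes in the 14 cases. The English value spellings ("sunny", "cloudy", "rain", "yes"/"no" for wind, "play"/"no_play") and the helper name outcome_counts are assumptions made for illustration and are not part of MUSASHI.

```python
from collections import Counter

# The 14 golf cases: (weather, temperature, humidity, wind, golf).
# Value spellings are illustrative assumptions.
CASES = [
    ("sunny", 85, 85, "no", "no_play"),
    ("sunny", 80, 90, "yes", "no_play"),
    ("cloudy", 83, 78, "no", "play"),
    ("rain", 70, 96, "no", "play"),
    ("rain", 68, 80, "no", "play"),
    ("rain", 65, 70, "yes", "no_play"),
    ("cloudy", 64, 65, "yes", "play"),
    ("sunny", 72, 95, "no", "no_play"),
    ("sunny", 69, 70, "no", "play"),
    ("rain", 75, 80, "no", "play"),
    ("sunny", 75, 70, "yes", "play"),
    ("cloudy", 72, 90, "yes", "play"),
    ("cloudy", 81, 75, "no", "play"),
    ("rain", 71, 80, "yes", "no_play"),
]


def outcome_counts(cases, weather=None, wind=None):
    """Count golf outcomes among the cases matching the given conditions."""
    selected = [
        c for c in cases
        if (weather is None or c[0] == weather)
        and (wind is None or c[3] == wind)
    ]
    return Counter(c[4] for c in selected)


cloudy = outcome_counts(CASES, weather="cloudy")
rain = outcome_counts(CASES, weather="rain")
rain_calm = outcome_counts(CASES, weather="rain", wind="no")
rain_windy = outcome_counts(CASES, weather="rain", wind="yes")

print("cloudy:", dict(cloudy))
print("rain:", dict(rain))
print("rain, no wind:", dict(rain_calm))
print("rain, wind:", dict(rain_windy))
```

Cloudy days contain only one outcome, rainy days contain both, and splitting the rainy days by wind separates the two outcomes completely.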
So far, by looking at the cases in which the weather is cloudy or
rainy, we have found two rules. But what about the other items,
"temperature", "humidity", and "wind"? For example, what happens
when the temperature is high, and what happens when it is low?
Moreover, different viewpoints emerge depending on the number of
degrees at which we divide "high" from "low". Furthermore, to form
a classification model, we must find a set of rules that can explain
all of the cases. We have found a rule for cloudy weather, but the
rule for sunny weather is still unknown, and that is a problem.
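To illustrate how the choice of dividing point changes what a numerical item can explain, the sketch below tries every observed value of temperature and humidity as a cut point on the five sunny cases. It counts how many cases would be misjudged if each side of the cut predicted its majority outcome. This simple error count and the helper names are assumptions for illustration. It is not the criterion xtclassify uses.

```python
# Sunny-day cases only: (temperature, humidity, golf).
SUNNY = [
    (85, 85, "no_play"),
    (80, 90, "no_play"),
    (72, 95, "no_play"),
    (69, 70, "play"),
    (75, 70, "play"),
]


def split_errors(cases, column, threshold):
    """Cases misjudged when each side of `value <= threshold`
    predicts the majority outcome of that side."""
    errors = 0
    for side in (True, False):
        labels = [c[2] for c in cases if (c[column] <= threshold) == side]
        if labels:
            majority = max(set(labels), key=labels.count)
            errors += sum(1 for label in labels if label != majority)
    return errors


def candidate_thresholds(cases, column):
    """Every distinct observed value except the largest is a possible cut."""
    return sorted({c[column] for c in cases})[:-1]


for name, column in (("temperature", 0), ("humidity", 1)):
    for t in candidate_thresholds(SUNNY, column):
        print(name, "<=", t, "errors:", split_errors(SUNNY, column, t))
```

On these five cases no temperature cut separates the outcomes without error, while a humidity cut anywhere from 70 up to just below 85 does. Even this tiny example requires trying several cut points per item, which hints at how quickly the number of possibilities grows.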
The golf example is very small data, only 14 cases with four items, so
a person could perhaps check every possibility by hand, but it would
be very laborious work. Try imagining large data, such as 100,000
cases with 100 items. That is no longer something a person can
handle. You can see that forming a classification model means
dealing with an enormous search space. Even for a computer, this is
likewise a very difficult problem. For this reason, the techniques
proposed for efficiently forming accurate classification models are
mainly heuristic.
One such technique expresses the classification model in a form called
a "decision tree". An example of a decision tree for the golf
example above is shown in the figure (how such a decision tree is
built is described later). It is called a "decision tree" because
the figure, turned upside down, has the shape of a tree. Think of a
decision tree as something like the "yes"/"no" personality quizzes
that often appear in women's magazines. In the figure, questions
are shown in white boxes; you answer each question with "yes" or "no"
and follow the branch indicated by the arrow. When you arrive at a
colored box, that box is the answer. The number in parentheses
shows the number of cases that match the condition. The xtclassify
command forms such classification models automatically.
Once such a model has been formed, the answer to the first problem
(when weather = "sunny", temperature = "60", humidity = "90", and wind
= "none", will this person play golf?) is also obtained easily.
Start with the question at the top node of the decision tree, "weather
= cloudy?". The answer is "no" (weather = sunny), so the left branch
is followed. The next question is "weather = rain?", and this is
also "no", so the left branch is followed again. The following
question is "humidity <= 75?". Since the humidity is 90, the answer
is "no", and following that branch we finally arrive at "does not play
golf". We can therefore estimate that this person will not play
golf tomorrow.
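Tracing a decision tree from the top node can be sketched as a chain of questions. The tree below is an assumption reconstructed from the three questions named in the text and from the rules visible in the 14 cases. It is not output produced by xtclassify, and the function name and value spellings are hypothetical.

```python
def classify(weather, temperature, humidity, wind):
    """Trace an assumed golf decision tree from the top node to an answer.

    Assumed tree (not taken from xtclassify output):
      weather = cloudy?   yes -> play
      weather = rain?     yes -> wind? (wind -> no_play, no wind -> play)
      humidity <= 75?     yes -> play, no -> no_play
    Temperature is accepted as an input but is not used by this tree.
    """
    path = [("weather = cloudy?", weather == "cloudy")]
    if weather == "cloudy":
        return "play", path
    path.append(("weather = rain?", weather == "rain"))
    if weather == "rain":
        path.append(("wind?", wind == "yes"))
        return ("no_play" if wind == "yes" else "play"), path
    path.append(("humidity <= 75?", humidity <= 75))
    return ("play" if humidity <= 75 else "no_play"), path


# Tomorrow's forecast from the problem statement.
answer, path = classify("sunny", 60, 90, "no")
for question, result in path:
    print(question, "yes" if result else "no")
print("estimate:", answer)
```

Each printed line corresponds to one node visited on the way down, so the output can be compared step by step with a trace made by hand on the figure.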
Here we explain the basic data used to form a decision tree. For
xtclassify, you need to prepare data of the kind shown in the figure.
The data must include a result attribute item and explanatory
attribute items.
The result attribute is, in the golf example, the attribute that
indicates whether or not golf was played, and it must always be
provided as one item. xtclassify can handle up to 10 kinds of
values for the result attribute (in the golf example there were two,
"plays golf" and "does not play golf").
Explanatory attributes are, in the golf example, attributes such as
weather and humidity. The xtclassify command can handle up to 256
explanatory attribute items. As types of explanatory attribute
values, xtclassify can handle three types: numerical type, category
type, and pattern type (for the treatment of the pattern type, please
refer to the chapter on BONSAI). In the golf example, temperature
and humidity are numerical type, and weather and wind are category
type.
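Before handing data to a tool with such limits, it can help to check the data against them. The sketch below is a hypothetical pre-check written for illustration. It uses only the limits stated in the text (10 result values, 256 explanatory items) and guesses numerical versus category type from the values themselves, which is an assumption. It does not describe how xtclassify itself determines types, and it ignores the pattern type.

```python
MAX_RESULT_VALUES = 10        # limit on result attribute values stated in the text
MAX_EXPLANATORY_ITEMS = 256   # limit on explanatory attribute items stated in the text


def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def describe(rows, names, result_name):
    """Check the stated limits and guess a type for each explanatory item."""
    result_index = names.index(result_name)
    result_values = {row[result_index] for row in rows}
    if len(result_values) > MAX_RESULT_VALUES:
        raise ValueError("too many result attribute values")
    explanatory = [n for n in names if n != result_name]
    if len(explanatory) > MAX_EXPLANATORY_ITEMS:
        raise ValueError("too many explanatory attribute items")
    types = {}
    for name in explanatory:
        column = [row[names.index(name)] for row in rows]
        all_numeric = all(is_number(v) for v in column)
        types[name] = "numerical" if all_numeric else "category"
    return types, sorted(result_values)


names = ["weather", "temperature", "humidity", "wind", "golf"]
rows = [
    ["sunny", "85", "85", "no", "no_play"],
    ["cloudy", "83", "78", "no", "play"],
    ["rain", "65", "70", "yes", "no_play"],
]
types, classes = describe(rows, names, "golf")
print(types)
print(classes)
```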
Furthermore, xtclassify allows unknown values (NULL values) to be
included as values of explanatory attributes (for details, please
refer to the chapter on the treatment of NULL values).
<?xml version="1.0" encoding="euc-jp"?>
<xmltbl version="1.1">
<header>
<field no="1" name="weather"></field>
<field no="2" name="temperature"></field>
<field no="3" name="humidity"></field>
<field no="4" name="wind"></field>
<field no="5" name="golf"></field>
</header>
<body><![CDATA[
sunny 85 85 no no_play
sunny 80 90 yes no_play
cloudy 83 78 no play
rain 70 96 no play
rain 68 80 no play
rain 65 70 yes no_play
cloudy 64 65 yes play
sunny 72 95 no no_play
sunny 69 70 no play
rain 75 80 no play
sunny 75 70 yes play
cloudy 72 90 yes play
cloudy 81 75 no play
rain 71 80 yes no_play
]]></body>
</xmltbl>
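For readers who want to inspect such a table outside of MUSASHI, the layout above can be read with a general XML parser. The following is a minimal sketch using Python's standard library. It assumes that the body holds one record per line with fields separated by whitespace, in the order given by the "no" attributes of the header. The function name read_xmltbl is hypothetical, the sample is shortened to three records, and the XML declaration is omitted so that parsing does not depend on euc-jp support.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the same layout; value spellings are illustrative.
XMLTBL = """<xmltbl version="1.1">
<header>
<field no="1" name="weather"></field>
<field no="2" name="temperature"></field>
<field no="3" name="humidity"></field>
<field no="4" name="wind"></field>
<field no="5" name="golf"></field>
</header>
<body><![CDATA[
sunny 85 85 no no_play
sunny 80 90 yes no_play
cloudy 83 78 no play
]]></body>
</xmltbl>"""


def read_xmltbl(text):
    """Return (field names, list of records as dicts) from an xmltbl string."""
    root = ET.fromstring(text)
    fields = sorted(
        root.find("header").findall("field"),
        key=lambda f: int(f.get("no")),
    )
    names = [f.get("name") for f in fields]
    body = root.find("body").text or ""
    rows = [
        dict(zip(names, line.split()))
        for line in body.splitlines()
        if line.strip()
    ]
    return names, rows


names, rows = read_xmltbl(XMLTBL)
print(names)
for row in rows:
    print(row)
```

All values are returned as strings, so numerical items such as temperature and humidity would still need converting before any arithmetic comparison.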