Basic Commands
Lesson 3: Aggregation (xtcount) Part II

xtcount is another aggregation command. It counts the number of records (lines) that share the same key value. The user defines the basis for counting with the key argument. Common uses include counting the number of hits per category, product, date, etc.
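
To make the idea of counting by key concrete, the sketch below imitates it with standard Unix tools rather than MUSASHI itself. The sample records and the function name count_by_key are made up for illustration, and xtcount works on XML tables, not on plain text like this.

```shell
#!/bin/bash
# Illustration only: plain-text stand-in for counting records per key.
# The sample data and the function name count_by_key are hypothetical.

count_by_key() {
  # Read whitespace-separated records from stdin and count how many
  # lines share the same value in the first column (the "key").
  awk '{ n[$1]++ } END { for (k in n) print k, n[k] }' | sort
}

sample="20020101 milk
20020101 bread
20020102 milk
20020101 eggs"

result=$(printf '%s\n' "$sample" | count_by_key)
printf '%s\n' "$result"
```

Each output line pairs one key value with the number of records that carry it, which is the same shape as the xtcount result shown later in this lesson.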

Summary of options and usage: xtcount


Using xtcount

Editing your script

In the FD console, copy xtcut.sh and name the copy xtcount.sh by pressing "c+CTRL".

copy the file xtcut.sh as a new name on the current directory
new name : xtcount.sh
cp /home/public/lesson/basic/xtcut.sh /home/bear/lesson/basic/xtcount.sh

Goal: Count the number of transactions for each unique date.

Methodology: Select the attribute "Date" with the xtcut command, then pipe the output with "|" to the xtcount command, which returns the number of transaction records for each unique date. Pipe that output to xtheader to set the title and comment of the resulting data set.

Specify the parameters as follows:

Key - -k Date
Note: Date is the key attribute on which counting is based.

New Count Result - -a num_transactions
Note: The -a argument names the new column in which the counts are stored. Here, the new attribute "num_transactions" holds the number of times products were scanned at point-of-sale registers on each day.

The script will look as follows:

#!/bin/bash
#===============================================================
# MUSASHI bash script
#===============================================================

#---- Title
title="Tutorial"

#---- Comment
comment="xtcount"

#---- variables
inPath="/home/public/tutorial"

#---------------------------------------------------------------
# commands
#---------------------------------------------------------------
xtcut -f Date -i $inPath/dat.xt |
xtcount -k Date -a num_transactions |
xtheader -l "$title" -c "$comment" -o xtcount.xt
#===============================================================

After editing, save and execute the script. Your result should look as follows:
<?xml version="1.0" encoding="euc-jp"?>
<xmltbl version="1.00">
<header>
<title>
Tutorial </title>
<comment>
xtcount
</comment>
<field no="1">
<name>Date</name>
<sort priority="1">
</sort>
</field>
<field no="2">
<name>Count</name>
</field>
</header>
<body><![CDATA[
20020101 105
20020102 32
20020103 116
20020104 78
20020105 43
20020106 76
20020107 67
20020108 69
20020109 113
20020110 82
20020111 116
20020112 132
20020113 85
20020114 56
20020115 97
20020116 110
20020117 102
20020118 100
20020119 105


Redefining Attributes

The count does not have to be based on a key attribute. xtcount can also count the records in the whole dataset if the "-k" argument is removed.
#!/bin/bash
#===============================================================
# MUSASHI bash script
#===============================================================

#---- Title
title="Tutorial"

#---- Comment
comment="xtcount"

#---- variables
inPath="/home/public/tutorial"

#---------------------------------------------------------------
# Commands
#---------------------------------------------------------------
xtcut -f Date -i $inPath/dat.xt |
xtcount -a num_transactions |
xtheader -l "$title" -c "$comment" -o xtcount.xt
#===============================================================

The following result shows that a total of 38733 scanned transactions occurred in 2002.

<?xml version="1.0" encoding="euc-jp"?>
<xmltbl version="1.00">
<header>
<title>
Tutorial </title>
<comment>
xtcount
</comment>
<field no="1">
<name>Date</name>
</field>
<field no="2">
<name>Count</name>
</field>
</header>
<body><![CDATA[
20021231 38733
]]></body>
</xmltbl>

One Point: How is aggregation performed on newly created attributes?
In the above tutorial, a new attribute "num_transactions" is created to hold the count for the "Date" attribute. When aggregation is performed with "xtagg", no new column is created; instead, the attributes named in the field arguments show the totals in the output. For both aggregation commands, the attribute value that accompanies the result is the last one in sorted order. This is why the Date value on the line that counts all transactions is "20021231".
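
The "last value in sorted order" behaviour can be imitated with plain-text tools. This is only a sketch of the idea described above, not MUSASHI's implementation; the sample dates and the function name count_all are invented for the example.

```shell
#!/bin/bash
# Illustration only: treat every record as one group and report the
# last Date in sorted order next to the total, as described above.
# count_all and the sample dates are hypothetical.

count_all() {
  # Sort the records, remember the most recently read line,
  # then print that value together with the number of records.
  sort | awk '{ last = $1 } END { if (NR > 0) print last, NR }'
}

dates="20021231
20020101
20020615"

total=$(printf '%s\n' "$dates" | count_all)
printf '%s\n' "$total"
```

With these three sample dates the single output line carries the latest date, in the same way that "20021231" appears beside the total in the result above.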

One Point: The use of key attributes in exceptional cases.
Key attributes are specified with the -k argument of a command. In our second example -k is omitted, so xtcount runs as if all lines had the same key value. Similarly, the xtcal command assumes that all lines have different key values when -k is omitted.
MUSASHI lets you state explicitly whether all lines share the same key value or each line has a different one, with the parameter "-k#same#" or "-k#diff#". For most commands this setting is rarely meaningful. For example, "xtcount -k #diff# -a count" returns "1" for every line. Because "#same#" and "#diff#" are reserved words, you cannot use them as attribute names.
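
As a rough illustration of what "same key on every line" and "different key on every line" mean for counting, the sketch below uses awk on plain text. The function names count_same and count_diff are hypothetical and only mirror the behaviour described above; they are not MUSASHI commands.

```shell
#!/bin/bash
# Illustration only: the two extreme cases of key handling.
# count_same and count_diff are hypothetical names.

count_same() {
  # Every line shares one key: a single count for the whole input.
  awk 'END { print NR }'
}

count_diff() {
  # Every line has its own key: each record is a group of size 1.
  awk '{ print $0, 1 }'
}

records="a
b
c"

same_result=$(printf '%s\n' "$records" | count_same)
diff_result=$(printf '%s\n' "$records" | count_diff)
printf 'same: %s\n' "$same_result"
printf 'diff:\n%s\n' "$diff_result"
```

The second case shows why a "different key" setting is of little use for counting: every record simply gets a count of 1.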


Exercises

Let's practice xtcount on the reports below. Check your results with the scripts and output files given below.

Report name                                                                   Script name  Output file (xt)  Output file (html)
Number of transactions per manufacturer                                       xtcount1.sh  xtcount1.xt       xtcount1.html
Number of transactions per brand                                              xtcount2.sh  xtcount2.xt       xtcount2.html
Number of transactions per 1-digit classification code                        xtcount3.sh  xtcount3.xt       xtcount3.html
Number of transactions per 2-digit classification code                        xtcount4.sh  xtcount4.xt       xtcount4.html
Number of transactions per 4-digit classification code                        xtcount5.sh  xtcount5.xt       xtcount5.html
Number of transactions per 2-digit classification code for each manufacturer  xtcount6.sh  xtcount6.xt       xtcount6.html