xtstatistics
Section: User Commands (1)
Updated: 2002-10-26
NAME
xtstatistics - statistical calculation
SYNOPSIS
xtstatistics -f attribute(s) -c statistics{var|std}
[-k key attribute] [-q] [-i INPUT] [-o OUTPUT] [-z] [-t] [-T TEMP DIRECTORY]
DESCRIPTION
This command computes a statistic for each attribute specified by -f over the records that share the same key value, and sorts the output in ascending order of the key. If -k is not specified, all records are treated as having the same key.
There are two statistical functions, variance (var) and standard deviation (std), selected by the parameter -c.
By default, the result replaces the value of the attribute given by -f. To store the result in a new attribute instead, specify '-f attribute name:new attribute name'.
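The replace-versus-new-attribute behavior described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MUSASHI implementation: the function name summarize, the dict-based records, and the sample field names are hypothetical; the (n - 1) divisor is inferred from the example output in USAGE; keeping the last record of a group is likewise inferred from that example.

```python
import statistics

# Assumed mapping from the -c values to functions. The (n - 1) divisor is
# inferred from the example output in USAGE, not stated by the manual.
STAT_FUNCS = {"var": statistics.variance, "std": statistics.stdev}

def summarize(group, field_spec, stat):
    """Collapse one key group (a list of dict records) into a single record."""
    # "name:new_name" stores the result under new_name; plain "name" replaces it.
    source, _, target = field_spec.partition(":")
    values = [float(rec[source]) for rec in group]
    # A single record has no sample variance; the manual shows "*" in that case.
    result = STAT_FUNCS[stat](values) if len(values) >= 2 else "*"
    out = dict(group[-1])  # assumption: the last record of the group is kept
    out[target or source] = result
    return out

# Hypothetical records sharing one key value.
group = [{"id": "a", "qty": "3"}, {"id": "a", "qty": "5"}]
print(summarize(group, "qty", "var"))         # qty is replaced by the variance
print(summarize(group, "qty:qtyVar", "var"))  # qtyVar is added, qty is kept
```

With '-f qty' the original value is overwritten, while 'qty:qtyVar' leaves qty untouched and adds the result under the new name.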
PARAMETERS
- -k key attribute
-
key attribute that defines the unit of statistical computation (if omitted, all records are assumed to have the same key value).
- -f a list of attributes
-
a list of attributes for which the statistical value is computed.
- -c statistical function
-
the two functions available are variance (var) and standard deviation (std).
- -q sequential processing
-
when this option is used with the -k parameter, the command processes the input
data in the original order of the records, instead of sorting by the key attribute -k.
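One possible reading of -q is sketched below in Python: without -q the records are sorted by key before grouping, while with -q only consecutive records with the same key form a group. The manual does not spell out this grouping rule, so treat it as an assumption; the function key_groups and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def key_groups(records, key_index, sequential=False):
    """Return (key, records) groups, sorted by key unless sequential is set."""
    # Assumption: with -q, records are taken in input order, so only
    # consecutive records that share a key end up in the same group.
    rows = records if sequential else sorted(records, key=itemgetter(key_index))
    return [(k, list(g)) for k, g in groupby(rows, key=itemgetter(key_index))]

rows = [("b", 1), ("a", 2), ("b", 3)]        # hypothetical unsorted input
print(key_groups(rows, 0))                   # sorted: one group per key
print(key_groups(rows, 0, sequential=True))  # input order: "b" appears twice
```

Under this reading, -q is only safe when the input is already ordered by the key attribute.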
OPTIONS
- -i input filename
-
if the filename ends in '.gz', the command acts as a filter, extracting the compressed file for processing. When "-i" is not specified, the command reads from standard input.
- -o output filename
-
if the filename ends in '.gz', the command automatically writes the output data as a zip archive. When "-o" is not specified, the result is sent to standard output.
- -T temp directory
-
specify the directory name for temporary files used by this command.
- -z zip archive
-
compresses the standard output as a zip archive. This applies when "-o" is not given and "-z" is specified.
- -t plain text
-
treats the input and output data as plain text for
USAGE
Input file - dat.xt:
<field no="1">
<name>CustomerID</name>
</field>
<field no="2">
<name>Date</name>
</field>
<field no="3">
<name>TotalQuantity</name>
</field>
<field no="4">
<name>TotalAmount</name>
</field>
<body><![CDATA[
A00001 20020211 1 400
A00004 20020214 1 365
A00004 20020415 5 4349
A00004 20020625 3 5268
A00004 20020810 2 1805
A00004 20021014 2 612
A00005 20020918 12 4554
A00005 20020923 1 491
A00006 20020606 3 1364
A00006 20020918 5 2195
]]></body>
Example 1. Compute the variance of TotalQuantity and TotalAmount for each customer.
e.g. xtstatistics -k CustomerID -f TotalQuantity,TotalAmount -c var -i dat.xt -o rsl.xt
Output file - rsl.xt:
<body><![CDATA[
A00001 20020211 * *
A00004 20021014 2.3 4921094.7
A00005 20020923 60.5 8253984.5
A00006 20020918 2 345280.5
]]></body>
Note: At least two records are required to calculate the variance. If a key has only one record, the output is shown as *.
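The figures in Example 1 can be cross-checked with a short Python sketch. It assumes the sample variance (divisor n - 1), which is what the example values are consistent with, and it keeps the Date of the last record of each customer, as the output above does. The function name variance_by_customer is hypothetical and the parsing is simplified to whitespace-separated body lines.

```python
import statistics
from itertools import groupby

# Body lines of the input file shown in USAGE.
DATA = """\
A00001 20020211 1 400
A00004 20020214 1 365
A00004 20020415 5 4349
A00004 20020625 3 5268
A00004 20020810 2 1805
A00004 20021014 2 612
A00005 20020918 12 4554
A00005 20020923 1 491
A00006 20020606 3 1364
A00006 20020918 5 2195
"""

def variance_by_customer(text):
    """Return one [CustomerID, Date, var(TotalQuantity), var(TotalAmount)] row per customer."""
    rows = [line.split() for line in text.splitlines()]
    rows.sort(key=lambda r: r[0])  # stable sort keeps input order within a key
    out = []
    for key, grp in groupby(rows, key=lambda r: r[0]):
        grp = list(grp)
        stats = []
        for col in (2, 3):  # TotalQuantity, TotalAmount
            values = [int(r[col]) for r in grp]
            # Sample variance needs at least two records; otherwise emit "*".
            stats.append(statistics.variance(values) if len(values) > 1 else "*")
        out.append([key, grp[-1][1], *stats])
    return out

for rec in variance_by_customer(DATA):
    print(*rec)
```

For A00004, for instance, the TotalQuantity values 1, 5, 3, 2, 2 have mean 2.6 and a sum of squared deviations of 9.2, and 9.2 / 4 = 2.3, which agrees with the output file.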
DIAGNOSTICS
Note that if an asterisk (null value) appears in the selected attribute of any record, the command returns an asterisk as the result. Use xtdelnul to remove records with null values beforehand.
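The null handling described here can be pictured with the following Python sketch. It is an assumption-laden illustration, not the actual implementation: the helper names variance_or_null and drop_nulls are hypothetical, drop_nulls only mimics the effect the manual attributes to xtdelnul, and the sample values are made up.

```python
import statistics

NULL = "*"  # null marker used in the manual's data

def variance_or_null(values):
    """Return the sample variance, or "*" if any value is null or fewer than two remain."""
    if NULL in values or len(values) < 2:
        return NULL
    return statistics.variance([float(v) for v in values])

def drop_nulls(values):
    """Hypothetical stand-in for removing null values before the computation."""
    return [v for v in values if v != NULL]

values = ["3", "*", "5"]                     # hypothetical attribute values for one key
print(variance_or_null(values))              # the null propagates to the result
print(variance_or_null(drop_nulls(values)))  # variance of the remaining values
```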
SEE ALSO
xtagg(1)
For complete documentation and tutorial of xtstatistics and other commands, please visit
http://musashien.sourceforge.net
BUG REPORT
If you find a bug in xtstatistics, please send an electronic mail to
musashi@adm.osaka-sandai.ac.jp.
Before sending a bug report, please verify that you have the latest version of
MUSASHI.
Read this manual carefully to make sure the error is not caused by a quirk in the language.
AUTHORS
Yukinobu Hamuro, Naoki Katoh, Katsutoshi Yada, Stephane Cheung
This document was created by
man2html,
using the manual pages.
Time: 22:43:56 GMT, June 24, 2003