xtstatistics

Section: User Commands (1)
Updated: 2002-10-26

 

NAME

xtstatistics - statistical calculation

 

SYNOPSIS

xtstatistics -f attribute(s) -c statistics{var|std} [-k key attribute] [-q] [-i INPUT] [-o OUTPUT] [-z] [-t] [-T TEMP DIRECTORY]

 

DESCRIPTION

This command computes a statistic for each attribute specified by -f over the records that share the same key value, and sorts the output by key in ascending order. If -k is not specified, all records are treated as having the same key. Two statistical functions are available, variance (var) and standard deviation (std), selected with the -c parameter. By default the result replaces the value of the attribute given to -f. To store the result in a new attribute instead, write '-f attribute name:new attribute name'.
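
The manual does not state which estimator is used. The figures in the USAGE example are consistent with the unbiased sample variance (divisor n-1), so the following is a sketch under that assumption; that -c std is the square root of the same quantity is likewise assumed, not confirmed by this page.

```latex
% Sample variance and standard deviation over the n records of one key group
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s = \sqrt{s^2}
% Check against key A00006 in the USAGE example, TotalQuantity values 3 and 5:
% \bar{x} = 4, \quad s^2 = \frac{(3-4)^2 + (5-4)^2}{2-1} = 2
```

With n = 1 the divisor is zero, which agrees with the note that a key with a single record yields *.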

 

PARAMETERS

-k key attribute
key attribute that defines the unit (group of records) over which each statistic is computed (if omitted, all lines are assumed to have the same key value).
-f a list of attributes
a list of attributes for which the statistic is computed.
-c statistical function
the two functions available are variance (var) and standard deviation (std).
-q sequential processing
when this option is used with the -k parameter, the command processes the input records in their original order instead of sorting them by the key attribute given to -k.
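
The difference between the default sorted grouping and -q can be sketched in Python. This is an illustration only: the function name group_sizes is hypothetical, and the assumption that under -q only adjacent records with the same key form a group is an inference from the description above, not something this page states.

```python
from itertools import groupby


def group_sizes(records, sequential=False):
    """Return (key, record count) for each group of records.

    records    -- list of tuples whose first element is the key
    sequential -- False: sort by key first (default behaviour described above)
                  True:  keep the original order (assumed -q behaviour)
    """
    if not sequential:
        # sorted() is stable, so records keep their order within a key
        records = sorted(records, key=lambda r: r[0])
    # groupby() only merges adjacent records that share a key
    return [(key, len(list(grp))) for key, grp in groupby(records, key=lambda r: r[0])]


demo = [("B", 1), ("A", 2), ("B", 3)]
sorted_groups = group_sizes(demo)
sequential_groups = group_sizes(demo, sequential=True)
```

On unsorted input the two modes can therefore produce a different number of groups; on input already ordered by key they agree.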

 

OPTIONS

-i input filename
if the filename ends in '.gz', the command decompresses the file before processing it. When "-i" is not specified, the command reads from standard input.
-o output filename
if the filename ends in '.gz', the command automatically writes the output data in compressed form. When "-o" is not specified, the result is sent to standard output.
-T temp directory
specify the directory for temporary files used by this command.
-z zip archive
compress the standard output. When "-o" is not given and "-z" is specified, the result is written to standard output in compressed form.
-t plain text
treats the input and output data as plain text for

 

USAGE

Input file - dat.xt:
<field no="1">
<name>CustomerID</name>
</field>
<field no="2">
<name>Date</name>
</field>
<field no="3">
<name>TotalQuantity</name>
</field>
<field no="4">
<name>TotalAmount</name>
</field>
<body><![CDATA[
A00001 20020211 1 400
A00004 20020214 1 365
A00004 20020415 5 4349
A00004 20020625 3 5268
A00004 20020810 2 1805
A00004 20021014 2 612
A00005 20020918 12 4554
A00005 20020923 1 491
A00006 20020606 3 1364
A00006 20020918 5 2195
]]></body>

Example 1. Compute the variance of TotalQuantity and TotalAmount for each customer.
xtstatistics -k CustomerID -f TotalQuantity,TotalAmount -c var -i dat.xt -o rsl.xt
Output file - rsl.xt:

<body><![CDATA[
A00001 20020211 * *
A00004 20021014 2.3 4921094.7
A00005 20020923 60.5 8253984.5
A00006 20020918 2 345280.5
]]></body>

Note: At least two records are required to calculate the variance. If a key has only one record, the output is shown as *.
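
The figures in Example 1 can be reproduced with a short Python sketch. It is not the MUSASHI implementation: the function names are hypothetical, the sample variance (divisor n-1) is assumed because it matches the output, and carrying the Date of the last record in each group is inferred from the output rather than documented.

```python
from itertools import groupby
from statistics import variance

# Records copied from the body of dat.xt:
# CustomerID, Date, TotalQuantity, TotalAmount
ROWS = [
    "A00001 20020211 1 400",
    "A00004 20020214 1 365",
    "A00004 20020415 5 4349",
    "A00004 20020625 3 5268",
    "A00004 20020810 2 1805",
    "A00004 20021014 2 612",
    "A00005 20020918 12 4554",
    "A00005 20020923 1 491",
    "A00006 20020606 3 1364",
    "A00006 20020918 5 2195",
]


def fmt(value):
    """Format with one decimal and drop a trailing '.0' (display assumption)."""
    return f"{value:.1f}".rstrip("0").rstrip(".")


def variance_by_key(lines, key_col=0, stat_cols=(2, 3)):
    """Sample variance of stat_cols for each value of key_col."""
    records = sorted((line.split() for line in lines), key=lambda r: r[key_col])
    result = []
    for _, grp in groupby(records, key=lambda r: r[key_col]):
        grp = list(grp)
        row = list(grp[-1])  # assumption: other fields come from the last record
        for col in stat_cols:
            values = [float(r[col]) for r in grp]
            # a single record has no sample variance, shown as '*'
            row[col] = fmt(variance(values)) if len(values) >= 2 else "*"
        result.append(" ".join(row))
    return result


output = variance_by_key(ROWS)
```

Joining the elements of output with newlines gives the same four records as the body of rsl.xt above.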

 

DIAGNOSTICS

If an asterisk (the null value) appears in the selected attribute of any of the records, the result is also an asterisk. Use xtdelnul to remove records with null values before running this command.
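
The null rule can be sketched in Python as follows. The helper names are hypothetical; the sample variance is assumed as the statistic, and drop_nulls only imitates the effect that this page attributes to xtdelnul, not its actual implementation.

```python
from statistics import variance

NULL = "*"  # null marker used in the data


def variance_or_null(values):
    """Sample variance of a list of field strings, or '*' if it is undefined.

    Returns '*' when any value is null or fewer than two values are given.
    """
    if NULL in values or len(values) < 2:
        return NULL
    return variance([float(v) for v in values])


def drop_nulls(values):
    """Remove null values first, imitating the described effect of xtdelnul."""
    return [v for v in values if v != NULL]


with_null = variance_or_null(["3", "*", "5"])
after_cleaning = variance_or_null(drop_nulls(["3", "*", "5"]))
```

A single null value is enough to make the result null, which is why the nulls are removed before the computation in the second call.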

 

SEE ALSO

xtagg(1)

For complete documentation and a tutorial on xtstatistics and other commands, please visit http://musashien.sourceforge.net

 

BUG REPORT

If you find a bug in xtstatistics, please send an e-mail to musashi@adm.osaka-sandai.ac.jp. Before sending a bug report, please verify that you have the latest version of MUSASHI, and read this manual carefully to make sure the error is not caused by a quirk in the language.

 

AUTHORS

Yukinobu Hamuro, Naoki Katoh, Katsutoshi Yada, Stephane Cheung


 

This document was created by man2html, using the manual pages.
Time: 22:43:56 GMT, June 24, 2003