xtcommon

Section: User Commands (1)
Updated: 2002-10-26

 

NAME

xtcommon - select common records in reference file

 

SYNOPSIS

xtcommon -k key attribute(s) from input -m reference file name [-K key attribute(s) from reference file] [-u EXCEPTION OUTPUT] [-r] [-H hashing is used for selection] [-i INPUT] [-o OUTPUT] [-z] [-t] [-T TEMP DIRECTORY]

 

DESCRIPTION

Select the records in the input file whose key attributes, specified by -k, match the key attributes of records in the reference file given by -m. Use -K to name the key attributes in the reference file when their names differ from those given in -k. When the option -r is used, the selection is inverted: records whose key attributes do not match any record in the reference file are selected instead. The option -u can be used to write the records that are not selected to a separate file.
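
The selection rule described above can be sketched in Python as follows. This is only an illustration of the documented behaviour, not the MUSASHI implementation; the function name select_common and the dict-per-record representation are assumptions made for the example.

```python
def select_common(records, reference, key, ref_key=None, reverse=False):
    """Split records into (selected, excluded) by key membership in reference.

    key      -- attribute name(s) in the input records (plays the role of -k)
    ref_key  -- attribute name(s) in the reference (plays the role of -K);
                defaults to key when the names are the same
    reverse  -- invert the selection (plays the role of -r)
    """
    keys = [key] if isinstance(key, str) else list(key)
    if ref_key is None:
        ref_keys = keys
    else:
        ref_keys = [ref_key] if isinstance(ref_key, str) else list(ref_key)
    if len(keys) != len(ref_keys):
        raise ValueError("key and ref_key must name the same number of attributes")

    # Key values that occur in the reference data.
    wanted = {tuple(r[k] for k in ref_keys) for r in reference}

    selected, excluded = [], []
    for rec in records:
        matched = tuple(rec[k] for k in keys) in wanted
        # With reverse=True the non-matching records are the selected ones.
        (selected if matched != reverse else excluded).append(rec)
    return selected, excluded


# Hypothetical records; the reference names its key attribute differently.
data = [
    {"CustomerID": "A00004", "Date": "20020214"},
    {"CustomerID": "A00005", "Date": "20020918"},
    {"CustomerID": "A00006", "Date": "20020606"},
]
ref = [{"Customer": "A00004"}, {"Customer": "A00005"}]

common, rest = select_common(data, ref, "CustomerID", ref_key="Customer")
```

In this sketch, common corresponds to what would go to the normal output and rest to what -u would capture.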

 

PARAMETERS

-k key attribute(s)
the attribute(s) in the input file used as the key for matching against the reference file.
-K key attributes in reference file
defines the names of the key attributes in the reference file when they differ from the key attributes given in -k.
-m reference file name
the name of the reference file whose records are compared with the input.
-u output filename for excluded data
saves the records excluded from the selection in the file assigned by option -u.
-H selection with hashing
improves data processing performance if the input file is large and the reference file is small.
-r
reverses the selection: the records whose key attributes do not match the reference file are selected.
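
To illustrate why hashing suits a large input with a small reference, the sketch below contrasts a hash-based selection with a merge over key-sorted data. This manual does not state how xtcommon works without -H, so the merge variant is only a common alternative shown for comparison; both function names and the whitespace-separated line format are assumptions.

```python
def hash_select(lines, ref_keys, key_of):
    """One pass over unsorted input; memory grows with the reference only."""
    wanted = set(ref_keys)
    return (line for line in lines if key_of(line) in wanted)


def merge_select(sorted_lines, sorted_ref_keys, key_of):
    """Merge-style selection; both inputs must already be sorted by key."""
    ref = iter(sorted_ref_keys)
    current = next(ref, None)
    for line in sorted_lines:
        k = key_of(line)
        # Advance the reference until it catches up with the input key.
        while current is not None and current < k:
            current = next(ref, None)
        if current is not None and current == k:
            yield line


def first_field(line):
    # The key is assumed to be the first whitespace-separated field.
    return line.split()[0]


lines = [
    "A00004 20020214 1 365",
    "A00004 20020415 9 4349",
    "A00005 20020918 12 4554",
    "A00006 20020606 3 1364",
]
ref_keys = ["A00004", "A00005"]

by_hash = list(hash_select(lines, ref_keys, first_field))
by_merge = list(merge_select(lines, ref_keys, first_field))
```

The hash variant needs no sorting of the input, which is the usual reason such an option pays off when the reference is small enough to hold in memory.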

 

FILE OPTIONS

-i input filename
if the filename ends with '.gz', the command decompresses the file before processing it. When "-i" is not specified, the command reads its input from standard input.
-o output filename
if the filename ends with '.gz', the command automatically writes the output data in compressed form. When "-o" is not specified, the result is sent to standard output.
-T working directory
specifies the directory for temporary files used by this command.
-z zip archive
compresses the standard output. When "-z" is specified and "-o" is not given, the data sent to standard output is compressed.
-t plain text
xtcommon treats the input and output data as plain text format.
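
The suffix-based handling of -i, -o and -z can be sketched in Python as follows. This is an illustration under assumptions, not the command's actual code: the helper names open_input and open_output are hypothetical, and the sketch assumes that '.gz' denotes gzip compression.

```python
import gzip
import sys


def open_input(path=None):
    """Return a text stream for reading (roughly the role of -i)."""
    if path is None:
        return sys.stdin                  # no -i: read standard input
    if path.endswith(".gz"):
        return gzip.open(path, "rt")      # '.gz' suffix: decompress on the fly
    return open(path, "r")


def open_output(path=None, compress=False):
    """Return a text stream for writing (roughly the role of -o and -z)."""
    if path is None:
        if compress:                      # -z without -o: compressed stdout
            return gzip.open(sys.stdout.buffer, "wt")
        return sys.stdout                 # no -o: write standard output
    if path.endswith(".gz"):
        return gzip.open(path, "wt")      # '.gz' suffix: compress the output
    return open(path, "w")
```

Deciding by suffix keeps the command line short: the same option accepts plain and compressed files without an extra flag.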

 

USAGE

Input file - dat.xt:
<field no="1">
<name>Customer</name>
</field>
<field no="2">
<name>Date</name>
</field>
<field no="3">
<name>TotalAmount</name>
</field>
<field no="4">
<name>TotalQuantity</name>
</field>
</header>
<body><![CDATA[
A00004 20020214 1 365
A00004 20020415 9 4349
A00004 20020625 13 5268
A00004 20020810 5 1805
A00004 20021014 2 612
A00004 20021016 11 3410
A00005 20020918 12 4554
A00005 20020923 1 491
A00056 20021128 1 94
A00056 20021128 1 112
A00056 20021128 1 115
A00056 20021128 1 93
A00056 20021128 1 149
A00120 20020727 1 85
A00120 20020727 1 68
A00120 20020727 1 112
A00120 20020727 1 69
A00131 20020108 2 280
]]></body>

Input file -dat.xt
<field no="1">
<name>CustomerID</name>
</field>
<field no="2">
<name>Date</name>
</field>
<field no="3">
<name>TotalQuantity</name>
</field>
<field no="4">
<name>TotalAmount</name>
</field>
</header>
<body><![CDATA[
A00004 20020214 1 365
A00004 20020415 9 4349
A00004 20020625 13 5268
A00004 20020810 5 1805
A00004 20021014 2 612
A00004 20021016 11 3410
A00005 20020918 12 4554
A00005 20020923 1 491
A00006 20020606 3 1364
A00006 20020918 5 2195
]]></body>

Example 1. Select the records in dat.xt whose key attribute also appears in ref.xt.
xtcommon -k CustomerID -m ref.xt -i dat.xt -o rsl.xt

Output file - rsl.xt:

<body><![CDATA[
A00004 20020214 1 365
A00004 20020415 9 4349
A00004 20020625 13 5268
A00004 20020810 5 1805
A00004 20021014 2 612
A00004 20021016 11 3410
A00005 20020918 12 4554
A00005 20020923 1 491
A00006 20020606 3 1364
A00006 20020918 5 2195
]]></body>

Example 2. Select the records in dat.xt whose key attribute matches a key in ref.xt when the key attribute has a different name in the two files.
xtcommon -k CustomerID -K Customer -m ref.xt -i dat.xt -o rsl.xt

Reference file - ref.xt:

<field no="1">
<name>Customer</name>
</field>
<field no="2">
<name>Date</name>
</field>
<field no="3">
<name>TotalQuantity</name>
</field>
<field no="4">
<name>TotalAmount</name>
</field>
</header>
<body><![CDATA[
A00004 20020214 1 365
A00004 20020415 9 4349
A00004 20020625 13 5268
A00004 20020810 5 1805
A00004 20021014 2 612
A00004 20021016 11 3410
A00005 20020918 12 4554
A00005 20020923 1 491
]]></body>

Input file - dat.xt:
same as above

Output file - rsl.xt:

<field no="1">
<name>CustomerID</name>
</field>
<field no="2">
<name>Date</name>
</field>
<field no="3">
<name>TotalAmount</name>
</field>
<field no="4">
<name>TotalQuantity</name>
</field>
</header>
<body><![CDATA[
A00004 20020214 1 365
A00004 20020415 9 4349
A00004 20020625 13 5268
A00004 20020810 5 1805
A00004 20021014 2 612
A00004 20021016 11 3410
A00005 20020918 12 4554
A00005 20020923 1 491
]]></body>

 

DIAGNOSTICS

The number of attributes in the reference file must match the number in the input file.

 

SEE ALSO

xtselstr(1), xtsel(1), xtjoin(1), xtnjoin(1)

For complete documentation and a tutorial on xtcommon and other commands, please visit http://musashien.sourceforge.net

 

BUG REPORT

If you find a bug in xtcommon, please send an e-mail to musashi@adm.osaka-sandai.ac.jp. Before sending a bug report, please verify that you have the latest version of MUSASHI. Read this manual carefully to ensure the error is not caused by a quirk in the language.

 

AUTHORS

Yukinobu Hamuro, Naoki Katoh, Katsutoshi Yada, Stephane Cheung


 


This document was created by man2html, using the manual pages.
Time: 22:43:53 GMT, June 24, 2003