Basic Commands
Lesson 5: Extracting substrings (xtsubstr)

The xtsubstr command lets the user extract a substring from the specified attribute by giving a start and end position. A common use is extracting the year, or the year and month, from a Date attribute that holds year, month, and day information.

Summary of options and usage: xtsubstr


Using xtsubstr

Goal: Create a dataset containing total quantity and amount information for each month in year 2002.

Step 1: Editing your script with FD

In the FD console, copy xtagg.sh and name the copy xtsubstr.sh by pressing "c+CTRL".

copy the file xtagg.sh as a new name on the current directory
new name : xtsubstr.sh
cp /home/public/lesson/basic/xtagg.sh /home/public/lesson/basic/xtsubstr.sh

Methodology: Select the attributes Date, Quantity, and Amount with the xtcut command, then pipe the output to xtsubstr. In the sample data set, the date is stored in YYYYMMDD format. Set the parameters shown below to extract the first 6 characters of the date and rename the attribute to year_month. Pass the output to xtagg with "|" to compute the total quantity and amount for each year/month. Finally, pipe the output to xtheader and save the result as "xtsubstr.xt".

Step 2: Defining Attributes and Options

Specify the parameters as follows:

Key - -f Date:year_month
Note: Date is the field whose string will be cut down to the range defined in the "-R" argument. You can rename the attribute in the same "-f" parameter by giving "Date:year_month" as the argument.

Range - -R 1_6
Note: The argument -R defines the range of the attribute to be extracted from the starting position to the ending position. The original attribute "Date" will then be replaced by the new format.

Create new string format as new attribute - -A year:month
Note: Instead of replacing the old string with the new string, you can add the new string as a new column appended to the dataset with the "-A" parameter.
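
To make the range and the replace/append choice concrete, here is a small sketch in plain bash and awk. It does not use MUSASHI. The function names substr_range, replace_date, and append_year_month are hypothetical, the sample row is made up, and the sketch assumes that range positions count from 1 and include both ends, as the year_month example suggests. It also assumes the appended column goes last.

```shell
#!/bin/bash
# Sketch only: plain bash and awk, not MUSASHI.

# Hypothetical helper: characters start..end of a string, counted from 1,
# both ends inclusive (the assumed meaning of a range such as 1_6).
substr_range() {
    local value="$1" start="$2" end="$3"
    # bash substring expansion takes a 0-based offset and a length
    printf '%s\n' "${value:start-1:end-start+1}"
}

# Replace the first column with its first 6 characters.
replace_date() {
    awk '{ $1 = substr($1, 1, 6); print }'
}

# Keep every column and append the substring as a new last column.
append_year_month() {
    awk '{ print $0, substr($1, 1, 6) }'
}

substr_range 20020115 1 6                            # prints 200201
printf '%s\n' "20020115 1 150" | replace_date        # prints 200201 1 150
printf '%s\n' "20020115 1 150" | append_year_month   # prints 20020115 1 150 200201
```

The first pipeline corresponds to replacing Date with the shorter string, the second to keeping Date and adding the shorter string beside it.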

The script will look as follows:

#!/bin/bash
#===============================================================
# MUSASHI bash script
#===============================================================

#---- Title
title="Tutorial"

#---- Comment
comment="xtsubstr"

#---- variables
inPath="/home/public/tutorial"

#---------------------------------------------------------------
# commands
#---------------------------------------------------------------
xtcut -f Date,Quantity,Amount -i $inPath/dat.xt |
xtsubstr -f Date:year_month -R 1_6 |
xtagg -k year_month -f Quantity:TotalQuantity,Amount:TotalAmount -c sum |
xtheader -l "$title" -c "$comment" -o xtsubstr.xt
#===============================================================
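
To check the logic of this pipeline without MUSASHI installed, roughly the same month-level totals can be computed with awk. This is only a sketch: the function name monthly_totals is hypothetical, the input is assumed to be plain whitespace-separated lines of "Date Quantity Amount" rather than an .xt file, and the sample rows are made up.

```shell
#!/bin/bash
# Sketch only: plain awk, not MUSASHI. Input lines are assumed to be
# "Date Quantity Amount" separated by whitespace, with Date as YYYYMMDD.
monthly_totals() {
    awk '{
        ym = substr($1, 1, 6)      # first 6 characters: year and month
        qty[ym] += $2              # running total of quantity
        amt[ym] += $3              # running total of amount
    }
    END {
        for (ym in qty) print ym, qty[ym], amt[ym]
    }' | sort
}

# Made-up sample rows, for illustration only
printf '%s\n' \
    "20020101 2 300" \
    "20020115 1 150" \
    "20020203 4 800" | monthly_totals
```

With these three rows the sketch prints one line per month: "200201 3 450" and "200202 4 800".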

Step 3: Running the script

After editing, save and execute the script. Your result should look as follows:
<?xml version="1.0" encoding="euc-jp"?>
<xmltbl version="1.00">
<header>
<title>
Tutorial </title>
<comment>
xtsubstr
</comment>
<field no="1">
<name>year_month</name>
<sort priority="1">
</sort>
</field>
<field no="2">
<name>TotalQuantity</name>
</field>
<field no="3">
<name>TotalAmount</name>
</field>
</header>
<body><![CDATA[
20020101 105
20020102 32
20020103 116
20020104 78
20020105 43
20020106 76
20020107 67
20020108 69
20020109 113
20020110 82
20020111 116
20020112 132
20020113 85
20020114 56
20020115 97
20020116 110
20020117 102
20020118 100
20020119 105


Exercises

Let's practice xtsubstr with some sample reports. Check your results against the scripts and output files given below.

Report name                                Script name    Output file (xt)   Output file (html)
Total quantity and amount for each month   xtsubstr1.sh   xtsubstr1.xt       xtsubstr1.html
Total quantity and amount for each day     xtsubstr2.sh   xtsubstr2.xt       xtsubstr2.html
Total quantity and amount for each hour    xtcount3.sh    xtcount3.xt        xtcount3.html
