Basic Reports: Customer's date of first visit, last visit, and the duration

In this lesson, we will create a report tracking the date of customer's first visit to the store, last date of visit to the store, and the duration in between.

The Output

Customer's date of first visit, last visit, and the duration

Tutorial

Customer First date of visit Last date of visit/TH> Duration
A00001 20020826 20021221 117
A00002 20020112 20021225 347
A00003 20020727 20021008 73
A00004 20020214 20021016 244
A00005 20020918 20020923 5
A00006 20020606 20020918 104
A00007 20020208 20021229 324
A00008 20020101 20020424 113
A00009 20020304 20021215 286
A00010 20020402 20021219 261
A00011 20020609 20020609 0
A00012 20020131 20021202 305
A00014 20020215 20020830 196
A00015 20020629 20020927 90
A00016 20020629 20021217 171
A00017 20020107 20020923 259
A00018 20020302 20020903 185
A00019 20020115 20020910 238
A00020 20020512 20021221 223
 :
 :

The process flow

The first visit date is where the customer first came to the store in the data, and the last visit date is the lastest date that he / she visited in the data. The duration is therefore the difference between the first visit date and the last visit date.

Although there is only one year data in this data set, we can still observe cases of new customers starting their purchase, and stopped coming to the store at some point. Thereby, we will be able to tell the first and last visit date to the store, and identify customers with short visits.

Most commands to create the report is covered in Basic commands. Below is the process flow to generate the report, think through each step carefully before proceeding.

Answer: The process flow

When creating more complicated scripts, it is more efficient and clear to create each section separately and join the results at the end.

Script

After we have prepared the work flow, let's rethink each step and start creating the script.

#/bin/bash
#===============================================================
# MUSASHI bash script
#===============================================================

#---- Title
title="Customer's first visit, last visit, and duration

#---- Comment
comment="Basic Reports"

#---- Variables
inPath="/home/public/tutorial"

#---------------------------------------------------------------
# Command
#---------------------------------------------------------------
#Customer's first visit to the store
xtcat -i $inPath/dat.xt |
xtdelnul -f customer |
xtcut -f customer,date |
xtagg -k customer -f date:firstvisit -c min -o xxfirst

#Customer's last visit to the store
xtcat -i $inPath/dat.xt |
xtdelnul -f customer |
xtcut -f customer,date |
xtagg -k customer -f date:lastvisit -c max -o xxlast

#Total duration of visit
xtjoin -k customer -m xxlast -f lastvisit -i xxfirst |
xtcal -c 'day($lastvisit,$firstvisit)' -a duration |
xtheader -l "$title" -c "$comment" -o visitdate.xt

rm xx*
#===============================================================

Check your results to make sure the commands are properly executed.

Exercises

The following are additional exercises for this lesson, try them out and check your results with the scripts and output given below.

Report Name Script Name Result (xt) Result (html)
The number of new customers to the store by Year and Month visitdate1.sh visitdate1.xt visitdate1.html
The number of customers leaving the store by Year and Month visitdate2.sh visitdate2.xt visitdate2.html