MUSASHI tutorial

The goal of this tutorial is to learn how to write programs (scripts) that use MUSASHI to process sales data freely for a variety of purposes. The tutorial is organized into sections by skill level and theme, and each section is made up of several lessons so that you can study step by step. Before you begin, you need to install MUSASHI-CORE and the tutorial data. If that seems like too much trouble, we recommend studying MUSASHI through the CAI (Computer Assisted Instruction) system provided on the MUSASHI-CAI server, which follows a similar approach (currently in preparation). Because this tutorial uses as many sample data sets and scripts as possible, it is written so that it can be understood just by reading the document, without actually running anything. Each lesson takes approximately 30 to 50 minutes. The scripts and result files created in the tutorial are also available for download.


Session 1: Before You Begin

Please review the following key prerequisites before proceeding to the tutorial:

System requirements

Note (data environment): The tutorial below assumes that the various data files are installed under the /mnt/h00/tutorial directory. Directories under /mnt usually cannot be modified without root privileges, so if you do not have root privileges, create a suitable directory elsewhere and store the data there. In that case, wherever the tutorial refers to the /mnt/h00/tutorial directory, read it as the directory you created.
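As a rough illustration of the note above, the choice of data directory could be sketched in Python as follows. This is only a sketch: the function name `resolve_data_dir` and the fallback location are assumptions made for this example and are not part of MUSASHI.

```python
import os
from pathlib import Path

# Directory that the tutorial assumes by default.
DEFAULT_DATA_DIR = Path("/mnt/h00/tutorial")


def resolve_data_dir(fallback: Path, default: Path = DEFAULT_DATA_DIR) -> Path:
    """Return `default` if it exists and is writable.

    Otherwise create `fallback` (a directory you own) and return it.
    """
    if default.is_dir() and os.access(default, os.W_OK):
        return default
    fallback.mkdir(parents=True, exist_ok=True)
    return fallback


# Example call (hypothetical fallback path; any directory you own will do):
#   data_dir = resolve_data_dir(Path.home() / "musashi" / "tutorial")
```

Whatever directory is returned is the one to substitute for /mnt/h00/tutorial in the lessons.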

Prerequisite knowledge

Preliminary work

Session 2: Basic Techniques

Basic Commands

Now that you have successfully configured your machine to be used with MUSASHI, let's move onto some hands-on basic operations of MUSASHI.

Lesson 1. Selection of attributes (xtcut)
Lesson 2. Aggregation I (xtagg)
Lesson 3. Aggregation II (xtcount)
Lesson 4. Sort (xtsort)
Lesson 5. Extract substrings (xtsubstr)
Lesson 6. Record selection I (xtsel)
Lesson 7. Record selection II (xtselstr)
Lesson 8. Delete duplicate keys (xtuniq)
Lesson 9. Calculation among records 1 (xtshare)
Lesson 10. Calculation among records 2 (xtaccum)
Lesson 11. Calculation among attributes (xtcal)
Lesson 12. Concatenating records (xtcat)
Lesson 13. Join (xtjoin)
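
For readers who want a feel for these operations before starting the lessons, the following Python sketch mimics four of them (attribute selection, record selection, sorting, and aggregation) on a tiny in-memory table. It is not MUSASHI syntax: the sample records, field names, and function names are all invented for illustration, and the actual commands and their options are described in the individual lessons.

```python
from collections import defaultdict

# Hypothetical sales records; the field names are invented for illustration.
records = [
    {"customer": "A", "date": "20020101", "brand": "x", "amount": 100},
    {"customer": "B", "date": "20020102", "brand": "y", "amount": 250},
    {"customer": "A", "date": "20020105", "brand": "y", "amount": 300},
    {"customer": "C", "date": "20020105", "brand": "x", "amount": 50},
]


def select_attributes(rows, fields):
    """Keep only the named attributes of each record (compare Lesson 1)."""
    return [{f: row[f] for f in fields} for row in rows]


def select_records(rows, predicate):
    """Keep only the records that satisfy a condition (compare Lesson 6)."""
    return [row for row in rows if predicate(row)]


def sort_records(rows, key_field):
    """Sort records by one attribute (compare Lesson 4)."""
    return sorted(rows, key=lambda row: row[key_field])


def sum_by_key(rows, key_field, value_field):
    """Total one attribute for each key value (compare Lesson 2)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_field]] += row[value_field]
    return dict(totals)


# Chain the steps: drop unused attributes, filter, sort, then aggregate.
trimmed = select_attributes(records, ["customer", "amount"])
large = select_records(trimmed, lambda row: row["amount"] >= 100)
ordered = sort_records(large, "customer")
per_customer = sum_by_key(ordered, "customer", "amount")
print(per_customer)  # {'A': 400, 'B': 250}
```

The point of the sketch is the chaining: each step takes a table and returns a table, which is the same idea the lessons build on when commands are combined.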

Basic Reports

 

A report provides a way to summarize database records and present analysis results. Reports can perform and display calculations that would otherwise be difficult or impossible to achieve with tables and forms. This session outlines how reports can be generated by combining xt commands. Additional operations of the MUSASHI basic commands are also covered in further detail.

You may create a new directory for storing the data and output used in this session, or continue working in the previous directory.

Lesson 1. The most popular brands purchased
Lesson 2. Products purchased along with the most popular brands
Lesson 3. Number of visits per customer I
Lesson 4. Average number of days in between visits
Lesson 5. Customer's first visit date, last visit date, and period of visit
Lesson 6. Percentage of gross margin for each customer
Lesson 7. Percentage of sales volume for customers with membership
Lesson 8. Sales volume by age group
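
To make the kind of report in Lessons 3 to 5 concrete, here is a minimal Python sketch that computes, for each customer, the number of visits, the first and last visit dates, the visit period, and the average number of days between visits. The visit log is invented, and the sketch assumes that several purchases on the same day count as one visit; the lessons themselves may define a visit differently.

```python
from datetime import date

# Hypothetical visit log: (customer id, visit date). Values are invented.
visits = [
    ("A", date(2002, 1, 1)),
    ("A", date(2002, 1, 11)),
    ("A", date(2002, 1, 31)),
    ("B", date(2002, 1, 5)),
]


def visit_report(rows):
    """Summarize visits per customer: count, first/last date, period, mean gap."""
    by_customer = {}
    for customer, day in rows:
        # Assumption: repeated purchases on one day are a single visit.
        by_customer.setdefault(customer, set()).add(day)

    report = {}
    for customer, days in by_customer.items():
        ordered = sorted(days)
        period = (ordered[-1] - ordered[0]).days
        gaps = len(ordered) - 1
        report[customer] = {
            "visits": len(ordered),
            "first": ordered[0],
            "last": ordered[-1],
            "period_days": period,
            # The average gap is undefined for a customer with a single visit.
            "avg_gap_days": period / gaps if gaps else None,
        }
    return report


report = visit_report(visits)
```

With the sample data, customer A has 3 visits over a 30-day period, so the average gap is 15 days, while customer B has a single visit and no defined gap.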
 

Session 3: Advanced Techniques

Advanced Commands

Once you have mastered the previous sessions on using the basic commands and reporting functions by combining the use of several commands, you are ready to move on. In this session, you will be exposed to more advanced commands for data transformation, data selection, joining files, calculation, and creating clusters.

Lesson 1: Record selection (xtselstr)
Lesson 2: Record partition
Lesson 3: Replacing null values (xtnulto)
Lesson 4: Replacing character strings 1 (xtchgstr)
Lesson 5: Replacing character strings 2 (xtsed)
Lesson 6: Replacing numeric values by character strings (xtchgnum)
Lesson 7: Random number generation (xtrand)
Lesson 8: Joining two data files with common key attributes (xtnjoin)
Lesson 9: Direct product operation (xtproduct)
Lesson 10: Selecting common records (xtcommon)
Lesson 11: Bucket partition of numeric values in uniform ranges (xtbucket)
Lesson 12: Arithmetic calculation (xtcal)
Lesson 13: Generating combinations (xtcombi)
Lesson 14: Inserting Unix commands (xt2txt)

Advanced Reports

Lesson 1: Number of visits per customer II

XML Data Handling

Data Mining Commands

  1. Classification (xtclassify)
  2. Association rule (xtasrule)
  3. Clustering (xtkmean)
  4. Counting occurrences of substrings and subsequences (xtcntseq)
  5. Image region segmentation (xtregionseg)
  6. Classification model by majority vote based on region rules (xtregionvote)
  7. Case-Based Reasoning (xtcbr)
  8. Binomial logit regression model (xtlogit)
  9. Multi-objective genetic algorithm (xtmoga)

  1. Creating training data and test data
  2. Creating data for cross-validation
  3. Using an existing classification model (PMML)
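
As background for the data preparation topics in this list, the sketch below shows one common way to split records into training and test sets and to repeat that split for k-fold cross-validation. It is plain Python written for illustration, not a MUSASHI command; the function name `k_fold_splits` and the fixed random seed are assumptions of this example.

```python
import random


def k_fold_splits(items, k, seed=0):
    """Yield (training, test) lists for k-fold cross-validation.

    Each item appears in exactly one test set across the k splits.
    """
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps splits repeatable
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, test


# Example: 10 record numbers split into 5 folds of 2 test records each.
splits = list(k_fold_splits(range(10), k=5))
```

Taking only the first pair from `splits` gives a single training/test split; iterating over all pairs gives the full cross-validation.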

Session 4: Data Transformation Commands

MUSASHI includes functions for converting generated xmlTable output to text, HTML, and XML data. The following tutorial walks you through how this can be done.

  1. Converting a text file to xmlTable
  2. Converting NULL values in a text file to "*"
  3. Appending a fixed character string to a given item
  4. Checking the number of items

Scenarios

  1. RFM analysis
  2. Best customer analysis
  3. Market basket analysis
  4. Association strength analysis (brand switching analysis)
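
For orientation, RFM analysis (the first scenario) describes each customer by Recency, Frequency, and Monetary value. The Python sketch below computes these three raw values from an invented purchase list. It is an assumption-laden illustration, not the procedure used in the scenario itself, which may bin or score the values differently.

```python
from datetime import date

# Hypothetical purchases: (customer id, purchase date, amount).
purchases = [
    ("A", date(2002, 1, 1), 100),
    ("A", date(2002, 1, 20), 300),
    ("B", date(2002, 1, 5), 250),
]


def rfm(rows, as_of):
    """Return {customer: (recency_days, frequency, monetary)}.

    recency_days: days from the customer's last purchase to `as_of`
    frequency:    number of purchases
    monetary:     total amount spent
    """
    table = {}
    for customer, day, amount in rows:
        last, count, total = table.get(customer, (day, 0, 0))
        table[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((as_of - last).days, count, total)
        for customer, (last, count, total) in table.items()
    }


scores = rfm(purchases, as_of=date(2002, 1, 31))
```

With the sample data and a reference date of 31 January 2002, customer A gets (11, 2, 400) and customer B gets (26, 1, 250).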