...
Info |
---|
Additional tabs The following additional tabs are provided:
|
...
Info |
---|
Scenario data can be found in the Datasets folder in your Rulex installation. |
The following steps were performed:
...
Procedure | Screenshot |
---|---|
First, we import the san-test dataset, retrieving attribute names from line 1 and attribute types from line 2. Each row of the dataset represents a sequence, composed of a Sequence ID, the date of occurrence, and a variable number of Event IDs. | |
Then, we add a Reshape to Long task to the process to rearrange the dataset, so that the information concerning a purchase of N items is distributed over N rows, with each row including an Order ID/Item ID pair. | |
Then, we connect the Sequence Analysis task to the Reshape to Long task, and configure it as follows:
| |
The extracted frequent sequences can be seen in the Frequent Sequences tab. | |
Now we connect the Anomaly Detection task to the Sequence Analysis task, and configure it as follows:
| |
In the Compression tab of the Options panel, select Closed frequent sequences as Model compression method. | |
To check the results of the computation, right-click the task in the process and select Take a look. The Anomaly Detection task has generated supplementary attributes, which allow us to determine whether, with respect to the considered model, each event is an anomaly. For each anomalous event, if previous events forming an incomplete frequent sequence that involves it were detected, their IDs are shown in the Detected Event column(s), and the event that should have come next is shown in the Missing Event column. The timeout period that elapsed without the missing event being detected is stored in the Timeout column. Otherwise, if the event is anomalous by itself, i.e. if it is not frequent enough to be included in the (compressed) frequent sequences model, the Detected Event column is filled with the ID of the event itself, and both the Timeout and the Missing Event columns are left blank. |
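
The Reshape to Long step in the procedure above can be pictured outside Rulex with a minimal Python sketch. It is not Rulex code and does not reproduce the task's actual options. The function name `reshape_to_long`, the tuple layout (ID, date, then a variable number of event IDs), and the sample rows are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

LongRow = Tuple[str, str, str]


def reshape_to_long(rows: Sequence[Sequence[Optional[str]]]) -> List[LongRow]:
    """Turn one wide row per sequence into one long row per event.

    Each input row is assumed to be (sequence_id, date, event_1, ..., event_N),
    where N may differ from row to row and unused cells may be blank.
    """
    long_rows: List[LongRow] = []
    for row in rows:
        sequence_id, date, *event_ids = row
        for event_id in event_ids:
            # Skip padding cells left empty in shorter sequences.
            if event_id is None or str(event_id).strip() == "":
                continue
            long_rows.append((str(sequence_id), str(date), str(event_id)))
    return long_rows


# Hypothetical sample data, not taken from the san-test dataset.
wide_rows = [
    ("S1", "2020-01-01", "A", "B", "C"),
    ("S2", "2020-01-02", "A", "D", ""),
]
long_rows = reshape_to_long(wide_rows)
for long_row in long_rows:
    print(long_row)
```

A sequence with N events yields N rows, each holding one ID/event pair, which is the shape the Sequence Analysis task receives in this scenario.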
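
The two cases described for anomalous events can be told apart by checking whether the Missing Event column is blank. The sketch below illustrates that reading of the output and is not part of Rulex. It assumes the result rows have been exported as dictionaries, that every detected-event column name starts with `Detected Event`, and that blank cells appear as `None` or empty strings. The function name `describe_anomaly` and the sample values are hypothetical.

```python
from typing import Any, Dict, Mapping


def is_blank(value: Any) -> bool:
    """Treat None and empty or whitespace-only strings as blank cells."""
    return value is None or str(value).strip() == ""


def describe_anomaly(row: Mapping[str, Any]) -> Dict[str, Any]:
    """Interpret one row that has already been identified as anomalous."""
    # Collect the non-blank Detected Event column(s) in column order.
    detected = [
        str(value)
        for name, value in row.items()
        if name.startswith("Detected Event") and not is_blank(value)
    ]
    missing = row.get("Missing Event")
    if is_blank(missing):
        # The event is anomalous by itself: it is not in the frequent
        # sequences model, so Missing Event and Timeout are left blank.
        return {"kind": "isolated", "detected": detected,
                "missing": None, "timeout": None}
    # An incomplete frequent sequence: the expected next event did not
    # arrive before the timeout elapsed.
    return {"kind": "incomplete", "detected": detected,
            "missing": str(missing), "timeout": row.get("Timeout")}


# Hypothetical rows; the values and the timeout format are assumptions.
incomplete = describe_anomaly({"Detected Event 1": "A", "Detected Event 2": "B",
                               "Missing Event": "C", "Timeout": "30"})
isolated = describe_anomaly({"Detected Event 1": "Z", "Detected Event 2": "",
                             "Missing Event": "", "Timeout": ""})
print(incomplete)
print(isolated)
```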
...