Synaptical Shiraz Trend Finder (Version 1.1) Help File


Introduction
Shiraz is designed to be a simple-to-use analysis tool for stock market speculation, aimed at the beginning investor or the merely curious.  It implements a simplified user interface for data importing/manipulation while also allowing complex analysis through automatic or manual configuration of the TrendFinder tool.  Feel free to email suggestions, comments or any other feedback to Synaptical; this tool can only improve with customer feedback such as yours.  Thank you.

Synaptical Devices develops commercial products based on network computation models and stresses the advancement of this technology through education and application support.  The Synaptical website contains product descriptions in addition to industry links, an FTP database and various background/educational facts.  Investing based on estimation of leading indicators or forward-in-time valuation involves inherent risk, and the results are left solely to the user.  Synaptical provides no guarantee and accepts no responsibility for the use or results of this program as an investment tool.

Synaptical Devices
PO Box 990
Pasadena CA 91102
818-554-9970
www.synaptical.com

Minimum Requirements
The minimum computer requirements are:

Trend Modeling
The art of trend-finding or estimation has been described by various mathematical methods as applied in the sciences.  In astronomy, prediction of sunspot activity is a common application, while here at home earthquake prediction is popular.  The simpler data modelling techniques of least-squares or cubic-spline curve fitting provide a good trend estimate for past data but perform poorly when predicting future events.  Complex estimation methods such as Kalman filtering and IMM (interacting multiple model) give a much better estimation capability but rely heavily on expert knowledge from the designer.  This incomplete separation of data and knowledge leads to "bias filtering", which serves to color the results and may limit the effectiveness of the prediction.  Shiraz utilizes a neural computational model well suited for being "trained" in the behavior of trends and uses this knowledge base to estimate or extrapolate trends.  Since the model is trained on real data, the method is free of "data coloring", and the only limitation in its ability to predict is the order or complexity of the underlying model.  Shiraz can automatically adjust the model parameters based on data input sources and prediction error results.  The user is free to override the automatic feature in the event customization is desired.

Shiraz Modeling Basics
Data analysis in Shiraz is built on training a computational network using a "window" of data as a snapshot into a longer sequence.  The organization is similar to the commonly used moving average function available on financial websites as a tool for stock trend evaluation.  Historical data, such as a daily closing price, is ordered from oldest to most recent and arranged into a single column in the Shiraz grid.  The overall data set length can vary, and it determines the number of unique training sub-sets available to present to the network as inputs.  A single goal value, usually the next value in the sequence, is selected as the target, that is, the correct output value for a specific input data sub-set.  The size of the input training sub-set is defined by the training window size as detailed below:



The data training subset is presented as an input to the network, and the next-in-sequence goal value is compared to the actual network output value.  A training feedback term is generated based on the error between the goal and the network output, which serves to slightly adjust the network's internal response.  After completion, the training subset and goal are moved down to the next position and the process is repeated until successful network training is achieved.  Training is processed using the "Training" section of the data to force the response of the network.  A separate process called testing is used to evaluate the effectiveness of the training based on unseen data in the "Testing" section of the data.  When the statistical feedback error within the testing section is below the threshold, training is terminated and completion figures of merit are calculated.  The final gauge of modeling accuracy and quality is made using the "Evaluation" section of the data, which represents the latest/untrained values.
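
The windowing described above can be illustrated with a minimal Python sketch.  It is not Shiraz code: the function name make_training_pairs and the sample prices are hypothetical, and it assumes the goal is always the single value immediately following each window.

```python
def make_training_pairs(series, num_taps):
    """Return (window, goal) pairs from an oldest-to-newest series.

    Each window holds num_taps consecutive values and the goal is the
    value that immediately follows it (an assumption of this sketch).
    """
    if num_taps < 1 or num_taps >= len(series):
        raise ValueError("num_taps must be between 1 and len(series) - 1")
    pairs = []
    for start in range(len(series) - num_taps):
        window = series[start:start + num_taps]
        goal = series[start + num_taps]
        pairs.append((window, goal))
    return pairs


# Hypothetical daily closing prices, oldest first.
closes = [10.0, 10.5, 10.2, 10.8, 11.0, 11.3]
pairs = make_training_pairs(closes, num_taps=3)
for window, goal in pairs:
    print(window, "->", goal)
```

Each step of the loop corresponds to moving the training subset and goal down one position in the column.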

Quick Start
Building and using a model in Shiraz is performed in two easy steps: 1) Model Training and Accuracy Step, 2) Estimation or Extrapolation Step.  Data is entered in column-ordered sets for the various inputs and prediction goals associated with the problem.  Follow these instructions to quickly enter and evaluate your data.

  1. Open a new data file from the File/New menu item and give it a convenient name (Shiraz will append the "tfs" file type).
  2. Ensure that the operation mode is selected as "Model Evaluate" in the Results Mode field.
  3. If your data is in spreadsheet format then Copy (Ctrl-C) & Paste (Ctrl-V) it into the Shiraz grid below the comment row for numeric data.  Data containing a header row with text comments should be pasted starting at the comment grid row.  Importing of ASCII type files is supported under the File/Import menu item.
  4. Fill in the comment row with descriptive comments for each column if needed.
  5. Select the type of data for each column at the I/O select row: "Input" (Model Input), "Output" (Model Goal), "Trg/In" (Input & Goal in the same column).  All other columns containing values or informational data need to be marked "Comment".  All unused columns to the right of data/comment columns should be left marked "No Data" to minimize file size.
  6. Select Actions/Auto-Config Vectors to automatically partition the data set into training, testing and evaluation sections for training the model.  Importing of ASCII type files will automatically partition the sections.
  7. In the grid field labelled "Num Taps" set the length of the data subset used for modelling.  A good number to start with is 10% of the training vector section length.
  8. Press the "T" button on the menu bar to start training of the model.  When training stops, exit the process if both slope & value scores are above 70.  Otherwise, press the "Continue" button until a good score is achieved.
  9. Press the "E" button on the menu bar to evaluate model accuracy.  Good training will result in good accuracy scores for both slope and value.
  10. Change the Results Mode field to "Data Extrapolate" to allow estimating of new values.
  11. Retrain the model by following step 6 above in order to utilize the latest data values (highest rows) for best accuracy.  (Note that the evaluation vectors have been replaced by testing vectors)
  12. Press the "E" button to estimate or extrapolate the next values in the data sequence, which will appear in the next available data rows below the training data.  (The number of estimated values is controlled through the Configure/Results dialog box.)

Modelling Details
Building a financial model in Shiraz is as simple as downloading historical data from one of the many internet financial websites, such as Yahoo!, and importing the data set into TrendFinder as simulation inputs/targets.  Model development consists of a two step procedure: first train the model and evaluate its accuracy, then extrapolate the next-in-sequence values.  Shiraz is designed to automatically handle most instances of data trend model building, but all parameters for training and results display are user customizable.  The first step of training the model requires a data set to be partitioned into three sections that are used to evaluate the quality of the prediction and training.  These sections (Training, Testing and Evaluation) are set up within the grid sheet and used by the tool to derive a set of coupled equations used in prediction.  The Training and Testing sets are used for the actual training process, while the Evaluation set allows presentation of "never seen before" data to evaluate the accuracy/level of the training process.  When a satisfactory level of accuracy in the training has been achieved, the next value can be extrapolated with the given training parameters.

Predictions are made for both the slope and value of the next occurrence of the output, based on a previous data group whose length is selected by the tapped memory field.  In selecting the appropriate length for each section the following rule of thumb can be used:

    Number of Memory Taps ~ Length necessary to capture data trends and any repeatable sequence
    Number of Training Vectors ~ 60% of the data set length
    Number of Testing Vectors ~ 20% to 25% of the data set length
    Number of Evaluation Vectors ~ Remaining data set values

These ranges of values will allow a sufficient sample size for generating training and evaluation quality estimates.  Each region of training, testing and evaluation vectors is color coded in the first grid column for easy verification of placement.  By overriding the automatic features of the TrendFinder it is possible to tweak the model's training and error convergence.  Initially, training is terminated when the statistical error measured on the testing group crosses a preset threshold.  In the event that this threshold is set too high, a maximum number of iterations is enforced to terminate the session.  Cumulative error values are displayed while the training process is active, as well as the number of weight updates in the session.  A realistic value for the minimum number of weight set updates is approximately 30 given the vector sizes determined above.
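
The rule-of-thumb section lengths can be worked out with a short Python sketch.  The function name partition_lengths is hypothetical, the 20% testing share is simply the low end of the 20% to 25% range, and the starting tap count follows the Quick Start suggestion of 10% of the training section; how Shiraz itself rounds these values is not documented here.

```python
def partition_lengths(data_length, train_pct=60, test_pct=20):
    """Split a data set length into training, testing and evaluation counts.

    Integer arithmetic is used so the three counts always sum to
    data_length; the evaluation section takes whatever remains.
    """
    if data_length < 3:
        raise ValueError("need at least three data points")
    num_training = data_length * train_pct // 100
    num_testing = data_length * test_pct // 100
    num_eval = data_length - num_training - num_testing
    return num_training, num_testing, num_eval


# Roughly one year of daily closing prices (hypothetical length).
num_training, num_testing, num_eval = partition_lengths(250)
# Starting point for Num Taps: 10% of the training section length.
suggested_taps = max(1, num_training // 10)
print(num_training, num_testing, num_eval, suggested_taps)
```

For a 250-row data set this gives 150 training, 50 testing and 50 evaluation vectors, with 15 taps as a first guess.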

Two estimating methods are provided by Shiraz in order to capture trends found in financial data for equities trading.  The single pass estimator uses a pure computational network approach in fitting representative data sets, working best for situations with no long term slope or deviation.  The two pass estimator gives the best results for data exhibiting long term trend changes by using a hybrid moving average along with a computational network.  Shiraz defaults to the two pass estimator because it gives the best results for the general case, but manual selection is left open to the user.
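
The internals of the two pass estimator are not documented here.  As a rough illustration of why a moving average helps with long term trends, the Python sketch below performs plain moving-average detrending: the series is split into a slow trend and a residual, and a network could then be trained on the residual alone.  The function names and the trailing-average choice are assumptions of this sketch, not the Shiraz algorithm.

```python
def moving_average(series, length):
    """Trailing moving average; early points average what is available."""
    averaged = []
    for i in range(len(series)):
        start = max(0, i - length + 1)
        chunk = series[start:i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


def split_trend(series, length):
    """Separate a series into a slow trend and the remaining residual."""
    trend = moving_average(series, length)
    residual = [value - slow for value, slow in zip(series, trend)]
    return trend, residual


# Hypothetical steadily rising series.
series = [1, 2, 3, 4, 5]
trend, residual = split_trend(series, length=2)
print(trend)
print(residual)
```

Adding the trend back to the residual recovers the original series, so nothing is lost by modelling the two parts separately.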

GUI Description
The following image details the main screen of the Shiraz program for the purposes of control discussion below.


Grid Item: Start Offset - Defines the data point within the historical data buffer where training and evaluation sets start.  Use this variable to move the reference (or starting) time point for any operation.
Grid Item: Num Training - Defines the number of vectors to be used in the training process for TrendFinder.  A reasonable number of points would be between 50 and 200. 
Grid Item: Num Testing - Defines the number of vectors to be used in the testing process for TrendFinder.  A good approximation is 40-50% of the training vector size.
Grid Item: Num Eval - Defines the number of vectors to be used in the evaluation process for TrendFinder.  A good approximation is 50% of the number of testing vectors.
Grid Item: Num Taps - Defines the number of memory taps to be used in each of the training, testing and evaluation processes for TrendFinder.  A reasonable number of points would be between 5 and 15.  This value has a significant impact on estimate generation and on the time required to train.
Grid Item: State - Status indication for the state of the TrendFinder document.  States are defined as: "Init" where parameters are being setup or modified, "Trained" where the estimator is trained for the given parameter set, "Execute" where the estimator has been run using the trained configuration.  Values within the figure of merit and error bounds display are relative to the current state. 
Grid Item: Slope FM - Indicates a relative figure of merit for the quality of the slope estimate.  Value is calculated for the training and evaluation phases.
Grid Item: Value FM - Indicates a relative figure of merit for the quality of the value estimate.  Value is calculated for the training and evaluation phases.
Grid Item: Slope Sign - Indicates the percentage of matches in sign change between the target values and estimator output.  Selecting the "Err:Sign" data item from within the column I/O control field will display results for each vector.  Value is calculated for the training and evaluation phases.
Grid Item: Slope Err Avg - Indicates the mean value for the slope prediction error term between target values and estimator output.  Value is calculated for the training and evaluation phases.
Grid Item: Slope Err SD - Indicates the standard deviation value for the slope prediction error term between target values and estimator output.  Value is calculated for the training and evaluation phases.
Grid Item: Value Err Avg - Indicates the mean value for the prediction estimate error term between target values and estimator output.  Value is calculated for the training and evaluation phases.
Grid Item: Value Err SD - Indicates the standard deviation value for the prediction estimate error term between target values and estimator output.  Value is calculated for the training and evaluation phases.
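
The error statistics shown in the grid can be approximated with the Python sketch below.  It is only an illustration: the function names are hypothetical, the error is assumed to be estimate minus target, and the population standard deviation is used because the exact formulas Shiraz applies are not stated here.

```python
import statistics


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_match_percent(target_slopes, estimate_slopes):
    """Percentage of vectors where target and estimate slopes share a sign."""
    matches = sum(1 for t, e in zip(target_slopes, estimate_slopes)
                  if sign(t) == sign(e))
    return 100.0 * matches / len(target_slopes)


def error_stats(targets, estimates):
    """Mean and population standard deviation of (estimate - target)."""
    errors = [e - t for t, e in zip(targets, estimates)]
    return statistics.mean(errors), statistics.pstdev(errors)


# Hypothetical slope values for four vectors.
target_slopes = [0.5, -0.2, 0.3, -0.1]
estimate_slopes = [0.4, 0.1, 0.2, -0.3]
match_pct = sign_match_percent(target_slopes, estimate_slopes)
err_avg, err_sd = error_stats(target_slopes, estimate_slopes)
print(match_pct, err_avg, err_sd)
```

The same error_stats helper applies to value estimates by passing target and estimated values instead of slopes.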



Window: Input Nodes - Defines the number of input terms being applied to the predictor function for training and evaluation.  This term is automatically calculated by the program independent of configuration mode and is not user selectable.
Window: Output Nodes - Defines the number of output terms being applied to the predictor function for training and evaluation.  Current versions of Shiraz support only a single output term for modelling purposes, and this value is not user selectable.
Window: Layers - Defines the number of layers (or order) of the predictor function.  Adjust this parameter for particularly noisy data sets or when data has a rapid change in value.
Window: Elements/Layer - Defines the number of nodes to be used in each layer in the predictor function.  The first value reflects the number of input nodes and the last is the number of output nodes.  These values must match the other respective fields or an error condition will result from invalid predictor configuration.
Check Box: Auto TrendFinder Configure - Defines the mode for configuring the TrendFinder functionality.
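
When configuring the predictor manually, the consistency rule for the Elements/Layer field can be checked by hand along the lines of the Python sketch below.  The function name check_layer_sizes and the message wording are hypothetical; Shiraz performs its own validation and reports its own error.

```python
def check_layer_sizes(elements_per_layer, input_nodes, output_nodes=1):
    """Return a list of problems found in an Elements/Layer setting.

    The first entry must equal the number of input nodes and the last
    entry must equal the number of output nodes.
    """
    problems = []
    if len(elements_per_layer) < 2:
        return ["at least an input and an output layer are required"]
    if any(count < 1 for count in elements_per_layer):
        problems.append("every layer needs at least one element")
    if elements_per_layer[0] != input_nodes:
        problems.append("first layer does not match Input Nodes")
    if elements_per_layer[-1] != output_nodes:
        problems.append("last layer does not match Output Nodes")
    return problems


good = check_layer_sizes([10, 6, 1], input_nodes=10)
bad = check_layer_sizes([8, 6, 2], input_nodes=10)
print(good)
print(bad)
```

An empty list means the setting is consistent with the Input Nodes and Output Nodes fields.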

Window: Iterations/Cycle - Defines the number of iterations that the training vector set is applied prior to error determination using the testing vector set.  Default value for this field is 10.
Window: Maximum Cycles - Defines the maximum number of training cycles for the session.  Default value for this field is 2000.
Window: Error Ratio Goal - Defines the statistical bounds for testing error determination to terminate the training session.  Default value for this field is 1.2.
Window: Model Learning Rate - Defines the fractional portion of the weight update value determined from the last training cycle.  Default value for this field is 0.05.
Window: Model Gain Term - Defines the feed-forward response of TrendFinder internal processing.  Default value for this field is 1.0.
Window: Model Momentum Term - Defines the fractional portion of the weight update value determined from previous training cycles in the session.  Default value for this field is 0.5.
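
How the cycle and weight-update settings above interact can be sketched with a one-weight toy problem in Python.  This is the textbook gradient-descent-with-momentum rule, assumed here for illustration; the actual Shiraz update and the exact meaning of the Error Ratio Goal are not specified, so a plain error goal stands in for it and the toy error is measured on the same data that is trained.  The defaults (10 iterations per cycle, 2000 cycles, learning rate 0.05, momentum 0.5) are the documented field defaults.

```python
def train(gradient, error, weight, error_goal,
          iterations_per_cycle=10, max_cycles=2000,
          learning_rate=0.05, momentum=0.5):
    """Train one weight in cycles; return (weight, cycles_run, converged)."""
    previous_update = 0.0
    for cycle in range(1, max_cycles + 1):
        for _ in range(iterations_per_cycle):
            # New update = learning-rate share of the current gradient step
            # plus momentum share of the previous update.
            update = (-learning_rate * gradient(weight)
                      + momentum * previous_update)
            weight += update
            previous_update = update
        # The error is checked once per cycle, not after every iteration.
        if error(weight) <= error_goal:
            return weight, cycle, True
    return weight, max_cycles, False


# Toy error surface (hypothetical): minimum at weight = 3.
def toy_error(w):
    return (w - 3.0) ** 2


def toy_gradient(w):
    return 2.0 * (w - 3.0)


final_weight, cycles_run, converged = train(
    toy_gradient, toy_error, weight=0.0, error_goal=1e-4)
print(final_weight, cycles_run, converged)
```

Raising the learning rate or momentum speeds up each step but can make the weight overshoot, which is why small default values are a reasonable starting point.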

List Box: Mode - Selects the type of estimation method used, with choices of "1 Pass" or "2 Pass".  The single pass estimator produces better results for a higher order characteristic data set but can suffer from long term trend bias.  The two pass algorithm uses a unique method having dual estimators matched to differing trend lengths, which accurately models fast/slow data trends.
Check Box: Auto Training Configure - Defines the mode for configuring the Training functionality.



Window: Interval Length - Defines the length of estimation extrapolation performed by TrendFinder during the evaluation phase.  The current version of Shiraz supports only a single extrapolation point.  Default value for this field is 1.
Window: Value Avg Ratio - Displays the coefficient used in the figure of merit value for the Average term.  There are two values available for the training and evaluation phases of model building.
Window: Value Std Dev Ratio - Defines the coefficient used in the figure of merit value for the Standard Deviation term.  There are two values available for the training and evaluation phases of model building.
Check Box: Comment Header Row - Controls usage of the comment row available for text description of column data sets.
Window: Mode - Selects the results mode used by the program during the estimation phase.  Choices are "Evaluate" or "Extrapolate" for determining what output is produced.  The evaluate mode is used as a first step to validate the accuracy of the estimation based on data unseen during the training process.  Evaluation data appears below the training/testing sections in the data set.  The extrapolate mode is used after the training accuracy has been established to predict the next occurrences in the trend.


Training Operation Process Dialog - Controls the training process and provides a qualitative figure of merit for accuracy

Training Operation

Button: Continue - Forces a manual continuation of the training session for increased fitting of the TrendFinder parameters.  The training state of the network is maintained but the iteration count is reset to zero with maximum value enforcement active.
Button: Abort - Forces a manual cancellation of the training session, with the training parameters maintained as they were at the point of termination.  Use to terminate non-converging training sessions prior to hitting the maximum limit of iterations.  Useful for exiting the training process when few or no updates are being performed, leading to excessive training time.
Window: Slope Confidence Level - Provides a figure of merit for the training TrendFinder output slope relative to the modelling goal.  A figure of merit above 70% indicates a well trained operation.  This value, in combination with the numeric confidence level and state, allows a rapid verification of training accuracy and state.
Window: Value Confidence Level - Provides a figure of merit for the training TrendFinder output value relative to the modelling goal.  A figure of merit above 70% indicates a well trained operation.  This value, in combination with the numeric confidence level and state, allows a rapid verification of training accuracy and state.


Training Operation Process Dialog - Provides a qualitative figure of merit for the accuracy of "Never Seen Before" data evaluation

Evaluation Operation

Window: Slope Confidence Level - Provides a figure of merit for the training TrendFinder output slope relative to the modelling goal.  A figure of merit above 70% indicates a well trained operation.
Window: Value Confidence Level - Provides a figure of merit for the training TrendFinder output value relative to the modelling goal.  A figure of merit above 70% indicates a well trained operation.