This file describes the learn and predict algorithms, and in particular their initialization, which can look convoluted: many steps and verifications are needed before an actual computation can be run.
A run of Kernalytics is composed of the following steps:

1. Data reading. This step is not method-specific: every numerical method reads the data the same way, and the objective is to generate the Gram matrix. Only two files describe this step, `Learn` and `Predict`. Upon parsing the method name, a specific method is called that reads the parameters and runs the numerical method on the data.
2. Parameter parsing. This step is similar but differs across numerical methods: what differs is the description of the parameters. Each method (regression, segmentation, ...) has its own set of parameters, described in the learn and predict directories. Once the parameters have been parsed, the main function of the numerical method is called with the data and parameters, and the result of the analysis is returned.
3. Method run. The run itself is described in the overview.
4. Result writing. The last step is to write all the results as csv files in the root folder of the analysis case.
You can find examples for various algorithms in `Examples.scala`. They all follow the same structure: a root folder is defined (located in `exec`), and either `Learn` or `Predict` is called.
Kernalytics only works with structured data, in the form of a collection of csv files. Even the rscala package is just a thin wrapper that transmits the location of the root folder, which is where all the csv files should be found.
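As an illustration, a root folder for a learn run might look like the sketch below. The file names are those mentioned on this page; the layout and the annotations are an informal illustration, not a prescribed structure.

```
caseStudy/            <- root folder passed to Learn
├── algo.csv          <- algorithm name and global options (e.g. gramOpti)
├── desc.csv          <- kernel descriptions, one per column
├── learnData.csv     <- the data, one observation per row
└── ...               <- results are written back here as csv files
```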
Learn is described first; the differences with Predict are then highlighted.
- `readAndParseFile`: read the `algo.csv` file, check the content and generate a map `key => value` with an entry for each column.
- `readAndParseVars`: read the `learnData.csv` or `learnPredict.csv` file and generate a tuple `(Array[ParsedVar], Index)`. The first element contains the parsed variables, the second the number of observations.
- `cacheGram`: parse the `gramOpti` option in `algo.csv`.
- `readAndParseParam`: read the `desc.csv` file, parse each column and generate an `Array[ParsedParam]`.
- `generateGlobalKerEval`: from the `Array[ParsedVar]` and `Array[ParsedParam]` previously generated, generate the `KerEval` object.
    - `linkParamToData`: link each kernel to a variable in a `KerEvalFuncDescription`, which merges kernel and data information.
    - `multivariateKerEval`: from the list of `KerEvalFuncDescription`, generate the kernel function `(Index, Index) => Real`.
        - `generateKernelFromParamData`: for each `KerEvalFuncDescription`, generate a function `(Index, Index) => Real`.
        - `linearCombKerEvalFunc`: aggregate the individual `(Index, Index) => Real` into a global `(Index, Index) => Real`.
    - Then use the `gramOpti` option to generate a `KerEval` object which uses the cache method provided by the user in `algo.csv`.
- `callAlgo`: parse the `algo` entry in `algo.csv` to launch the corresponding numerical method, for example KMeans:
    - `main`: read and write the parameters and data that are specific to the algorithm. For KMeans, these are the number of classes asked for and the number of iterations the algorithm must be run. Then call the main function of the numerical method.
        - `getNClass`: number of classes asked for.
        - `getNIteration`: number of iterations.
        - `runKMeans`: code of the main algorithm. It assumes that all the data provided has been read and validated.
        - `writeResults`: write the results on the disk.
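The kernel-generation steps above (`generateKernelFromParamData` and `linearCombKerEvalFunc`) can be sketched as follows. This is a minimal sketch with simplified types: `Index` and `Real` stand in for the library's aliases, the kernel choices are arbitrary examples, and the function names mirror, but are not, the actual Kernalytics implementation.

```scala
object KernelSketch {
  type Index = Int
  type Real  = Double
  type KerEvalFunc = (Index, Index) => Real

  // generateKernelFromParamData (sketch): build one kernel function from a
  // variable (here, a column of reals) and a kernel with its parameters.
  def gaussianKernel(data: Array[Real], sd: Real): KerEvalFunc =
    (i, j) => math.exp(-math.pow(data(i) - data(j), 2) / (2.0 * sd * sd))

  def linearKernel(data: Array[Real]): KerEvalFunc =
    (i, j) => data(i) * data(j)

  // linearCombKerEvalFunc (sketch): aggregate weighted individual kernels
  // into one global (Index, Index) => Real.
  def linearCombKerEvalFunc(ks: Array[(Real, KerEvalFunc)]): KerEvalFunc =
    (i, j) => ks.map { case (w, k) => w * k(i, j) }.sum

  def main(args: Array[String]): Unit = {
    val x = Array(0.0, 1.0, 2.0)
    val global = linearCombKerEvalFunc(Array(
      (0.5, linearKernel(x)),
      (0.5, gaussianKernel(x, sd = 1.0))))
    println(global(1, 2)) // 0.5 * (1 * 2) + 0.5 * exp(-0.5)
  }
}
```

The global function is what a `KerEval` would then evaluate, either on demand or through the caching strategy selected by `gramOpti`.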
Essentially similar to exec.Learn.main. The main differences are:
- the `gramOpti` parameter in `algo.csv` is ignored; `Direct()` is always used instead:
    - in prediction, the Gram matrix has dimension `(nObsLearn + nObsPredict) x (nObsLearn + nObsPredict)`, and usually each coefficient is used only once.
- `readAndParseVars2Files` is called instead of `readAndParseVars`, in order to generate a Gram matrix that combines learn and prediction data.
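The combined learn/predict Gram matrix can be pictured as follows. This is a sketch, not the actual `KerEval` implementation: the learn observations occupy indices `0 until nObsLearn`, the predict observations the remaining indices, so one kernel evaluated on the concatenated data yields the whole `(nObsLearn + nObsPredict) x (nObsLearn + nObsPredict)` matrix.

```scala
object CombinedGramSketch {
  type Real = Double

  // Concatenate learn and predict observations of one variable, then
  // evaluate the kernel on every pair of indices.
  def combinedGram(learn: Array[Real],
                   predict: Array[Real],
                   kernel: (Real, Real) => Real): Array[Array[Real]] = {
    val all = learn ++ predict // learn observations first, predict after
    Array.tabulate(all.length, all.length)((i, j) => kernel(all(i), all(j)))
  }

  def main(args: Array[String]): Unit = {
    // 2 learn observations, 1 predict observation, linear kernel.
    val gram = combinedGram(Array(1.0, 2.0), Array(3.0), (x, y) => x * y)
    // gram is 3 x 3; gram(0)(2) mixes a learn and a predict observation.
    println(gram.map(_.mkString(" ")).mkString("\n"))
  }
}
```

Since most coefficients of this matrix are read only once during prediction, computing them directly (`Direct()`) rather than caching the Gram matrix is a reasonable default, which is consistent with `gramOpti` being ignored.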