Quick tutorial and help


Commands and usage

Use the command menu to explore Testarium functions.
Don't forget about the '-h' option.

Introduction

Testarium paradigm

Typical scientific template of the experiment setup:

Project structure

Config.json

Here is an example config with example variable names. It's a good idea to separate objects, subclasses and properties with "." (e.g. "object.property").

{
	"float"			: 1.0,
	"bool"			: 1,
	"int"			: 10,
	"path"			: "/some/path",
	"paramName"		: 0.5,
	"object.property"	: 42,

	"testarium.commitDirectory":	
		".testarium/default/20141110.193532"
}
Keep in mind that Testarium can add its own working entries to commit configs (e.g. "testarium.commitDirectory").
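Since the config is a flat JSON object, the dot in a key is only a naming convention. The sketch below illustrates the idea with the standard json module by grouping keys on the part before the first dot. The helper group_by_prefix is hypothetical and not part of Testarium.

```python
import json
from collections import defaultdict

# Shortened copy of the example config above
CONFIG_TEXT = '''
{
    "float": 1.0,
    "paramName": 0.5,
    "object.property": 42,
    "testarium.commitDirectory": ".testarium/default/20141110.193532"
}
'''

def group_by_prefix(config):
    """Group flat 'object.property' keys by the part before the first dot.

    Keys without a dot are collected under the empty prefix ''.
    (Hypothetical helper, for illustration only.)
    """
    groups = defaultdict(dict)
    for key, value in config.items():
        prefix, dot, rest = key.partition('.')
        if dot:
            groups[prefix][rest] = value
        else:
            groups[''][key] = value
    return dict(groups)

config = json.loads(CONFIG_TEXT)
groups = group_by_prefix(config)
print(groups['object'])     # parameters that belong to "object"
print(groups['testarium'])  # entries added by Testarium itself
```

Grouping by prefix makes it easy to tell your own parameters apart from the entries Testarium writes into a commit config.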

Desc.json

Desc (desc.json) is part of a commit and is stored in the commit directory. It contains the experiment description: scores, comment, name, branch name, duration and the user info returned from MyScore().

{
	"comment": "xxx",   
	"name": "20141110.193532",   
	"score": 0.4899,   
	"params": "{ 'a' : 0.5 }",   
	"branch": "default",   
	"duration": 0.003
}
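A desc like this can also be inspected outside Testarium. The following is a minimal sketch that only assumes the fields shown above, and that "params" always holds a Python-style dict literal as in this example. The function summarize_desc is a hypothetical helper.

```python
import ast
import json

# Copy of the example desc.json above
DESC_TEXT = '''
{
    "comment": "xxx",
    "name": "20141110.193532",
    "score": 0.4899,
    "params": "{ 'a' : 0.5 }",
    "branch": "default",
    "duration": 0.003
}
'''

def summarize_desc(desc):
    """Return a one-line summary of a desc dictionary (hypothetical helper)."""
    # "params" is a string with single-quoted keys, which is not valid JSON,
    # so it is parsed as a Python literal instead (assumption based on the example).
    params = ast.literal_eval(desc.get('params', '{}'))
    return '%s > branch: %s > score: %.4f > params: %s' % (
        desc['name'], desc['branch'], desc['score'], params)

desc = json.loads(DESC_TEXT)
summary = summarize_desc(desc)
print(summary)
```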

Mercurial and git bind

If you create a Mercurial or Git repository in your project directory, Testarium will commit to it whenever "run" is called. Testarium detects the repository by the presence of a .hg or .git directory. If you don't want automatic commits in your repository, just put this line into .testarium/testarium.json:

"coderepos.use": false
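As an illustration of the two mechanisms described above, the sketch below checks for a .git or .hg directory and writes the setting into .testarium/testarium.json. It is not Testarium's own code. The function names are hypothetical, and it assumes testarium.json is a flat JSON object.

```python
import json
import os
import tempfile

def detect_repo(project_dir):
    """Return 'git', 'hg' or None depending on which marker directory exists."""
    for marker, kind in (('.git', 'git'), ('.hg', 'hg')):
        if os.path.isdir(os.path.join(project_dir, marker)):
            return kind
    return None

def disable_auto_commits(project_dir):
    """Set "coderepos.use" to false, keeping any other settings in the file.

    Assumes .testarium/testarium.json holds a flat JSON object.
    """
    path = os.path.join(project_dir, '.testarium', 'testarium.json')
    settings = {}
    if os.path.exists(path):
        with open(path) as f:
            settings = json.load(f)
    settings['coderepos.use'] = False
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'w') as f:
        json.dump(settings, f, indent=4)
    return settings

# Demo in a throwaway directory that imitates a Git project
project = tempfile.mkdtemp()
os.mkdir(os.path.join(project, '.git'))
repo_kind = detect_repo(project)
settings = disable_auto_commits(project)
print(repo_kind, settings)
```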

Integration

Setup

Latest, unstable version:

$ pip install http://testarium.makseq.com/download.html

Stable version (from pypi):

$ pip install testarium

Quick start

project/config/config.json
project/example.py
$ python example.py run
t> New commit: 20141220.162326
t> 20141220.162326 > branch: default > score: 1.06 > time: 0.0

Advanced print

Use your own representation of commits in the console and on the web. It affects the log, diff and where commands and the web interface.

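@testarium.testarium.set_print
def MyPrint(commit):
	# 'a' may be missing from the config, so fall back to an empty string
	try: a = str(commit.config['a'])
	except Exception: a = ''

	score = str(commit.desc['score'])
	# return the column headers and the matching column values
	return ['name', 'a', 'score'], [commit.name, a, score]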

Or a more advanced print function for the web version (with images and plots):

@testarium.testarium.set_print
def print_web(commit):
    h = ['name', 'score', 'time', 'comment', 'loss', 'img1']
    b = [commit.name,
        '%0.3f' % (commit.desc['score']),
        '%0.0f'%(commit.desc['duration']),
        str(commit.desc['comment']).replace('{','').replace('}','').replace('"','').replace('[','').replace(']',''),
        'graph://storage/'+commit.dir+'/plot_loss.json', # loss plot
        'image://storage/'+commit.dir+'/images/0.svg']  # image 
    return h, b
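A print function can be checked without running Testarium by calling it on a stand-in object. The sketch below repeats the body of print_web without the decorator. FakeCommit is a hypothetical class that only carries the attributes the function reads, and the directory value is an assumed layout taken from the "testarium.commitDirectory" example.

```python
class FakeCommit(object):
    """Hypothetical stand-in; only the attributes used by print_web are set."""
    def __init__(self, name, directory, desc):
        self.name = name
        self.dir = directory
        self.desc = desc

# Characters removed from the comment, the same set as in the replace() chain
STRIP_CHARS = str.maketrans('', '', '{}"[]')

def print_web(commit):
    h = ['name', 'score', 'time', 'comment', 'loss', 'img1']
    b = [commit.name,
         '%0.3f' % commit.desc['score'],
         '%0.0f' % commit.desc['duration'],
         str(commit.desc['comment']).translate(STRIP_CHARS),
         'graph://storage/' + commit.dir + '/plot_loss.json',  # loss plot
         'image://storage/' + commit.dir + '/images/0.svg']    # image
    return h, b

commit = FakeCommit(
    name='20141220.162326',
    directory='.testarium/default/20141220.162326',  # assumed layout
    desc={'score': 1.06, 'duration': 0.003, 'comment': '{"lr": [0.1]}'})

headers, values = print_web(commit)
for header, value in zip(headers, values):
    print('%-8s %s' % (header, value))
```

The headers and values lists must have the same length, since each value is shown under the header at the same position.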

Advanced commits compare

Use an alternative metric to compare commits. It affects the log, diff and where commands and the web interface.

@testarium.testarium.set_compare
def MyCompare(self, other):
	# self and other are commit instances

	if self._init: # the commit exists
		self_score = self.desc['score']
		other_score = other.desc['score']

		# a higher score sorts first
		if self_score > other_score: return -1
		elif self_score < other_score: return 1
		else: return 0

	# this branch is used for the worst result
	else: return -1
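To see what ordering such a comparison function produces, it can be tried on stand-in objects with functools.cmp_to_key. This is only a sketch: FakeCommit is hypothetical, and it is an assumption that Testarium applies the function as a classic cmp-style comparator.

```python
from functools import cmp_to_key

class FakeCommit(object):
    """Hypothetical stand-in for a Testarium commit."""
    def __init__(self, name, score):
        self.name = name
        self._init = True            # the commit exists
        self.desc = {'score': score}

def my_compare(self, other):
    """Same logic as MyCompare above, without the decorator."""
    if self._init:
        self_score = self.desc['score']
        other_score = other.desc['score']
        if self_score > other_score:
            return -1
        elif self_score < other_score:
            return 1
        return 0
    return -1

commits = [FakeCommit('a', 0.2), FakeCommit('b', 0.9), FakeCommit('c', 0.5)]
ordered = sorted(commits, key=cmp_to_key(my_compare))
names = [c.name for c in ordered]
print(names)  # commits with higher scores come first
```

Returning -1 for the larger score puts the best commit at the top of the list. Swap the two return values if a lower score is better in your experiment.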

File DB

This is a template example only.

import os

def MyFileInfoExtractor(fname):
    '''
        Prepare a dict with meta data about each file
    '''
    try:
        path = os.path.normpath(fname)
        d = {'duration': 3}

        if 'ivan' in fname:
            d.update({'targets': {'ivan': 1}})  # this file is an 'ivan' file
        else:
            d.update({'targets': {'noname': 1}})  # without speaker name

    except Exception as e:
        print(e)
        return None
    return d  # return dict with meta info about the file

@testarium.experiment.set_run
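def MyRun(commit):
    c = commit.config  # assumed: c is the config of this commit
    exclude = []  # files to exclude
    # test_dir is the directory with your data files; define it yourself
    added, exist, excluded = commit.filedb.ScanDirectoryRecursively(test_dir, c['data.ext'], exclude, MyFileInfoExtractor)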
    commit.filedb.ShuffleFiles()

    test_ids = commit.filedb.GetFilesPortion(c['data.test.part'])  # from 0.0 (0%) to 1.0 (100%)
    test_files = [commit.filedb.GetPath(id_) for id_ in test_ids]

    train_ids = commit.filedb.GetFilesPortion(c['data.train.part'])  # from 0.0 (0%) to 1.0 (100%)
    train_files = [commit.filedb.GetPath(id_) for id_ in train_ids]

    print('%s new files found, %s old files, %s excluded' % (added, exist, excluded))

    # ... TRAIN ...
    # train_ids synced with test array

    probs = clf.predict_proba(test)
    for i, p in enumerate(probs):
        commit.meta.SetMeta(test_ids[i], {'probs': {'ivan': p[1]}})
        
@testarium.experiment.set_score
def MyScore(commit):
    # testarium will calculate fafr automatically based on meta info from filedb
    try: desc = testarium.score.fafr.Score(commit)  
    except: desc = {'score': -1}
    return desc
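For orientation, the sketch below shows what a score of this kind measures, assuming that "fafr" refers to false acceptance (FA) and false rejection (FR) rates. It is not Testarium's implementation. The function fa_fr, the threshold and the trial data are all hypothetical.

```python
def fa_fr(trials, threshold):
    """Compute false acceptance and false rejection rates.

    trials: list of (is_target, prob) pairs, where is_target says whether
    the file really belongs to the target and prob is the classifier output.
    """
    targets = [p for is_target, p in trials if is_target]
    impostors = [p for is_target, p in trials if not is_target]
    if not targets or not impostors:
        raise ValueError('need at least one target and one non-target trial')

    # FR: target files scored below the threshold
    fr = sum(1 for p in targets if p < threshold) / float(len(targets))
    # FA: non-target files scored at or above the threshold
    fa = sum(1 for p in impostors if p >= threshold) / float(len(impostors))
    return fa, fr

# Made-up trials: three target files and three non-target files
trials = [(True, 0.9), (True, 0.4), (True, 0.8),
          (False, 0.7), (False, 0.1), (False, 0.2)]
fa, fr = fa_fr(trials, threshold=0.5)
print('FA: %.3f, FR: %.3f' % (fa, fr))
```

In the template above, the 'targets' entry from MyFileInfoExtractor would play the role of is_target and the 'probs' entry set in MyRun would play the role of prob.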