Install Ivory using pip.

$ pip install ivory

Ivory Client

Ivory provides the Client class, which manages the machine learning workflow. Let's create your first Client instance. In this quickstart, we work with the examples under the examples directory. Pass "examples" as the first argument to ivory.create_client():

import ivory

client = ivory.create_client("examples")

[3] 2020-06-20 15:23:33 (677ms) python3 (1.37s)


The representation of the client shows that it has two instances. These instances can be accessed by index notation or dot notation.

client[0]  # or client['tracker'], or client.tracker

[4] 2020-06-20 15:23:34 (3.00ms) python3 (1.37s)

Tracker(tracking_uri='file:///C:/Users/daizu/Documents/github/ivory/examples/mlruns', artifact_location=None)

The first instance is a Tracker instance that connects Ivory to MLflow Tracking.

Because a Client instance is iterable, you can get the names of all its instances by applying list() to it.

list(client)

[5] 2020-06-20 15:23:34 (3.00ms) python3 (1.37s)

['tracker', 'tuner']

The second instance is named tuner.

client.tuner

[6] 2020-06-20 15:23:34 (3.00ms) python3 (1.37s)

Tuner(storage='sqlite://', sampler=None, pruner=None, load_if_exists=True)

A Tuner instance connects Ivory to Optuna, a hyperparameter optimization framework.
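The three access styles shown above (by position, by name, and by attribute) plus iteration over names can be illustrated with a small stand-alone container. This is only a sketch of the access pattern, not Ivory's implementation: the Container class and the placeholder strings are hypothetical.

```python
class Container:
    """Hypothetical container: members are reachable by position, name, or attribute."""

    def __init__(self, **instances):
        self._instances = dict(instances)  # insertion order is preserved

    def __getitem__(self, key):
        if isinstance(key, int):  # positional access, e.g. box[0]
            key = list(self._instances)[key]
        return self._instances[key]  # name access, e.g. box['tracker']

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, e.g. box.tracker
        try:
            return self.__dict__["_instances"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __iter__(self):
        return iter(self._instances)  # iterating yields the member names


box = Container(tracker="tracker-object", tuner="tuner-object")
print(box[0], box["tracker"], box.tracker)
print(list(box))
```

Iterating yields names rather than objects, which mirrors the list of names returned for the client above.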

We can customize these instances with a YAML file named client.yml under the working directory. In our case, the file just contains the minimal settings.

File 1 client.yml



If you don't need any customization, the YAML file for client is not required. If there is no file for client, Ivory creates a default client with a tracker and tuner. (So, the above file is unnecessary.)

If you don't need a tracker and/or tuner, for example in debugging, use ivory.create_client(tracker=False, tuner=False).

Create NumPy data

In this quickstart, we try to predict the area of rectangles from their width and height using PyTorch. First, prepare the data as NumPy arrays. create_data() is defined in rectangle/data.py under the working directory. ivory.create_client() automatically inserts the working directory into sys.path, so we can import the module regardless of the current directory.

Let's check the create_data() code and an example output:

File 2 rectangle/data.py

from dataclasses import dataclass

import numpy as np

import ivory.core.data
from ivory.utils.fold import kfold_split

def create_data(num_samples=1000):
    xy = 4 * np.random.rand(num_samples, 2) + 1
    xy = xy.astype(np.float32)
    dx = 0.1 * (np.random.rand(num_samples) - 0.5)
    dy = 0.1 * (np.random.rand(num_samples) - 0.5)
    z = ((xy[:, 0] + dx) * (xy[:, 1] + dy)).astype(np.float32)
    return xy, z

@dataclass(repr=False)
class Data(ivory.core.data.Data):
    n_splits: int = 4

    DATA = create_data(1000)  # Shared by each run.

    def init(self):  # Called from self.__post_init__()
        self.input, self.target = self.DATA
        self.index = np.arange(len(self.input))
        # Extra fold for test data.
        self.fold = kfold_split(self.input, n_splits=self.n_splits + 1)

        # Creating dummy test data just for demonstration.
        is_test = self.fold == self.n_splits  # Use an extra fold.
        self.fold[is_test] = -1  # -1 for test data.
        self.target = self.target.copy()  # n_splits may be different among runs.
        self.target[is_test] = np.nan  # Delete target for test data.

        self.target = self.target.reshape(-1, 1)  # (sample, class)

def transform(mode, input, target):
    return input, target.reshape(-1)

import rectangle.data

xy, z = rectangle.data.create_data(4)
xy

[8] 2020-06-20 15:23:34 (4.00ms) python3 (1.38s)

array([[2.2804623, 1.3581246],
       [1.453418 , 3.905157 ],
       [4.4610925, 2.598797 ],
       [4.1255593, 3.9046824]], dtype=float32)

z

[9] 2020-06-20 15:23:34 (4.00ms) python3 (1.39s)

array([ 3.1574094,  5.7998157, 11.530496 , 16.19652  ], dtype=float32)
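As a quick sanity check on the data recipe, the sketch below repeats create_data() from rectangle/data.py and compares the noisy target with the exact area. The seed and the 0.51 bound are choices made for this sketch only: with sides in [1, 5) and noise of at most 0.05 per side, the deviation cannot exceed 5 * 0.05 + 5 * 0.05 + 0.05 * 0.05 = 0.5025.

```python
import numpy as np


def create_data(num_samples=1000):
    # Same recipe as rectangle/data.py: sides in [1, 5), small noise on each side.
    xy = 4 * np.random.rand(num_samples, 2) + 1
    xy = xy.astype(np.float32)
    dx = 0.1 * (np.random.rand(num_samples) - 0.5)
    dy = 0.1 * (np.random.rand(num_samples) - 0.5)
    z = ((xy[:, 0] + dx) * (xy[:, 1] + dy)).astype(np.float32)
    return xy, z


np.random.seed(0)  # seed chosen only to make this sketch reproducible
xy, z = create_data(1000)
area = xy[:, 0] * xy[:, 1]  # exact area without noise
max_error = float(np.max(np.abs(z - area)))
print(xy.shape, z.shape, max_error < 0.51)
```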

ivory.utils.fold.kfold_split() creates a fold array.

import numpy as np
from ivory.utils.fold import kfold_split

kfold_split(np.arange(10), n_splits=3)

[10] 2020-06-20 15:23:34 (4.00ms) python3 (1.39s)

array([2, 1, 0, 2, 0, 2, 1, 1, 0, 0], dtype=int8)
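A fold array like this one is typically turned into train, validation, and test index sets with boolean masks. The sketch below follows the convention used in rectangle/data.py, where -1 marks test data. The fold values and the split_indices() helper are hypothetical and are not part of Ivory.

```python
import numpy as np

# Hypothetical fold array for 10 samples: 0..2 are CV folds, -1 marks test data.
fold = np.array([2, 1, 0, 2, 0, -1, 1, 1, 0, -1], dtype=np.int8)
index = np.arange(len(fold))


def split_indices(fold, index, val_fold):
    """Return (train, val, test) index arrays for one validation fold."""
    is_test = fold == -1
    is_val = fold == val_fold
    is_train = ~is_test & ~is_val
    return index[is_train], index[is_val], index[is_test]


train_idx, val_idx, test_idx = split_indices(fold, index, val_fold=0)
print(train_idx, val_idx, test_idx)
```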

Set of Data Classes

Ivory defines a set of base classes for data (Data, Dataset, Datasets, and DataLoaders) that custom classes can inherit from. For now, we use only Data.

Now, we can get a rectangle.data.Data instance.

data = rectangle.data.Data()
data

[11] 2020-06-20 15:23:34 (4.00ms) python3 (1.40s)

Data(train_size=800, test_size=200)

data.get(0)  # Get the sample at index 0.

[12] 2020-06-20 15:23:34 (5.00ms) python3 (1.40s)

(0,
 array([2.0772183, 4.9461417], dtype=float32),
 array([10.418284], dtype=float32))

The returned value is a tuple of (index, input, target). Ivory always keeps the data index so that we know where each sample comes from.

Define a model

We use a simple MLP model. Note that the number of hidden layers and the size of each hidden layer are customizable.

File 3 rectangle/torch.py

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self, hidden_sizes):
        super().__init__()  # Required before assigning submodules to an nn.Module.
        layers = []
        for in_features, out_features in zip([2] + hidden_sizes, hidden_sizes + [1]):
            layers.append(nn.Linear(in_features, out_features))
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        for layer in self.layers[:-1]:
            x = F.relu(layer(x))
        return self.layers[-1](x)
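The zip() expression in Model.__init__ pairs each layer's input size with its output size. The sketch below reproduces only that pairing in plain Python so the resulting shapes are easy to inspect. The layer_shapes() helper is hypothetical and exists only for this illustration.

```python
def layer_shapes(hidden_sizes, in_features=2, out_features=1):
    """Return (in_features, out_features) pairs for consecutive Linear layers."""
    return list(zip([in_features] + hidden_sizes, hidden_sizes + [out_features]))


print(layer_shapes([20, 30]))      # [(2, 20), (20, 30), (30, 1)]
print(layer_shapes([40, 50, 60]))  # [(2, 40), (40, 50), (50, 60), (60, 1)]
```

With hidden_sizes=[20, 30] this gives three Linear layers: two hidden layers followed by the output layer.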

Parameter file for Run

Ivory configures a run using a YAML file. Here is a full example.

File 4 torch.yaml

library: torch
datasets:
  data:
    class: rectangle.data.Data
    n_splits: 4
  dataset:
  fold: 0
model:
  class: rectangle.torch.Model
  hidden_sizes: [20, 30]
optimizer:
  class: torch.optim.SGD
  params: $.model.parameters()
  lr: 1e-3
scheduler:
  class: torch.optim.lr_scheduler.ReduceLROnPlateau
  optimizer: $
  factor: 0.5
  patience: 4
results:
metrics:
monitor:
  metric: val_loss
early_stopping:
  patience: 10
trainer:
  loss: mse
  batch_size: 10
  epochs: 10
  shuffle: true
  verbose: 2

Let's create a run by calling Client.create_run().

run = client.create_run('torch')

[14] 2020-06-20 15:23:34 (298ms) python3 (1.97s)

[I 200620 15:23:34 tracker:48] A new experiment created with name: 'torch'
Run(id='7b9e0effe2c84d1b90a4def042be41a1', name='run#0', num_instances=12)


Client.create_run(<name>) creates an experiment named <name> if it does not exist yet.

Alternatively, you can create an experiment directly and then have it create a run:

experiment = client.create_experiment('torch')
run = experiment.create_run()

A Run instance has a params attribute that holds the parameters for the run.

import yaml

print(yaml.dump(run.params, sort_keys=False))

[15] 2020-06-20 15:23:34 (7.00ms) python3 (1.98s)

run:
  datasets:
    data:
      class: rectangle.data.Data
      n_splits: 4
    dataset:
      def: ivory.torch.data.Dataset
    fold: 0
    class: ivory.core.data.Datasets
  model:
    class: rectangle.torch.Model
    hidden_sizes:
    - 20
    - 30
  optimizer:
    class: torch.optim.SGD
    params: $.model.parameters()
    lr: 0.001
  scheduler:
    class: torch.optim.lr_scheduler.ReduceLROnPlateau
    optimizer: $
    factor: 0.5
    patience: 4
  results:
    class: ivory.torch.results.Results
  metrics:
    class: ivory.torch.metrics.Metrics
  monitor:
    metric: val_loss
    class: ivory.callbacks.monitor.Monitor
  early_stopping:
    patience: 10
    class: ivory.callbacks.early_stopping.EarlyStopping
  trainer:
    loss: mse
    batch_size: 10
    epochs: 10
    shuffle: true
    verbose: 2
    class: ivory.torch.trainer.Trainer
  class: ivory.torch.run.Run
  name: run#0
  id: 7b9e0effe2c84d1b90a4def042be41a1
experiment:
  name: torch
  class: ivory.core.base.Experiment
  id: '1'

This is similar to the YAML file we read before, but has been slightly changed.

  • Run and experiment keys are inserted.
  • Run name is assigned by Ivory Client.
  • Experiment ID and Run ID are assigned by MLflow Tracking.
  • Default classes are specified, for example the ivory.torch.trainer.Trainer class for a trainer instance.

Client.create_run() accepts keyword arguments that override these parameters:

run = client.create_run(
    'torch', fold=3, hidden_sizes=[40, 50, 60],
)

print(yaml.dump(run.params['run']['datasets'], sort_keys=False))
print(yaml.dump(run.params['run']['model'], sort_keys=False))

[16] 2020-06-20 15:23:34 (43.0ms) python3 (2.02s)

data:
  class: rectangle.data.Data
  n_splits: 4
dataset:
  def: ivory.torch.data.Dataset
fold: 3
class: ivory.core.data.Datasets

class: rectangle.torch.Model
hidden_sizes:
- 40
- 50
- 60
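One way to picture how keyword arguments such as fold=3 can reach nested parameters is a recursive override on a nested dictionary. The sketch below is a hypothetical illustration of that idea, not Ivory's actual update logic: the override() helper and the reduced params dictionary are assumptions made for this example.

```python
import copy


def override(params, **updates):
    """Return a copy of params with every matching nested key replaced."""
    params = copy.deepcopy(params)  # leave the original parameters untouched

    def visit(node):
        for key, value in node.items():
            if key in updates:
                node[key] = updates[key]
            elif isinstance(value, dict):
                visit(value)

    visit(params)
    return params


# Reduced, hypothetical version of the parameters shown above.
params = {
    "datasets": {"data": {"n_splits": 4}, "fold": 0},
    "model": {"hidden_sizes": [20, 30]},
}
new_params = override(params, fold=3, hidden_sizes=[40, 50, 60])
print(new_params["datasets"]["fold"], new_params["model"]["hidden_sizes"])
```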

Train a model

Once you have a run instance, all you need to do is start it.

run = client.create_run('torch')  # Back to the default settings.
run.start()

[17] 2020-06-20 15:23:34 (1.28s) python3 (3.30s)

[epoch#0] loss=21.26 val_loss=7.393 lr=0.001 best
[epoch#1] loss=7.776 val_loss=6.682 lr=0.001 best
[epoch#2] loss=6.895 val_loss=5.736 lr=0.001 best
[epoch#3] loss=6.053 val_loss=5.007 lr=0.001 best
[epoch#4] loss=5.333 val_loss=4.328 lr=0.001 best
[epoch#5] loss=4.529 val_loss=3.47 lr=0.001 best
[epoch#6] loss=3.624 val_loss=2.702 lr=0.001 best
[epoch#7] loss=2.863 val_loss=2.34 lr=0.001 best
[epoch#8] loss=2.1 val_loss=1.446 lr=0.001 best
[epoch#9] loss=1.599 val_loss=1.051 lr=0.001 best

The history of metrics is saved as the history attribute of the run.metrics instance.

run.metrics.history

[18] 2020-06-20 15:23:36 (4.00ms) python3 (3.31s)

Dict(['loss', 'val_loss', 'lr'])

[19] 2020-06-20 15:23:36 (4.00ms) python3 (3.31s)

{0: 7.392936539649964,
 1: 6.681512886285782,
 2: 5.736435055732727,
 3: 5.0073208689689634,
 4: 4.32755663394928,
 5: 3.4701678693294524,
 6: 2.7017099022865296,
 7: 2.3395130217075346,
 8: 1.4459084510803222,
 9: 1.0509242355823516}
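Because the history is a mapping from epoch to value, ordinary dictionary tools are enough to analyze it. The sketch below uses the validation losses shown above, rounded to three decimals and copied by hand into a plain dict, to find the best epoch and the epochs that improved on all earlier ones.

```python
# Validation loss per epoch, rounded from the history shown above.
val_loss = {0: 7.393, 1: 6.682, 2: 5.736, 3: 5.007, 4: 4.328,
            5: 3.470, 6: 2.702, 7: 2.340, 8: 1.446, 9: 1.051}

# Epoch with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)

# Epochs whose loss is lower than every earlier epoch.
improved = [
    epoch for epoch in val_loss
    if epoch == 0 or val_loss[epoch] < min(val_loss[e] for e in range(epoch))
]
print(best_epoch, improved)
```

Every epoch appears in improved, which agrees with the "best" tag on each line of the training log.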

The model outputs and targets are also collected automatically in the run.results instance.

run.results

[20] 2020-06-20 15:23:36 (4.00ms) python3 (3.32s)

Results(['train', 'val'])

[21] 2020-06-20 15:23:36 (4.00ms) python3 (3.32s)

array([[9.240469 ],
       [9.394586 ],
       [8.627342 ],
       [4.2040167]], dtype=float32)

[22] 2020-06-20 15:23:36 (4.00ms) python3 (3.32s)

array([[9.582911 ],
       [8.077265 ],
       [8.156037 ],
       [3.401937 ]], dtype=float32)

Test a model

Testing a model is as simple as training. Just call Run.start('test') instead of using the default 'train' argument.

run.start('test')
run.results

[23] 2020-06-20 15:23:36 (38.0ms) python3 (3.36s)

Results(['train', 'val', 'test'])

As you can see, test results were added.


[24] 2020-06-20 15:23:36 (4.00ms) python3 (3.37s)

       [ 7.707018],
       [15.816204]], dtype=float32)

Of course, the target values for the test data are np.nan.


[25] 2020-06-20 15:23:36 (4.00ms) python3 (3.37s)

       [nan]], dtype=float32)

Task for multiple runs

Ivory implements a special run type called Task that controls multiple nested runs. A task is useful for parameter search or cross validation.

task = client.create_task('torch')

[26] 2020-06-20 15:23:36 (49.0ms) python3 (3.42s)

Task(id='d7765754a1774d4da508252590cd0ac9', name='task#0', num_instances=3)

The Task class has two functions to generate multiple runs: Task.product() and Task.chain(). They behave like their namesakes in the itertools module of the Python standard library. Let's try to perform cross validation.

runs = task.product(fold=range(4), verbose=0, epochs=3)
runs

[27] 2020-06-20 15:23:36 (3.00ms) python3 (3.42s)

<generator object Task.product at 0x000001404B6F95C8>

Like the itertools functions, Task.product() and Task.chain() return a generator that yields runs configured with the parameters you specify. In this case, the generator yields 4 runs, one for each fold number from 0 to 3. A task instance does not start any training by itself. In addition, you can pass fixed parameters (here verbose and epochs) to override the original values in the YAML file.

Then start the 4 runs in a for loop that calls run.start('both'). Here 'both' means that testing follows training.

for run in runs:
    run.start('both')

[28] 2020-06-20 15:23:36 (2.14s) python3 (5.56s)

[run#3] epochs=3 fold=0
[run#4] epochs=3 fold=1
[run#5] epochs=3 fold=2
[run#6] epochs=3 fold=3
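The parameter combinations behind such a generator can be sketched with itertools.product() alone. This is an illustration of the idea, not Ivory's Task.product(): the product_params() helper is hypothetical, and it takes fixed values through an explicit fixed argument instead of detecting them from the keyword arguments.

```python
import itertools


def product_params(fixed=None, **grid):
    """Lazily yield one parameter dict per combination of the grid values."""
    fixed = fixed or {}
    keys = list(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        yield {**fixed, **dict(zip(keys, values))}


combos = list(product_params(fixed={"verbose": 0, "epochs": 3}, fold=range(4)))
for params in combos:
    print(params)

# Two varying parameters give the full Cartesian product.
grid = list(product_params(fold=range(2), lr=[1e-3, 1e-2]))
print(len(grid))
```

Because the helper is a generator function, nothing is computed until the caller iterates, which matches the lazy behavior described above.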

Collect runs

Our client has a Tracker instance. It stores the state of runs in the background using MLflow Tracking. The Client provides several functions to access the stored runs. For example, Client.search_run_ids() returns a generator that yields the Run IDs assigned by MLflow Tracking.

# A helper function.
def print_run_info(run_ids):
    for run_id in run_ids:
        print(run_id[:5], client.get_run_name(run_id))

[29] 2020-06-20 15:23:38 (3.00ms) python3 (5.56s)

run_ids = client.search_run_ids('torch')  # Yields all runs of `torch`.
print_run_info(run_ids)

[30] 2020-06-20 15:23:38 (81.7ms) python3 (5.64s)

05ac5 run#6
d9d3d run#5
88e6b run#4
b9b09 run#3
d7765 task#0
63398 run#2
02178 run#1
7b9e0 run#0

For filtering, add key-value pairs.

# If `exclude_parent` is True, parent runs are excluded.
run_ids = client.search_run_ids('torch', fold=0, exclude_parent=True)
print_run_info(run_ids)

[31] 2020-06-20 15:23:38 (162ms) python3 (5.80s)

b9b09 run#3
63398 run#2
7b9e0 run#0

# If `parent_run_id` is specified, nested runs with the parent are returned.
run_ids = client.search_run_ids('torch', parent_run_id=task.id)
print_run_info(run_ids)

[32] 2020-06-20 15:23:38 (45.0ms) python3 (5.85s)

05ac5 run#6
d9d3d run#5
88e6b run#4
b9b09 run#3

Client.get_run_id() and Client.get_run_ids() fetch Run IDs from a run name or, more precisely, from a key-value pair of (run class name in lower case, run number).

run_ids = [client.get_run_id('torch', run=0),
           client.get_run_id('torch', task=0)]
print_run_info(run_ids)

[33] 2020-06-20 15:23:38 (53.0ms) python3 (5.90s)

7b9e0 run#0
d7765 task#0

run_ids = client.get_run_ids('torch', run=range(2, 4))
print_run_info(run_ids)

[34] 2020-06-20 15:23:38 (54.0ms) python3 (5.96s)

63398 run#2
b9b09 run#3
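The run names in these listings follow a "<class name>#<number>" pattern, so a name maps directly to the key-value pair used in the calls above. The parse_run_name() helper below is hypothetical and only illustrates that mapping.

```python
def parse_run_name(name):
    """Split a run name such as 'task#0' into ('task', 0)."""
    kind, _, number = name.partition("#")
    return kind, int(number)


for name in ["run#0", "task#0", "run#6"]:
    kind, number = parse_run_name(name)
    print(f"{name} -> {kind}={number}")
```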

Load runs and results

A Client instance can load runs. First, select the Run ID(s) to load. We want to perform cross validation here, so we need the collection of runs created by task#0. In this case, we can use Client.get_nested_run_ids(). Why not use Client.search_run_ids() as we did above? Because once we restart a Python session and lose the Task instance, there is no easy way to recover its very long Run ID. A run name, on the other hand, is easy to manage and write.

# Assume that we restarted a session so we have no run instances now.
run_ids = list(client.get_nested_run_ids('torch', task=0))
print_run_info(run_ids)

[35] 2020-06-20 15:23:38 (76.7ms) python3 (6.03s)

05ac5 run#6
d9d3d run#5
88e6b run#4
b9b09 run#3

Let's load the latest run.

run = client.load_run(run_ids[0])

[36] 2020-06-20 15:23:38 (47.0ms) python3 (6.08s)

Run(id='05ac5e6f875e41688e74f9f89108a55a', name='run#6', num_instances=11)

Note that Client.load_run() doesn't require an experiment name because a Run ID is a UUID.

As expected, the fold number is 3.


[37] 2020-06-20 15:23:39 (4.00ms) python3 (6.08s)


By loading a run, we obtain the pretrained model.


[38] 2020-06-20 15:23:39 (5.00ms) python3 (6.09s)

Model(
  (layers): ModuleList(
    (0): Linear(in_features=2, out_features=20, bias=True)
    (1): Linear(in_features=20, out_features=30, bias=True)
    (2): Linear(in_features=30, out_features=1, bias=True)
  )
)

import torch

index, input, target = run.datasets.val[:5]
with torch.no_grad():
    output = run.model(torch.tensor(input))

[39] 2020-06-20 15:23:39 (9.00ms) python3 (6.10s)

[[ 5.89771 ]
 [ 8.328902]
[[ 2.709134]
 [ 5.110198]

If you don't need a whole run instance, Client.load_instance() is a better choice to save time and memory.

results = client.load_instance(run_ids[0], 'results')

[40] 2020-06-20 15:23:39 (27.0ms) python3 (6.12s)

Results(['train', 'val', 'test'])

for mode, result in results.items():
    print(mode, result.output.shape)

[41] 2020-06-20 15:23:39 (8.00ms) python3 (6.13s)

train (600, 1)
val (200, 1)
test (200, 1)

For cross validation, we need 4 runs. To load the results of multiple runs at the same time, the Ivory Client provides a convenient function.

results = client.load_results(run_ids, verbose=False)  # No progress bar.

[42] 2020-06-20 15:23:39 (92.8ms) python3 (6.23s)

Results(['val', 'test'])

for mode, result in results.items():
    print(mode, result.output.shape)

[43] 2020-06-20 15:23:39 (6.00ms) python3 (6.23s)

val (800, 1)
test (800, 1)


Client.load_results() drops the training data to save memory.

The validation and test data both have length 800 (200 times 4). But be careful with the test data: the number of unique samples is only 200 (one fold size).

import numpy as np

len(np.unique(results.val.index)), len(np.unique(results.test.index))

[44] 2020-06-20 15:23:39 (4.00ms) python3 (6.24s)

(800, 200)

Usually, duplicated samples in test data are averaged for ensembling. Results.mean() performs this mean reduction and returns a newly created Results instance.

reduced_results = results.mean()
for mode, result in reduced_results.items():
    print(mode, result.output.shape)

[45] 2020-06-20 15:23:39 (14.0ms) python3 (6.25s)

val (800, 1)
test (200, 1)

Compare these two results.

index = results.test.index
index_0 = index[0]
x = results.test.output[index == index_0]
print("-> mean:", np.mean(x))

index = reduced_results.test.index
x = reduced_results.test.output[index == index_0]

[46] 2020-06-20 15:23:39 (10.0ms) python3 (6.26s)

-> mean: 13.69367
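The mean reduction over duplicated indices can be written in a few lines of NumPy. The sketch below is an illustration under stated assumptions, not the code behind Results.mean(): the mean_by_index() helper and the toy arrays are hypothetical.

```python
import numpy as np


def mean_by_index(index, output):
    """Average the rows of output that share the same value in index."""
    unique_index, inverse = np.unique(index, return_inverse=True)
    inverse = inverse.ravel()  # group id of each row
    sums = np.zeros((len(unique_index),) + output.shape[1:], dtype=np.float64)
    np.add.at(sums, inverse, output)  # accumulate rows per group
    counts = np.bincount(inverse).reshape((-1,) + (1,) * (output.ndim - 1))
    return unique_index, (sums / counts).astype(output.dtype)


# Toy data: sample 7 was predicted three times, sample 3 twice.
index = np.array([7, 3, 7, 3, 7])
output = np.array([[1.0], [10.0], [2.0], [20.0], [3.0]], dtype=np.float32)

idx, mean = mean_by_index(index, output)
print(idx)   # [3 7]
print(mean)  # [[15.] [ 2.]]
```

np.add.at is used instead of fancy-index assignment because it accumulates correctly when the same group id appears more than once.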

For convenience, Client.load_results() has a reduction keyword argument.

results = client.load_results(run_ids, reduction='mean', verbose=False)

[47] 2020-06-20 15:23:39 (84.8ms) python3 (6.34s)

Results(['val', 'test'])

for mode, result in results.items():
    print(mode, result.output.shape)

[48] 2020-06-20 15:23:39 (6.00ms) python3 (6.35s)

val (800, 1)
test (200, 1)

The cross validation (CV) score can be calculated as follows:

true = results.val.target
pred = results.val.output
np.mean(np.sqrt((true - pred) ** 2))  # Mean absolute error; use any function for your metric.

[49] 2020-06-20 15:23:39 (4.00ms) python3 (6.35s)
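The expression above takes the square root element-wise before averaging, so it equals the mean absolute error (MAE), not the root mean squared error (RMSE). The sketch below contrasts the two on a toy example. The mae() and rmse() helpers and the toy arrays are assumptions for illustration only.

```python
import numpy as np


def mae(true, pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(true - pred)))


def rmse(true, pred):
    """Root mean squared error: average the squares first, then take the root."""
    return float(np.sqrt(np.mean((true - pred) ** 2)))


# Toy arrays shaped like the results above: (sample, 1).
true = np.array([[1.0], [2.0], [3.0], [4.0]])
pred = np.array([[1.0], [2.0], [3.0], [8.0]])

quickstart_score = float(np.mean(np.sqrt((true - pred) ** 2)))
print(quickstart_score, mae(true, pred), rmse(true, pred))  # 1.0 1.0 2.0
```

RMSE penalizes the single large error more heavily than MAE, so choose whichever matches how the model is to be judged.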


We also obtained predictions for the test data using the 4 MLP models.


[50] 2020-06-20 15:23:39 (5.00ms) python3 (6.36s)

array([[13.69367 ],
       [ 9.086484],
       [13.991818]], dtype=float32)


In this quickstart, we learned how to use the Ivory library to run a machine learning workflow. For more details, see the Tutorial.