IVORY.CORE.DATA
Ivory uses four classes for data presentation: Data, Dataset, Datasets, and DataLoaders.
In most cases, you only need to define a subclass of Data and use the original Dataset and Datasets as they are. An example parameter YAML file is:
datasets:
  data:
    class: your.Data  # a subclass of ivory.core.data.Data
  dataset:
  fold: 0
If necessary, you can also define your own Dataset and/or Datasets:
datasets:
  class: your.Datasets  # a subclass of ivory.core.data.Datasets
  data:
    class: your.Data  # a subclass of ivory.core.data.Data
  dataset:
    def: your.Dataset  # a subclass of ivory.core.data.Dataset
  fold: 0
DataLoaders is used internally by the ivory.torch.trainer.Trainer and ivory.nnabla.trainer.Trainer classes to yield minibatches in the training loop.
Note that the dataset entry uses a 'def' key instead of 'class'.
See Tutorial
ivory.core.data.Data()
Base class to provide data to a Dataset instance.
To make a subclass, assign the following attributes in Data.init():
index: Index of samples.
input: Input data.
target: Target data.
fold: Fold number.
Methods:
get(index) (tuple): Returns a tuple of (index, input, target) according to the index.
get_index(mode, fold) (ndarray): Returns index according to the mode and fold.
get_input(index): Returns input data.
get_target(index): Returns target data.
init(): Initializes index, input, target, and fold attributes.
init()
Initializes index, input, target, and fold attributes.
The fold number of test data must be -1.
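For regression:
import numpy as np

def init(self):
    self.index = np.arange(100)  # sample indices 0..99
    self.input = np.random.randn(100, 5)  # 100 samples, 5 features
    self.target = np.random.randn(100)  # continuous targets
    self.fold = np.random.randint(5, size=100)  # one fold number (0..4) per sample
    self.fold[80:] = -1  # the last 20 samples are test data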
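For classification:
import numpy as np

def init(self):
    self.index = np.arange(100)  # sample indices 0..99
    self.input = np.random.randn(100, 5)  # 100 samples, 5 features
    self.target = np.random.randint(10, size=100)  # class labels 0..9
    self.fold = np.random.randint(5, size=100)  # one fold number (0..4) per sample
    self.fold[80:] = -1  # the last 20 samples are test data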
get_index(mode, fold) → ndarray
Returns index according to the mode and fold.
mode (str): Mode name, one of 'train', 'val', or 'test'.
fold (int): Fold number.
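One plausible reading of the mode and fold arguments is ordinary K-fold selection: 'val' picks the samples whose fold number equals fold, 'train' picks the remaining non-test samples, and 'test' picks the samples marked with -1. The sketch below illustrates that reading with a standalone function. select_index is a hypothetical helper written for this example, not part of Ivory, and the actual Data.get_index may behave differently.

```python
import numpy as np


def select_index(fold_array, mode, fold):
    """Hypothetical stand-in for Data.get_index (assumed K-fold semantics)."""
    index = np.arange(len(fold_array))
    if mode == "train":
        # Everything that is neither the validation fold nor test data.
        return index[(fold_array != fold) & (fold_array != -1)]
    if mode == "val":
        # Samples assigned to the requested fold.
        return index[fold_array == fold]
    if mode == "test":
        # Test samples are marked with fold number -1.
        return index[fold_array == -1]
    raise ValueError(f"unknown mode: {mode!r}")


folds = np.array([0, 1, 2, 0, 1, 2, -1, -1])
train_idx = select_index(folds, "train", 0)
val_idx = select_index(folds, "val", 0)
test_idx = select_index(folds, "test", 0)
```

Under this assumption the three modes partition the samples for any given fold, which is why the test samples need a fold number that no validation fold can match.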
get_input(index)
Returns input data.
By default, this function returns self.input[index]. You can override this behavior in a subclass.
index (int or 1D-array): Index.
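As an illustration of overriding the default, the sketch below rescales inputs when they are requested. BaseData is a minimal stand-in written for this example (it only mirrors the default self.input[index] behavior described above, and calling init() from the constructor is an assumption). ScaledData is a hypothetical subclass. With Ivory installed you would subclass ivory.core.data.Data instead.

```python
import numpy as np


class BaseData:
    """Stand-in for ivory.core.data.Data, reduced to the default getters."""

    def __init__(self):
        self.init()  # assumed: attributes are set up at construction time

    def init(self):
        pass

    def get_input(self, index):
        return self.input[index]

    def get_target(self, index):
        return self.target[index]


class ScaledData(BaseData):
    """Hypothetical subclass that rescales inputs on access."""

    def init(self):
        self.index = np.arange(4)
        self.input = np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0], [6.0, 40.0]])
        self.target = np.array([0, 1, 0, 1])
        self.fold = np.array([0, 1, 0, -1])

    def get_input(self, index):
        # Override: divide each column by its maximum instead of returning raw rows.
        return self.input[index] / self.input.max(axis=0)


data = ScaledData()
single = data.get_input(1)       # int index: one row
batch = data.get_input([0, 3])   # 1D index: several rows
```

The same pattern applies to get_target, for example to encode labels lazily rather than storing the encoded array.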
get_target(index)
Returns target data.
By default, this function returns self.target[index]. You can override this behavior in a subclass.
index (int or 1D-array): Index.
get(index) → tuple
Returns a tuple of (index, input, target) according to the index.
index (int or 1D-array): Index.
ivory.core.data.Dataset(data, mode, fold, transform=None)
Dataset class represents a set of data for a mode and fold.
Parameters:
data (Data): Data instance that provides data to Dataset instance.
mode (str): Mode name, one of 'train', 'val', or 'test'.
fold (int): Fold number.
transform (callable, optional): Callable to transform the data.
The transform must take 2 or 3 arguments, (mode, input, optional target), and return a tuple of (input, optional target).
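To make the calling convention concrete, here is a minimal sketch of a transform in its three-argument form. add_noise_in_train is a hypothetical function, and perturbing inputs only in 'train' mode is an assumption chosen for illustration, not something Ivory prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the example is reproducible


def add_noise_in_train(mode, input, target):
    """Hypothetical transform: (mode, input, target) -> (input, target)."""
    if mode == "train":
        # Augment only during training; 'val' and 'test' pass through unchanged.
        input = input + rng.normal(scale=0.01, size=np.shape(input))
    return input, target


x = np.zeros((2, 3))
y = np.array([1, 0])
train_x, train_y = add_noise_in_train("train", x, y)
val_x, val_y = add_noise_in_train("val", x, y)
```

Receiving mode as the first argument lets a single callable serve all three datasets while keeping evaluation data deterministic.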
Attributes:
data (Data): Data instance that provides data to Dataset instance.
fold (int): Fold number.
mode (str): Mode name, one of 'train', 'val', or 'test'.
transform (callable, optional): Callable to transform the data.
init()
Called at initialization. You can add any process in a subclass.
get(index=None) → tuple
Returns a tuple of (index, input, target) according to the index.
If index is None, returns all of the data.
index (int or 1D-array, optional): Index.
sample(n=0, frac=0.0) → tuple
Returns a tuple of (index, input, target) randomly sampled.
n (int, optional): Size of sampling.
frac (float, optional): Ratio of sampling.
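The two arguments can be read as an absolute sample count (n) and a fraction of the dataset (frac). The sketch below shows one way such sampling could work on plain arrays. It is a hypothetical stand-in, not Ivory's implementation: the precedence of n over frac, sampling without replacement, and the seed parameter are all assumptions.

```python
import numpy as np


def sample(index, input, target, n=0, frac=0.0, seed=None):
    """Hypothetical stand-in for Dataset.sample on plain arrays."""
    size = len(index)
    if n <= 0:
        # Assumed: fall back to the fraction when no count is given.
        n = int(size * frac)
    n = min(n, size)
    rng = np.random.default_rng(seed)
    # Draw positions without replacement so no sample appears twice.
    chosen = rng.choice(size, size=n, replace=False)
    return index[chosen], input[chosen], target[chosen]


index = np.arange(10)
input = np.arange(20.0).reshape(10, 2)
target = np.arange(10) % 2

idx_n, x_n, y_n = sample(index, input, target, n=3, seed=0)
idx_f, x_f, y_f = sample(index, input, target, frac=0.5, seed=0)
```

Returning the index alongside the input and target keeps each sampled row traceable back to the original data.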
ivory.core.data.Datasets(data, dataset, fold)
The Datasets class represents a collection of Dataset instances for a fold.
Parameters:
data (Data): Data instance that provides data to Dataset instance.
dataset (callable): Dataset factory.
fold (int): Fold number.
Attributes:
data (Data): Data instance that provides data to Dataset instance.
dataset (callable): Dataset factory.
fold (int): Fold number.
test (Dataset): Test dataset.
train (Dataset): Train dataset.
val (Dataset): Validation dataset.
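The relationship between the dataset factory and the train, val, and test attributes can be sketched as follows. ToyDataset and ToyDatasets are hypothetical stand-ins for this example only. The assumption is that the factory is called once per mode with (data, mode, fold), matching the Dataset signature above. Ivory's own classes may do more.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToyDataset:
    """Stand-in for a Dataset: remembers which mode and fold it was built for."""

    data: Any
    mode: str
    fold: int


class ToyDatasets:
    """Stand-in for Datasets: builds one dataset per mode from a factory."""

    def __init__(self, data: Any, dataset: Callable[..., Any], fold: int):
        self.data = data
        self.dataset = dataset
        self.fold = fold
        # Assumed: the factory is invoked once for each mode.
        for mode in ("train", "val", "test"):
            setattr(self, mode, dataset(data, mode, fold))


# The class itself is passed as the factory, not an instance of it.
datasets = ToyDatasets(data=None, dataset=ToyDataset, fold=2)
```

Passing a callable rather than an instance is what allows the same Dataset definition to be instantiated three times with different modes.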
ivory.core.data.DataLoaders(datasets, batch_size, shuffle)
The DataLoaders class represents a collection of DataLoader instances.
Parameters:
datasets (Datasets): Datasets instance.
batch_size (int): Batch size.
shuffle (bool): If True, the train dataset is shuffled. The validation and test datasets are not shuffled regardless of this value.
Attributes:
test (Dataset): Test dataset.
train (Dataset): Train dataset.
val (Dataset): Validation dataset.
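The shuffle rule described above (only the train dataset is ever shuffled) can be illustrated with a small framework-free generator. iter_minibatches is a hypothetical helper, and its seed parameter is added here for reproducibility. The real DataLoaders delegates batching to the underlying framework's loaders.

```python
import numpy as np


def iter_minibatches(index, batch_size, mode, shuffle, seed=None):
    """Hypothetical generator: yields index batches, shuffling only in 'train' mode."""
    index = np.asarray(index)
    if shuffle and mode == "train":
        # Validation and test order is preserved regardless of the shuffle flag.
        index = np.random.default_rng(seed).permutation(index)
    for start in range(0, len(index), batch_size):
        yield index[start:start + batch_size]


index = np.arange(10)
val_batches = list(iter_minibatches(index, 4, "val", shuffle=True))
train_batches = list(iter_minibatches(index, 4, "train", shuffle=True, seed=0))
```

Keeping validation and test batches in a fixed order makes predictions easy to align with the original sample index.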