sconce.data_feeds package
sconce.data_feeds.base module

class sconce.data_feeds.base.DataFeed(data_loader)
Bases: object
A thin wrapper around a DataLoader that automatically yields tuples of torch.Tensor (that live on cpu or on cuda). A DataFeed will iterate endlessly.
Like the underlying DataLoader, a DataFeed’s __next__ method yields two values, which we refer to as the inputs and the targets.
Parameters: data_loader (DataLoader) – the wrapped data_loader.
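The endless iteration described above can be illustrated with a minimal sketch. It is not sconce's implementation: the class name EndlessFeed is hypothetical, and a plain list of (inputs, targets) pairs stands in for a DataLoader so that the example runs without torch.

```python
from itertools import islice


class EndlessFeed:
    """Hypothetical stand-in for the endless-iteration behaviour of a feed."""

    def __init__(self, loader):
        # `loader` is any re-iterable object that yields (inputs, targets)
        # pairs, for example a list of tuples or a DataLoader.
        self.loader = loader
        self._iterator = iter(loader)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._iterator)
        except StopIteration:
            # The underlying loader is exhausted: start a new pass over it.
            self._iterator = iter(self.loader)
            return next(self._iterator)


# Two placeholder batches stand in for real tensors.
batches = [("inputs_0", "targets_0"), ("inputs_1", "targets_1")]
feed = EndlessFeed(batches)

# Asking for five batches wraps around the two available ones.
first_five = list(islice(feed, 5))
```

Because such a feed never raises StopIteration on its own, a training loop over it has to decide for itself how many batches to draw, as islice does here.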
batch_size
The wrapped data_loader’s batch_size.
cuda(device=None)
Put the inputs and targets (yielded by this DataFeed) on the specified device.
Parameters: device (int or bool or dict) – if int or bool, sets the behavior for both inputs and targets. To set them individually, pass a dictionary with keys {‘inputs’, ‘targets’} instead. See torch.Tensor.cuda() for details.
Example
>>> g = DataFeed.from_dataset(dataset, batch_size=100)
>>> g.cuda()
>>> g.next()
(Tensor containing:
 [torch.cuda.FloatTensor of size 100x1x28x28 (GPU 0)],
 Tensor containing:
 [torch.cuda.LongTensor of size 100 (GPU 0)])
>>> g.cuda(False)
>>> g.next()
(Tensor containing:
 [torch.FloatTensor of size 100x1x28x28],
 Tensor containing:
 [torch.LongTensor of size 100])
>>> g.cuda(device={'inputs':0, 'targets':1})
>>> g.next()
(Tensor containing:
 [torch.cuda.FloatTensor of size 100x1x28x28 (GPU 0)],
 Tensor containing:
 [torch.cuda.LongTensor of size 100 (GPU 1)])
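One way to read the device parameter is that a single int or bool is expanded into the same setting for inputs and targets, while a dictionary sets each one separately. The sketch below shows that expansion only. The helper name normalize_device is hypothetical, and both the rejection of unknown keys and the mapping of a missing key to None are assumptions, not documented sconce behavior.

```python
def normalize_device(device=None):
    """Expand one device setting into per-stream settings (hypothetical helper)."""
    if isinstance(device, dict):
        # Assumption: only the two documented keys are accepted.
        unknown = set(device) - {"inputs", "targets"}
        if unknown:
            raise ValueError(f"unexpected keys: {sorted(unknown)}")
        # Assumption: a key that is left out falls back to None.
        return {"inputs": device.get("inputs"), "targets": device.get("targets")}
    # An int, a bool, or None applies to inputs and targets alike.
    return {"inputs": device, "targets": device}


same_gpu = normalize_device(0)
on_cpu = normalize_device(False)
two_gpus = normalize_device({"inputs": 0, "targets": 1})
```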
classmethod from_dataset(dataset, split=None, validation_transform=None, **kwargs)
Create a DataFeed from an instantiated dataset.
Parameters:
- dataset (Dataset) – the PyTorch dataset.
- validation_transform (callable) – override the existing validation transform with this.
- split (float, optional) – If not None, it specifies the fraction of the dataset that should be placed into the first of two data_feeds. The remaining data is used for the second data_feed. Both data_feeds will be returned.
- **kwargs – passed directly to the DataLoader constructor.
split(split_factor, validation_transform=None, **kwargs)
Create a training and validation DataFeed from this one.
Parameters:
- split_factor (float) – [0.0, 1.0] the fraction of the dataset that should be put into the new training feed.
- validation_transform (callable) – override the existing validation transform with this.
- **kwargs – passed directly to the DataLoader constructor.
Returns: training_feed, validation_feed
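To make the role of split_factor concrete, here is a minimal sketch of dividing sample indices into a training part and a validation part. The function name split_indices, the seeded shuffle, and rounding the training count down are all assumptions made for illustration. The documentation above does not say whether sconce shuffles before splitting or how it rounds.

```python
import random


def split_indices(num_samples, split_factor, seed=0):
    """Divide range(num_samples) into training and validation index lists."""
    if not 0.0 <= split_factor <= 1.0:
        raise ValueError("split_factor must be in [0.0, 1.0]")
    indices = list(range(num_samples))
    # Assumption: shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Assumption: the training share is rounded down to a whole sample count.
    num_training = int(num_samples * split_factor)
    return indices[:num_training], indices[num_training:]


training, validation = split_indices(10, 0.8)
```

Every index lands in exactly one of the two lists, so the two resulting feeds would not share samples.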
sconce.data_feeds.image module
sconce.data_feeds.single_class_image module

class sconce.data_feeds.single_class_image.SingleClassImageFeed(data_loader)
Bases: sconce.data_feeds.image.ImageFeed
An ImageFeed class for use when each image belongs to exactly one class.
classmethod from_image_folder(root, loader_kwargs=None, **dataset_kwargs)
Create a DataFeed from a folder of images. See torchvision.datasets.ImageFolder.
Parameters:
- root (path) – the root directory path.
- loader_kwargs (dict) – keyword args provided to the DataLoader constructor.
- **dataset_kwargs – keyword args provided to the torchvision.datasets.ImageFolder constructor.
classmethod from_torchvision(batch_size=500, data_location=None, dataset_class=<class 'torchvision.datasets.mnist.MNIST'>, fraction=1.0, num_workers=0, pin_memory=True, shuffle=True, train=True, transform=ToTensor())
Create a DataFeed from a torchvision dataset class.
Parameters:
- batch_size (int) – how large the yielded inputs and targets should be. See DataLoader for details.
- data_location (path) – where the downloaded dataset should be stored. If None, a system-dependent temporary location will be used.
- dataset_class (class) – a torchvision dataset class that supports constructor arguments {‘root’, ‘train’, ‘download’, ‘transform’}. For example, MNIST, FashionMNIST, CIFAR10, or CIFAR100.
- fraction (float) – (0.0, 1.0] how much of the original dataset’s data to use.
- num_workers (int) – how many subprocesses to use for data loading. See DataLoader for details.
- pin_memory (bool) – if True, the data loader will copy tensors into CUDA pinned memory before returning them. See DataLoader for details.
- shuffle (bool) – set to True to have the data reshuffled at every epoch. See DataLoader for details.
- train (bool) – if True, creates dataset from training set, otherwise creates from test set.
- transform (callable) – a function/transform that takes in a PIL image and returns a transformed version.
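The fraction parameter takes values in a half-open interval: 1.0 keeps the whole dataset, while 0.0 is excluded because it would leave nothing to yield. The sketch below shows one plausible way to turn a fraction into a sample count. The function name subset_size, the rounding down, and the floor of one sample are assumptions, not documented sconce behavior.

```python
def subset_size(num_samples, fraction=1.0):
    """Return how many samples a given fraction would keep (hypothetical helper)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0.0, 1.0]")
    # Assumption: round down, but keep at least one sample so that a very
    # small fraction never produces an empty feed.
    return max(1, int(num_samples * fraction))


# 60000 matches the size of the MNIST training set.
kept = subset_size(60000, 0.1)
```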
-
classmethod