This module defines the basic DataBunch object that is used inside Learner to train a model. It is the generic class and can take any kind of fastai Dataset or DataLoader. The data module of every application provides helper functions that create this DataBunch for you directly.
from fastai.gen_doc.nbdoc import *
from fastai.basic_data import *
show_doc(DataBunch, doc_string=False)
class DataBunch[source]
DataBunch(train_dl:DataLoader,valid_dl:DataLoader,test_dl:Optional[DataLoader]=None,device:device=None,tfms:Optional[Collection[Callable]]=None,path:PathOrStr='.',collate_fn:Callable='data_collate')
Bind together a train_dl, a valid_dl and optionally a test_dl, ensure they are on device, and apply tfms to them as batches are drawn. path is used internally to store temporary files. collate_fn is passed to the PyTorch DataLoader (replacing the one there) to explain how to collate the samples picked for a batch. By default, it grabs the data attribute of the objects sent (see vision.image for why this can be important).
An example of tfms to pass is normalization. train_dl, valid_dl and optionally test_dl will be wrapped in DeviceDataLoader.
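To make the idea of transforms applied "as batches are drawn" concrete, here is a minimal pure-Python sketch. It is not fastai's implementation: BatchTfmLoader and make_normalize are hypothetical names, batches are plain (xs, ys) tuples of lists rather than tensors, and the device placement step is left out.

```python
def make_normalize(mu, sigma):
    """Return a batch transform that standardizes the inputs of an (xs, ys) batch."""
    def normalize(batch):
        xs, ys = batch
        return [(x - mu) / sigma for x in xs], ys
    return normalize


class BatchTfmLoader:
    """Hypothetical stand-in: wraps an iterable of batches and applies tfms lazily."""

    def __init__(self, dl, tfms=None):
        self.dl = dl
        self.tfms = list(tfms or [])

    def __len__(self):
        return len(self.dl)

    def __iter__(self):
        # Each transform runs only when a batch is actually requested
        for batch in self.dl:
            for tfm in self.tfms:
                batch = tfm(batch)
            yield batch


# Two toy batches of (inputs, labels); mu and sigma are made-up statistics
raw_batches = [([1.0, 3.0], [0, 1]), ([5.0, 7.0], [1, 0])]
loader = BatchTfmLoader(raw_batches, tfms=[make_normalize(4.0, 2.0)])
normalized = list(loader)
```

Because the transform runs inside `__iter__`, the wrapped batches are never modified in place, and the same underlying loader can be wrapped with different transform lists.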
show_doc(DataBunch.create, doc_string=False)
Create a DataBunch from train_ds, valid_ds and optionally test_ds, with batch size bs and num_workers worker processes. tfms and device are passed to the init method.
show_doc(DataBunch.dl)
dl[source]
dl(ds_type:DatasetType=<DatasetType.Valid: 2>) →DeviceDataLoader
Returns the appropriate DeviceDataLoader for validation, training, or test (ds_type).
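The selection by ds_type can be pictured with a small enum-based dispatch. This is a sketch under assumptions, not the library code: only the Valid member and its value 2 appear in the signature above, so the Train and Test values are guesses, and TinyBunch is a hypothetical class that stores plain strings in place of dataloaders.

```python
from enum import Enum


class DatasetType(Enum):
    # Valid = 2 matches the default shown in the signature;
    # the other two values are assumptions for illustration
    Train = 1
    Valid = 2
    Test = 3


class TinyBunch:
    """Hypothetical container holding one loader per split."""

    def __init__(self, train_dl, valid_dl, test_dl=None):
        self.train_dl, self.valid_dl, self.test_dl = train_dl, valid_dl, test_dl

    def dl(self, ds_type=DatasetType.Valid):
        # Fall back to the validation loader, mirroring the default argument
        if ds_type == DatasetType.Train:
            return self.train_dl
        if ds_type == DatasetType.Test:
            return self.test_dl
        return self.valid_dl


bunch = TinyBunch("train loader", "valid loader", "test loader")
default_choice = bunch.dl()
train_choice = bunch.dl(DatasetType.Train)
```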
show_doc(DataBunch.add_tfm)
add_tfm[source]
add_tfm(tfm:Callable)
Adds a transform to all dataloaders.
show_doc(DeviceDataLoader, doc_string=False)
class DeviceDataLoader[source]
DeviceDataLoader(dl:DataLoader,device:device,tfms:List[Callable]=None,collate_fn:Callable='data_collate')
Put the batches of dl on device after applying an optional list of tfms. collate_fn will replace the one set on dl. All dataloaders of a DataBunch are of this type.
show_doc(DeviceDataLoader.create, doc_string=False)
Create a DeviceDataLoader on device from a dataset with batch size bs, num_workers processes and a given collate_fn. The dataloader will shuffle the data if that flag is set to True, and tfms are passed to the init method. All kwargs are passed to the PyTorch DataLoader class initialization.
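The roles of bs, shuffle and collate_fn can be illustrated without PyTorch. The sketch below is a simplified single-process generator, not how DataLoader is implemented; iter_batches, collate_pairs and the seed argument are hypothetical names introduced only for this example.

```python
import random


def iter_batches(dataset, bs=2, shuffle=False, collate_fn=list, seed=None):
    """Yield collated batches of at most bs samples from an indexable dataset."""
    idxs = list(range(len(dataset)))
    if shuffle:
        # A seeded generator keeps the illustration reproducible
        random.Random(seed).shuffle(idxs)
    for start in range(0, len(idxs), bs):
        samples = [dataset[i] for i in idxs[start:start + bs]]
        yield collate_fn(samples)


def collate_pairs(samples):
    """Turn a list of (x, y) samples into one (xs, ys) batch."""
    xs, ys = zip(*samples)
    return list(xs), list(ys)


dataset = [(0.0, "a"), (1.0, "b"), (2.0, "a"), (3.0, "b"), (4.0, "a")]
batches = list(iter_batches(dataset, bs=2, collate_fn=collate_pairs))
shuffled = list(iter_batches(dataset, bs=2, shuffle=True,
                             collate_fn=collate_pairs, seed=0))
```

With five samples and bs=2 the last batch holds a single sample; shuffling changes the order in which samples are visited, not the number or size of the batches.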
show_doc(DeviceDataLoader.one_batch)
show_doc(DeviceDataLoader.add_tfm)
add_tfm[source]
add_tfm(tfm:Callable)
Add a transform (i.e. same as self.tfms.append(tfm)).
show_doc(DeviceDataLoader.remove_tfm)
remove_tfm[source]
remove_tfm(tfm:Callable)
Remove a transform.
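Since add_tfm is described as a plain list append, one practical consequence is worth sketching: a transform can only be removed through the same callable object that was added. The TfmHolder class below is a hypothetical stand-in, and it assumes remove_tfm is a list removal that ignores transforms that were never added.

```python
class TfmHolder:
    """Hypothetical stand-in for an object that keeps a list of batch transforms."""

    def __init__(self, tfms=None):
        self.tfms = list(tfms or [])

    def add_tfm(self, tfm):
        self.tfms.append(tfm)

    def remove_tfm(self, tfm):
        # Assumption: silently ignore transforms that were never added
        if tfm in self.tfms:
            self.tfms.remove(tfm)


def double(batch):
    return [2 * x for x in batch]


holder = TfmHolder()
holder.add_tfm(double)
holder.add_tfm(lambda b: [x + 1 for x in b])

# A textually identical lambda is a different object, so nothing is removed
holder.remove_tfm(lambda b: [x + 1 for x in b])
count_after_lambda = len(holder.tfms)

# The named function is the same object that was added, so it is removed
holder.remove_tfm(double)
count_after_named = len(holder.tfms)
```

Keeping a reference to every transform you may want to remove later (a named function or a variable holding the lambda) avoids this pitfall.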
show_doc(DatasetBase, title_level=3)
show_doc(LabelDataset, title_level=3)
class LabelDataset[source]
LabelDataset(classes:Collection,class2idx:Dict[Any,int]=None) ::DatasetBase
Base class for fastai datasets that do classification, mapped according to classes.
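The signature suggests that class2idx maps each class to an integer index. A minimal sketch of such a mapping follows, assuming (this is not confirmed by the text above) that indices follow the order of classes when class2idx is not supplied; the class names and labels are made up.

```python
classes = ["cat", "dog", "bird"]

# Assumed default: each class gets its position in the classes collection
class2idx = {c: i for i, c in enumerate(classes)}

# Encoding raw labels as integer targets, and decoding them back
labels = ["dog", "bird", "dog"]
encoded = [class2idx[label] for label in labels]
decoded = [classes[i] for i in encoded]
```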
show_doc(SingleClassificationDataset, title_level=3)
class SingleClassificationDataset[source]
SingleClassificationDataset(classes:StrList) ::DatasetBase
A Dataset that contains no data, only classes, mainly used for inference with set_item.
show_doc(DeviceDataLoader.proc_batch)
show_doc(DeviceDataLoader.collate_fn)