This module provides functions for downloading several useful datasets that you may want to use in your models.
from fastai.gen_doc.nbdoc import *
from fastai.datasets import *
from fastai.datasets import Config
from pathlib import Path
show_doc(URLs)
class URLs
URLs()
Global constants for dataset and model URLs.
This contains all the datasets' and models' URLs, and some classmethods to help use them - you don't create objects of this class. The supported datasets are (with their calling name): S3_NLP, S3_COCO, MNIST_SAMPLE, MNIST_TINY, IMDB_SAMPLE, ADULT_SAMPLE, ML_SAMPLE, PLANET_SAMPLE, CIFAR, PETS, MNIST. To get details on the datasets you can see the fast.ai datasets webpage. Datasets with SAMPLE in their name are subsets of the original datasets. In the case of MNIST, we also have a TINY dataset which is even smaller than MNIST_SAMPLE.
Models are currently limited to WT103, but you can expect more in the future!
URLs.MNIST_SAMPLE
'http://files.fast.ai/data/examples/mnist_sample'
For the rest of the datasets you will need to download them with untar_data or download_data. untar_data will download the data file and decompress it, while download_data will just download and save the compressed file in .tgz format.
By default, data will be downloaded to the ~/.fastai/data folder.
Configure the default data_path by editing ~/.fastai/config.yml.
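The difference between untar_data and download_data described above comes down to whether the archive is unpacked after it is fetched. The sketch below illustrates only that extraction step with the standard library, using a locally built archive in place of a real download. `extract_archive` is a hypothetical helper, not part of fastai, and the single-top-level-folder layout is an assumption.

```python
import tarfile
import tempfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> Path:
    """Unpack a .tgz archive into dest and return the extracted folder.

    Hypothetical helper: assumes the archive holds a single top-level
    folder named after the archive (e.g. sample.tgz -> sample/).
    """
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return dest / archive.name[: -len(".tgz")]


# Build a stand-in for a downloaded archive so no network access is needed.
work = Path(tempfile.mkdtemp())
src = work / "src" / "sample"
src.mkdir(parents=True)
(src / "labels.csv").write_text("name,label\nimg_0,3\n")

archive = work / "sample.tgz"
with tarfile.open(archive, "w:gz") as tar:
    tar.add(src, arcname="sample")

# Keeping only `archive` corresponds to the compressed-file result;
# extracting it corresponds to the decompressed-folder result.
folder = extract_archive(archive, work / "data")
print(archive.name, folder.name, sorted(p.name for p in folder.iterdir()))
```

In this sketch the compressed file is kept next to the extracted folder, so a later call could skip the fetch if the archive is already present.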
show_doc(untar_data)
untar_data(URLs.PLANET_SAMPLE)
PosixPath('/home/ubuntu/.fastai/data/planet_sample')
show_doc(download_data)
download_data
download_data(url:str, fname:PathOrStr=None, data:bool=True, ext:str='.tgz') → Path
Download url to destination fname.
Note: If the data file already exists in a data directory alongside the notebook, that data file will be used instead of ~/.fastai/data. Paths are resolved by calling the function datapath4file, which checks whether the data exists locally (data/) first, before downloading to the ~/.fastai/data home directory.
Example:
download_data(URLs.PLANET_SAMPLE)
PosixPath('/home/ubuntu/.fastai/data/planet_sample.tgz')
show_doc(datapath4file)
datapath4file
datapath4file(filename, ext:str='.tgz')
Return data path to filename, checking locally first then in the config file.
All the downloading functions use this to decide where to put the .tgz file and the expanded folder. If filename already exists in a data directory in the same place as the calling notebook/script, that directory is used as the parent. Otherwise, ~/.fastai/config.yml is read to see what path to use, which defaults to ~/.fastai/data. To override this default, simply modify the value in your ~/.fastai/config.yml:
data_path: ~/.fastai/data
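The lookup order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not fastai's implementation: `read_data_path` and `resolve_data_path` are hypothetical names, every directory is passed in explicitly, and the config is parsed by hand as a flat `key: value` file rather than with a YAML library.

```python
import tempfile
from pathlib import Path


def read_data_path(config_file: Path, default: Path) -> Path:
    """Return the `data_path` entry of a flat `key: value` file, else default."""
    if config_file.exists():
        for line in config_file.read_text().splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() == "data_path":
                return Path(value.strip()).expanduser()
    return default


def resolve_data_path(filename: str, local_root: Path, config_file: Path,
                      default: Path, ext: str = ".tgz") -> Path:
    """Prefer an existing local data/ entry, else fall back to the config."""
    local = local_root / "data" / filename
    if local.exists() or local.with_name(local.name + ext).exists():
        return local
    return read_data_path(config_file, default) / filename


# Temporary stand-ins for the notebook folder and the config file.
root = Path(tempfile.mkdtemp())
config = root / "config.yml"
config.write_text("data_path: " + str(root / "shared") + "\n")

# Nothing exists locally yet, so the configured directory is used.
from_config = resolve_data_path("planet_sample", root, config, root / "fallback")
print(from_config)

# Once a local data/planet_sample folder exists, it takes precedence.
(root / "data" / "planet_sample").mkdir(parents=True)
from_local = resolve_data_path("planet_sample", root, config, root / "fallback")
print(from_local)
```

Checking the local directory first means a project can ship its own copy of a dataset without touching the shared location.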
show_doc(url2path)
url2path
url2path(url, data=True, ext:str='.tgz')
Change url to a path.
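Judging from the example outputs earlier on this page, where the planet sample URL ends up as planet_sample.tgz under the data directory, the conversion can be pictured as taking the last URL segment and appending the extension. The sketch below assumes exactly that and nothing more. `url_to_path` and its `data_dir` parameter are hypothetical, and the sketch does not reproduce the `data` flag of url2path.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def url_to_path(url: str, data_dir: Path, ext: str = ".tgz") -> Path:
    """Map a URL to data_dir/<last URL segment><ext> (hypothetical helper)."""
    name = PurePosixPath(urlparse(url).path).name
    return data_dir / f"{name}{ext}"


target = url_to_path("http://files.fast.ai/data/examples/mnist_sample",
                     Path.home() / ".fastai" / "data")
print(target.name)  # mnist_sample.tgz
```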
show_doc(Config)
You probably won't need to use this yourself - it's used by URLs.datapath4file.
show_doc(Config.get_path)
get_path
get_path(path)
Get the path in the config file.
Get the key corresponding to path in the Config.
show_doc(Config.data_path)
data_path
data_path()
Get the path to data in the config file.
Get the Path where the data is stored.
show_doc(Config.model_path)
model_path
model_path()
Get the path to fastai pretrained models in the config file.
show_doc(Config.create)
create
create(fpath)
Creates a Config from fpath.
show_doc(url2name)
url2name
url2name(url)
show_doc(Config.get_key)
get_key
get_key(key)
Get the path to key in the config file.
show_doc(Config.get)