This module contains all the basic functions we need in other modules of the fastai library (split from torch_core, which contains the ones requiring PyTorch). Its documentation can easily be skipped on a first read, unless you want to know what a given function does.
from fastai.gen_doc.nbdoc import *
from fastai.core import *
default_cpus = min(16, num_cpus())
show_doc(has_arg)
has_arg(func,arg) →bool
Check if func accepts arg.
Examples for two fastai.core functions, download_url and index_row:
has_arg(download_url,'url')
True
has_arg(index_row,'x')
False
has_arg(index_row,'a')
True
show_doc(ifnone)
param,alt_param = None,5
ifnone(param,alt_param)
5
param,alt_param = None,[1,2,3]
ifnone(param,alt_param)
[1, 2, 3]
show_doc(is1d)
two_d_array = np.arange(12).reshape(6,2)
print( two_d_array )
print( is1d(two_d_array) )
[[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]]
False
is1d(two_d_array.flatten())
True
show_doc(is_listy)
Check if x is a Collection. A Tuple or a List qualifies.
some_data = [1,2,3]
is_listy(some_data)
True
some_data = (1,2,3)
is_listy(some_data)
True
some_data = 1024
print( is_listy(some_data) )
False
print( is_listy( [some_data] ) )
True
some_data = dict([('a',1),('b',2),('c',3)])
print( some_data )
print( some_data.keys() )
{'a': 1, 'b': 2, 'c': 3}
dict_keys(['a', 'b', 'c'])
print( is_listy(some_data) )
print( is_listy(some_data.keys()) )
False
False
print( is_listy(list(some_data.keys())) )
True
show_doc(is_tuple)
Check if x is a tuple.
print( is_tuple( [1,2,3] ) )
False
print( is_tuple( (1,2,3) ) )
True
show_doc(arange_of)
arange_of(x)
Same as range_of but returns an array.
arange_of([5,6,7])
array([0, 1, 2])
type(arange_of([5,6,7]))
numpy.ndarray
show_doc(array)
array(a,dtype:type=None,**kwargs) →ndarray
Tests found for array:
Some other tests where array is used:
pytest -sv tests/test_core.py::test_arrays_split
pytest -sv tests/test_core.py::test_even_mults
pytest -sv tests/test_core.py::test_idx_dict
pytest -sv tests/test_core.py::test_is1d
pytest -sv tests/test_core.py::test_itembase_eq
pytest -sv tests/test_core.py::test_itembase_hash
pytest -sv tests/test_core.py::test_one_hot
pytest -sv tests/test_torch_core.py::test_model_type
pytest -sv tests/test_torch_core.py::test_tensor_array_monkey_patch
pytest -sv tests/test_torch_core.py::test_tensor_with_ndarray
pytest -sv tests/test_torch_core.py::test_to_detach
Same as np.array but also handles generators. kwargs are passed to np.array with dtype.
array([1,2,3])
array([1, 2, 3])
Note that once we consume items from the generator, it does not reset. So the array call below has five fewer entries than it would if we had run it from the start of the generator.
def data_gen():
i = 100.01
while i<200:
yield i
i += 1.
ex_data_gen = data_gen()
for _ in range(5):
print(next(ex_data_gen))
100.01
101.01
102.01
103.01
104.01
array(ex_data_gen)
array([105.01, 106.01, 107.01, 108.01, ..., 196.01, 197.01, 198.01, 199.01])
ex_data_gen_int = data_gen()
array(ex_data_gen_int,dtype=int) #Cast output to int array
array([100, 101, 102, 103, ..., 196, 197, 198, 199])
show_doc(arrays_split)
data_a = np.arange(15)
data_b = np.arange(15)[::-1]
mask_a = (data_a > 10)
print(data_a)
print(data_b)
print(mask_a)
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
[14 13 12 11 10  9  8  7  6  5  4  3  2  1  0]
[False False False False False False False False False False False  True  True  True  True]
arrays_split(mask_a,data_a)
[(array([11, 12, 13, 14]),), (array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]),)]
np.vstack([data_a,data_b]).transpose().shape
(15, 2)
arrays_split(mask_a,np.vstack([data_a,data_b]).transpose()) #must match on dimension 0
[(array([[11, 3],
[12, 2],
[13, 1],
[14, 0]]),), (array([[ 0, 14],
[ 1, 13],
[ 2, 12],
[ 3, 11],
[ 4, 10],
[ 5, 9],
[ 6, 8],
[ 7, 7],
[ 8, 6],
[ 9, 5],
[10, 4]]),)]
show_doc(chunks)
You can transform a Collection into an Iterable of n-sized chunks by calling chunks:
data = [0,1,2,3,4,5,6,7,8,9]
for chunk in chunks(data, 2):
print(chunk)
[0, 1]
[2, 3]
[4, 5]
[6, 7]
[8, 9]
for chunk in chunks(data, 3):
print(chunk)
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9]
show_doc(df_names_to_idx)
ex_df = pd.DataFrame.from_dict({"a":[1,1,1],"b":[2,2,2]})
print(ex_df)
   a  b
0  1  2
1  1  2
2  1  2
df_names_to_idx('b',ex_df)
[1]
show_doc(extract_kwargs)
extract_kwargs(names:StrList,kwargs:KWArgs)
Extract the keys in names from the kwargs.
key_word_args = {"a":2,"some_list":[1,2,3],"param":'mean'}
key_word_args
{'a': 2, 'some_list': [1, 2, 3], 'param': 'mean'}
(extracted_val,remainder) = extract_kwargs(['param'],key_word_args)
print( extracted_val,remainder )
{'param': 'mean'} {'a': 2, 'some_list': [1, 2, 3]}
show_doc(idx_dict)
idx_dict(['a','b','c'])
{'a': 0, 'b': 1, 'c': 2}
show_doc(index_row)
index_row(a:Union[Collection[T_co],DataFrame,Series],idxs:Collection[int]) →Any
Return the slice of a corresponding to idxs.
a is basically anything you can index into, like a DataFrame, an array, or a list.
data = [0,1,2,3,4,5,6,7,8,9]
index_row(data,4)
4
index_row(pd.Series(data),7)
7
data_df = pd.DataFrame([data[::-1],data]).transpose()
data_df
|   | 0 | 1 |
|---|---|---|
| 0 | 9 | 0 |
| 1 | 8 | 1 |
| 2 | 7 | 2 |
| 3 | 6 | 3 |
| 4 | 5 | 4 |
| 5 | 4 | 5 |
| 6 | 3 | 6 |
| 7 | 2 | 7 |
| 8 | 1 | 8 |
| 9 | 0 | 9 |
index_row(data_df,7)
0    2
1    7
Name: 7, dtype: int64
show_doc(listify)
to_match = np.arange(12)
listify('a',to_match)
['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a']
listify('a',5)
['a', 'a', 'a', 'a', 'a']
listify(77.1,3)
[77.1, 77.1, 77.1]
listify( (1,2,3) )
[1, 2, 3]
listify((1,2,3),('a','b','c'))
[1, 2, 3]
show_doc(random_split)
Splitting is done here with random.uniform(), so you may not get the exact split percentage for small datasets.
data = np.arange(20).reshape(10,2)
data.tolist()
[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11], [12, 13], [14, 15], [16, 17], [18, 19]]
random_split(0.20,data.tolist())
[(array([[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11],
[12, 13],
[14, 15],
[16, 17],
[18, 19]]),), (array([], shape=(0, 2), dtype=int64),)]
random_split(0.20,pd.DataFrame(data))
[(array([[ 0, 1],
[ 4, 5],
[ 8, 9],
[10, 11],
[16, 17],
[18, 19]]),), (array([[ 2, 3],
[ 6, 7],
[12, 13],
[14, 15]]),)]
show_doc(range_of)
range_of(x)
Create a range from 0 to len(x).
range_of([5,4,3])
[0, 1, 2]
range_of(np.arange(10)[::-1])
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
show_doc(series2cat)
data_df = pd.DataFrame.from_dict({"a":[1,1,1,2,2,2],"b":['f','e','f','g','g','g']})
data_df
|   | a | b |
|---|---|---|
| 0 | 1 | f |
| 1 | 1 | e |
| 2 | 1 | f |
| 3 | 2 | g |
| 4 | 2 | g |
| 5 | 2 | g |
data_df['b']
0    f
1    e
2    f
3    g
4    g
5    g
Name: b, dtype: object
series2cat(data_df,'b')
data_df['b']
0    f
1    e
2    f
3    g
4    g
5    g
Name: b, dtype: category
Categories (3, object): [e < f < g]
series2cat(data_df,'a')
data_df['a']
0    1
1    1
2    1
3    2
4    2
5    2
Name: a, dtype: category
Categories (2, int64): [1 < 2]
show_doc(split_kwargs_by_func)
split_kwargs_by_func(kwargs,func)
Split kwargs between those expected by func and the others.
key_word_args = {'url':'http://fast.ai','dest':'./','new_var':[1,2,3],'testvalue':42}
split_kwargs_by_func(key_word_args,download_url)
({'url': 'http://fast.ai', 'dest': './'},
{'new_var': [1, 2, 3], 'testvalue': 42})
show_doc(to_int)
to_int(3.1415)
3
data = [1.2,3.4,7.25]
to_int(data)
[1, 3, 7]
show_doc(uniqueify)
uniqueify( pd.Series(data=['a','a','b','b','f','g']) )
['a', 'b', 'f', 'g']
show_doc(PrePostInitMeta)
PrePostInitMeta(name,bases,dct) ::type
A metaclass that calls optional __pre_init__ and __post_init__ methods.
class _T(metaclass=PrePostInitMeta):
def __pre_init__(self): self.a = 0; assert self.a==0
def __init__(self): self.a += 1; assert self.a==1
def __post_init__(self): self.a += 1; assert self.a==2
t = _T()
t.a
2
show_doc(download_url)
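There is no example for download_url on this page; here is a quick sketch of typical usage (the URL and destination below are illustrative, not from the original docs):
# download the file at `url` to `dest`; with overwrite=False an existing file is kept
download_url('https://s3.amazonaws.com/fast-ai-sample/mnist_sample.tgz', './mnist_sample.tgz', overwrite=False)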
show_doc(find_classes)
show_doc(join_path)
show_doc(join_paths)
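join_path and join_paths have no examples here; a minimal sketch, assuming join_path simply returns Path(path)/fname and join_paths maps it over a collection of file names:
join_path('myfile.txt', 'data')
# expected: PosixPath('data/myfile.txt')
join_paths(['a.txt', 'b.txt'], 'data')
# expected: [PosixPath('data/a.txt'), PosixPath('data/b.txt')]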
show_doc(loadtxt_str)
loadtxt_str(path:PathOrStr) →ndarray
Return ndarray of str of lines of text from path.
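A small usage sketch (the file name and contents are made up for illustration):
with open('tmp_labels.txt', 'w') as f:
    f.write('cat\ndog\nbird')
loadtxt_str('tmp_labels.txt')
# expected: array(['cat', 'dog', 'bird'], dtype='<U4')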
show_doc(save_texts)
save_texts(fname:PathOrStr,texts:StrList)
Save in fname the content of texts.
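A usage sketch, assuming each element of texts is written on its own line (the file name is illustrative):
save_texts('tmp_out.txt', ['first line', 'second line'])
open('tmp_out.txt').read()
# expected: 'first line\nsecond line\n'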
show_doc(num_cpus)
show_doc(parallel)
parallel(func,arr:Collection[T_co],max_workers:int=None,leave=False)
Call func on every element of arr in parallel using max_workers.
func must accept both the value and index of each arr element.
def my_func(value, index):
print("Index: {}, Value: {}".format(index, value))
my_array = [i*2 for i in range(5)]
parallel(my_func, my_array, max_workers=3)
Index: 0, Value: 0
Index: 1, Value: 2
Index: 2, Value: 4
Index: 4, Value: 8
Index: 3, Value: 6
show_doc(partition)
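partition has no example on this page; a short sketch, assuming it slices a into consecutive chunks of size sz, the last chunk possibly being shorter:
partition([1, 2, 3, 4, 5, 6, 7], 3)
# expected: [[1, 2, 3], [4, 5, 6], [7]]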
show_doc(partition_by_cores)
partition_by_cores(a:Collection[T_co],n_cpus:int) →List[Collection[T_co]]
Split data in a equally among n_cpus cores.
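A sketch of the expected behavior, assuming the chunk size is derived from len(a) and n_cpus:
partition_by_cores(list(range(10)), 3)
# e.g. [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]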
show_doc(ItemBase, title_level=3)
All items used in fastai should subclass this. Must have a data field that will be used when collating in mini-batches.
show_doc(ItemBase.apply_tfms)
apply_tfms(tfms:Collection[T_co], **kwargs)
Subclass this method if you want to apply data augmentation with tfms to this ItemBase.
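For illustration, here is a hypothetical subclass that applies each transform to its data; the class IntItem and the assumption that every tfm is a plain callable are ours, not part of the library:
class IntItem(ItemBase):
    def apply_tfms(self, tfms, **kwargs):
        # assume each tfm is a plain callable taking and returning the data
        for tfm in listify(tfms): self.data = tfm(self.data)
        return self

IntItem(3).apply_tfms([lambda x: x*2]).data
# expected: 6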
show_doc(ItemBase.show)
show(ax:Axes, **kwargs)
Subclass this method if you want to customize the way this ItemBase is shown on ax.
The default behavior is to set the string representation of this object as title of ax.
show_doc(Category, title_level=3)
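Category has no example here; a minimal sketch (the values are illustrative): data holds the class index used by the model, obj the human-readable label.
c = Category(0, 'cat')
c.data, c.obj
# expected: (0, 'cat')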
show_doc(EmptyLabel, title_level=3)
EmptyLabel() ::ItemBase
Should be used for a dummy label.
show_doc(MultiCategory, title_level=3)
Create a MultiCategory with an obj that is a collection of labels. data corresponds to the one-hot encoded labels and raw is a list of the associated strings.
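A rough sketch, under the assumption that the constructor simply stores its three arguments (the labels and class count here are made up):
mc = MultiCategory(one_hot([0, 2], 4), ['cat', 'bird'], [0, 2])
mc.data
# expected: array([1., 0., 1., 0.], dtype=float32)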
show_doc(FloatItem)
show_doc(camel2snake)
camel2snake('DeviceDataLoader')
'device_data_loader'
show_doc(even_mults)
On a linear scale, each element is equidistant from its neighbors:
# from 1 to 10 in 5 steps
t = np.linspace(1, 10, 5)
t
array([ 1. , 3.25, 5.5 , 7.75, 10. ])
for i in range(len(t) - 1):
print(t[i+1] - t[i])
2.25
2.25
2.25
2.25
On a logarithmic scale, each element is a fixed multiple of the previous entry:
t = even_mults(1, 10, 5)
t
array([ 1. , 1.778279, 3.162278, 5.623413, 10. ])
# notice how each number is a multiple of its predecessor
for i in range(len(t) - 1):
print(t[i+1] / t[i])
1.7782794100389228
1.7782794100389228
1.7782794100389228
1.7782794100389228
show_doc(func_args)
func_args(func)
Return the arguments of func.
func_args(download_url)
('url',
'dest',
'overwrite',
'pbar',
'show_progress',
'chunk_size',
'timeout',
'retries')
Additionally, func_args can be used with functions that do not belong to the fastai library:
func_args(np.linspace)
('start', 'stop', 'num', 'endpoint', 'retstep', 'dtype')
show_doc(noop)
Return x.
# object is returned as-is
noop([1,2,3])
[1, 2, 3]
show_doc(one_hot)
One-hot encoding is a standard machine learning technique. Assume we are dealing with a 10-class classification problem and we are supplied a list of labels:
y = [1, 4, 4, 5, 7, 9, 2, 4, 0]
jekyll_note("""y is zero-indexed, therefore its first element (1) belongs to class 2, its second element (4) to class 5 and so on.""")
len(y)
9
y can equivalently be expressed as a matrix of 9 rows and 10 columns, where each row represents one element of the original y.
for label in y:
print(one_hot(label, 10))
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
show_doc(show_some)
show_some(items:Collection[T_co],n_max:int=5,sep:str=',')
Return the representation of the first n_max elements in items.
# select 3 elements from a list
some_data = show_some([10, 20, 30, 40, 50], 3)
some_data
'10,20,30...'
type(some_data)
str
# the separator can be changed
some_data = show_some([10, 20, 30, 40, 50], 3, sep = '---')
some_data
'10---20---30...'
some_data[:-3]
'10---20---30'
show_some can take as input any class with __len__ and __getitem__:
class Any(object):
def __init__(self, data):
self.data = data
def __len__(self):
return len(self.data)
def __getitem__(self,i):
return self.data[i]
some_other_data = Any('nice')
show_some(some_other_data, 2)
'n,i...'
show_doc(subplots)
show_doc(text2html_table)
text2html_table(items:Tokens) →str
Put the texts in items in an HTML table; widths are the column widths, in %.
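A minimal sketch, assuming items is a collection of rows of strings and that the first row becomes the table header (an assumption on our part):
html = text2html_table([['text', 'label'], ['hello world', 'positive']])
# `html` is an HTML string, e.g. beginning with '<table border="1" class="dataframe">'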
show_doc(is_dict)
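No docstring is shown for is_dict; presumably it simply checks isinstance(x, dict):
print( is_dict({'a': 1, 'b': 2}) )  # expected: True
print( is_dict([1, 2, 3]) )         # expected: False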