19.7. d2l API Document
The implementations of the following members of the d2l package, and the sections where they are defined and explained, can be found in the source file.
- class d2l.mxnet.AdditiveAttention(num_hiddens, dropout, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
Additive attention.
- class d2l.mxnet.Animator(xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), nrows=1, ncols=1, figsize=(3.5, 2.5))[source]¶
Bases:
object
For plotting data in animation.
- class d2l.mxnet.AttentionDecoder(**kwargs)[source]¶
Bases:
d2l.mxnet.Decoder
The base attention-based decoder interface.
- property attention_weights¶
- class d2l.mxnet.BERTEncoder(vocab_size, num_hiddens, ffn_num_hiddens, num_heads, num_layers, dropout, max_len=1000, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.BERTModel(vocab_size, num_hiddens, ffn_num_hiddens, num_heads, num_layers, dropout, max_len=1000)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.CTRDataset(data_path, feat_mapper=None, defaults=None, min_threshold=4, num_feat=34)[source]¶
Bases:
mxnet.gluon.data.dataset.Dataset
- class d2l.mxnet.Decoder(**kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The base decoder interface for the encoder-decoder architecture.
- class d2l.mxnet.DotProductAttention(dropout, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
Scaled dot product attention.
- class d2l.mxnet.Encoder(**kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The base encoder interface for the encoder-decoder architecture.
- class d2l.mxnet.EncoderBlock(num_hiddens, ffn_num_hiddens, num_heads, dropout, use_bias=False, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.EncoderDecoder(encoder, decoder, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The base class for the encoder-decoder architecture.
- class d2l.mxnet.HingeLossbRec(weight=None, batch_axis=0, **kwargs)[source]¶
Bases:
mxnet.gluon.loss.Loss
- class d2l.mxnet.MaskedSoftmaxCELoss(axis=-1, sparse_label=True, from_logits=False, weight=None, batch_axis=0, **kwargs)[source]¶
Bases:
mxnet.gluon.loss.SoftmaxCrossEntropyLoss
The softmax cross-entropy loss with masks.
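A minimal usage sketch (shapes follow the sequence-to-sequence section of the book): valid_len marks how many steps of each sequence are real, and padded steps contribute zero loss.
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

loss = d2l.MaskedSoftmaxCELoss()
pred = np.ones((3, 4, 10))        # (batch_size, num_steps, vocab_size)
label = np.ones((3, 4))           # (batch_size, num_steps)
valid_len = np.array([4, 2, 0])   # number of non-padding steps per example
print(loss(pred, label, valid_len))  # one loss value per example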
- class d2l.mxnet.MultiHeadAttention(num_hiddens, num_heads, dropout, use_bias=False, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.PositionWiseFFN(ffn_num_hiddens, ffn_num_outputs, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.PositionalEncoding(num_hiddens, dropout, max_len=1000)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.RNNModel(rnn_layer, vocab_size, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The RNN model.
- class d2l.mxnet.RNNModelScratch(vocab_size, num_hiddens, device, get_params, init_state, forward_fn)[source]¶
Bases:
object
An RNN Model implemented from scratch.
- class d2l.mxnet.RandomGenerator(sampling_weights)[source]¶
Bases:
object
Draw a random int in [0, n] according to n sampling weights.
- class d2l.mxnet.Residual(num_channels, use_1x1conv=False, strides=1, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The Residual block of ResNet.
- class d2l.mxnet.SNLIDataset(dataset, num_steps, vocab=None)[source]¶
Bases:
mxnet.gluon.data.dataset.Dataset
A customized dataset to load the SNLI dataset.
- class d2l.mxnet.Seq2SeqEncoder(vocab_size, embed_size, num_hiddens, num_layers, dropout=0, **kwargs)[source]¶
Bases:
d2l.mxnet.Encoder
The RNN encoder for sequence to sequence learning.
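A minimal shape check (mirroring the sequence-to-sequence chapter of the book; the sizes are illustrative): the encoder consumes a (batch_size, num_steps) minibatch of token indices and returns per-step hidden states of shape (num_steps, batch_size, num_hiddens) together with the final recurrent state.
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

encoder = d2l.Seq2SeqEncoder(vocab_size=10, embed_size=8,
                             num_hiddens=16, num_layers=2)
encoder.initialize()
X = np.zeros((4, 7))     # (batch_size, num_steps) of token indices
output, state = encoder(X)
print(output.shape)      # (7, 4, 16): (num_steps, batch_size, num_hiddens)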
- class d2l.mxnet.SeqDataLoader(batch_size, num_steps, use_random_iter, max_tokens)[source]¶
Bases:
object
An iterator to load sequence data.
- class d2l.mxnet.TransformerEncoder(vocab_size, num_hiddens, ffn_num_hiddens, num_heads, num_layers, dropout, use_bias=False, **kwargs)[source]¶
Bases:
d2l.mxnet.Encoder
- class d2l.mxnet.VOCSegDataset(is_train, crop_size, voc_dir)[source]¶
Bases:
mxnet.gluon.data.dataset.Dataset
A customized dataset to load VOC dataset.
- filter(imgs)[source]¶
Returns a new dataset with samples filtered by the filter function fn.
Note that if the Dataset is the result of a lazily transformed one with transform(lazy=False), the filter is eagerly applied to the transformed samples without materializing the transformed result. That is, the transformation will be applied again whenever a sample is retrieved after filter().
- fn : callable
A filter function that takes a sample as input and returns a boolean. Samples that return False are discarded.
- Returns : Dataset
The filtered dataset.
- class d2l.mxnet.Vocab(tokens=None, min_freq=0, reserved_tokens=None)[source]¶
Bases:
object
Vocabulary for text.
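A short sketch of the typical round trip (assuming the token-to-index API from the book's text preprocessing section):
from d2l import mxnet as d2l

tokens = [['the', 'time', 'machine'], ['the', 'time', 'traveller']]
vocab = d2l.Vocab(tokens)
print(vocab['the'])             # token -> index
print(vocab.to_tokens([0, 1]))  # indices -> tokens; index 0 is '<unk>'
print(len(vocab))               # vocabulary size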
- d2l.mxnet.abs(x, out=None, **kwargs)¶
Calculate the absolute value element-wise.
- x : ndarray or scalar
Input array.
- out : ndarray or None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned.
- absolute : ndarray
An ndarray containing the absolute value of each element in x. This is a scalar if x is a scalar.
>>> x = np.array([-1.2, 1.2]) >>> np.abs(x) array([1.2, 1.2])
- d2l.mxnet.arange(start, stop=None, step=1, dtype=None, ctx=None)¶
Return evenly spaced values within a given interval.
Values are generated within the half-open interval [start, stop) (in other words, the interval including start but excluding stop). For integer arguments the function is equivalent to the Python built-in range function, but returns an ndarray rather than a list.
- start : number, optional
Start of interval. The interval includes this value. The default start value is 0.
- stop : number
End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out.
- step : number, optional
Spacing between values. For any output out, this is the distance between two adjacent values, out[i+1] - out[i]. The default step size is 1. If step is specified as a positional argument, start must also be given.
- dtype : dtype
The type of the output array. The default is float32.
- arange : ndarray
Array of evenly spaced values.
For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating point overflow, this rule may result in the last element of out being greater than stop.
>>> np.arange(3) array([0., 1., 2.])
>>> np.arange(3.0) array([0., 1., 2.])
>>> np.arange(3,7) array([3., 4., 5., 6.])
>>> np.arange(3,7,2) array([3., 5.])
- d2l.mxnet.argmax(x, *args, **kwargs)¶
- d2l.mxnet.astype(x, *args, **kwargs)¶
- d2l.mxnet.box_center_to_corner(boxes)[source]¶
Convert from (center, width, height) to (upper_left, bottom_right)
- d2l.mxnet.box_corner_to_center(boxes)[source]¶
Convert from (upper_left, bottom_right) to (center, width, height)
- d2l.mxnet.box_iou(boxes1, boxes2)[source]¶
Compute IOU between two sets of boxes of shape (N,4) and (M,4).
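A small sketch tying the three box utilities above together (coordinates are illustrative; boxes are given as (upper-left x, upper-left y, lower-right x, lower-right y)):
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

boxes = np.array([[0.0, 0.0, 2.0, 2.0], [1.0, 1.0, 3.0, 3.0]])
centers = d2l.box_corner_to_center(boxes)    # -> (center x, center y, width, height)
corners = d2l.box_center_to_corner(centers)  # round-trips back to `boxes`
print(d2l.box_iou(boxes, boxes))             # (2, 2) matrix of pairwise IoU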
- d2l.mxnet.build_array_nmt(lines, vocab, num_steps)[source]¶
Transform text sequences of machine translation into minibatches.
- d2l.mxnet.concat(seq, axis=0, out=None)¶
Join a sequence of arrays along an existing axis.
- a1, a2, … : sequence of array_like
The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
- axis : int, optional
The axis along which the arrays will be joined. If axis is None, arrays are flattened before use. Default is 0.
- out : ndarray, optional
If provided, the destination to place the result. The shape must be correct, matching that of what concatenate would have returned if no out argument were specified.
- res : ndarray
The concatenated array.
See also:
- split : Split array into a list of multiple sub-arrays of equal size.
- hsplit : Split array into multiple sub-arrays horizontally (column wise).
- vsplit : Split array into multiple sub-arrays vertically (row wise).
- dsplit : Split array into multiple sub-arrays along the 3rd axis (depth).
- stack : Stack a sequence of arrays along a new axis.
- hstack : Stack arrays in sequence horizontally (column wise).
- vstack : Stack arrays in sequence vertically (row wise).
- dstack : Stack arrays in sequence depth wise (along third dimension).
>>> a = np.array([[1, 2], [3, 4]]) >>> b = np.array([[5, 6]]) >>> np.concatenate((a, b), axis=0) array([[1., 2.], [3., 4.], [5., 6.]])
>>> np.concatenate((a, b.T), axis=1) array([[1., 2., 5.], [3., 4., 6.]])
>>> np.concatenate((a, b), axis=None) array([1., 2., 3., 4., 5., 6.])
- d2l.mxnet.cos(x, out=None, **kwargs)¶
Cosine, element-wise.
- x : ndarray or scalar
Angle, in radians (\(2 \pi\) rad equals 360 degrees).
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. The dtype of the output is the same as that of the input if the input is an ndarray.
- y : ndarray or scalar
The corresponding cosine values. This is a scalar if x is a scalar.
This function only supports input type of float.
>>> np.cos(np.array([0, np.pi/2, np.pi])) array([ 1.000000e+00, -4.371139e-08, -1.000000e+00]) >>> # Example of providing the optional output parameter >>> out1 = np.array([0], dtype='f') >>> out2 = np.cos(np.array([0.1]), out1) >>> out2 is out1 True
- d2l.mxnet.cosh(x, out=None, **kwargs)¶
Hyperbolic cosine, element-wise. Equivalent to 1/2 * (np.exp(x) + np.exp(-x)) and np.cos(1j*x).
- x : ndarray or scalar
Input array or scalar.
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. The dtype of the output is the same as that of the input if the input is an ndarray.
- y : ndarray or scalar
The corresponding hyperbolic cosine values. This is a scalar if x is a scalar.
This function only supports input type of float.
>>> np.cosh(0) 1.0
- class d2l.mxnet.defaultdict¶
Bases:
dict
defaultdict(default_factory[, …]) –> dict with default factory
The default factory is called without arguments to produce a new value when a key is not present, in __getitem__ only. A defaultdict compares equal to a dict with the same items. All remaining arguments are treated the same as if they were passed to the dict constructor, including keyword arguments.
- copy() → a shallow copy of D.¶
- default_factory¶
Factory for default value called by __missing__().
- d2l.mxnet.download(name, cache_dir='../data')[source]¶
Download a file inserted into DATA_HUB, return the local filename.
- d2l.mxnet.evaluate_accuracy_gpu(net, data_iter, device=None)[source]¶
Compute the accuracy for a model on a dataset using a GPU.
- d2l.mxnet.evaluate_loss(net, data_iter, loss)[source]¶
Evaluate the loss of a model on the given dataset.
- d2l.mxnet.evaluate_ranking(net, test_input, seq, candidates, num_users, num_items, devices)[source]¶
- d2l.mxnet.exp(x, out=None, **kwargs)¶
Calculate the exponential of all elements in the input array.
- x : ndarray or scalar
Input values.
- out : ndarray or None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned.
- out : ndarray or scalar
Output array, element-wise exponential of x. This is a scalar if x is a scalar.
>>> np.exp(1) 2.718281828459045 >>> x = np.array([-1, 1, -2, 2]) >>> np.exp(x) array([0.36787945, 2.7182817 , 0.13533528, 7.389056 ])
- d2l.mxnet.eye(N, M=None, k=0, dtype=<class 'numpy.float32'>, **kwargs)¶
Return a 2-D array with ones on the diagonal and zeros elsewhere.
- N : int
Number of rows in the output.
- M : int, optional
Number of columns in the output. If None, defaults to N.
- k : int, optional
Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.
- dtype : data-type, optional
Data-type of the returned array.
- I : ndarray of shape (N, M)
An array where all elements are equal to zero, except for the k-th diagonal, whose values are equal to one.
>>> np.eye(2, dtype=int) array([[1, 0], [0, 1]], dtype=int64) >>> np.eye(3, k=1) array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
- class d2l.mxnet.float32¶
Bases:
numpy.floating
Single-precision floating-point number type, compatible with C float. Character code: 'f'. Canonical name: np.single. Alias on this platform: np.float32: 32-bit-precision floating-point number type: sign bit, 8 bits exponent, 23 bits mantissa.
- as_integer_ratio()¶
Return a pair of integers, whose ratio is exactly equal to the original floating point number, and with a positive denominator. Raise OverflowError on infinities and a ValueError on NaNs.
>>> np.single(10.0).as_integer_ratio() (10, 1) >>> np.single(0.0).as_integer_ratio() (0, 1) >>> np.single(-.25).as_integer_ratio() (-1, 4)
- d2l.mxnet.get_fashion_mnist_labels(labels)[source]¶
Return text labels for the Fashion-MNIST dataset.
- class d2l.mxnet.int32¶
Bases:
numpy.signedinteger
Signed integer type, compatible with C int. Character code: 'i'. Canonical name: np.intc. Alias on this platform: np.int32: 32-bit signed integer (-2147483648 to 2147483647).
- d2l.mxnet.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0, ctx=None)¶
Return evenly spaced numbers over a specified interval.
Returns num evenly spaced samples, calculated over the interval [start, stop]. The endpoint of the interval can optionally be excluded.
- start : real number
The starting value of the sequence.
- stop : real number
The end value of the sequence, unless endpoint is set to False. In that case, the sequence consists of all but the last of num + 1 evenly spaced samples, so that stop is excluded. Note that the step size changes when endpoint is False.
- num : int, optional
Number of samples to generate. Default is 50. Must be non-negative.
- endpoint : bool, optional
If True, stop is the last sample. Otherwise, it is not included. Default is True.
- retstep : bool, optional
If True, return (samples, step), where step is the spacing between samples.
- dtype : dtype, optional
The type of the output array. If dtype is not given, infer the data type from the other input arguments.
- axis : int, optional
The axis in the result to store the samples. Relevant only if start or stop are array-like. By default (0), the samples will be along a new axis inserted at the beginning. Use -1 to get an axis at the end.
- samples : ndarray
There are num equally spaced samples in the closed interval [start, stop] or the half-open interval [start, stop) (depending on whether endpoint is True or False).
- step : float, optional
Only returned if retstep is True. Size of spacing between samples.
- arange : Similar to linspace, but uses a step size (instead of the number of samples).
>>> np.linspace(2.0, 3.0, num=5) array([2. , 2.25, 2.5 , 2.75, 3. ]) >>> np.linspace(2.0, 3.0, num=5, endpoint=False) array([2. , 2.2, 2.4, 2.6, 2.8]) >>> np.linspace(2.0, 3.0, num=5, retstep=True) (array([2. , 2.25, 2.5 , 2.75, 3. ]), 0.25)
Graphical illustration:
>>> import matplotlib.pyplot as plt >>> N = 8 >>> y = np.zeros(N) >>> x1 = np.linspace(0, 10, N, endpoint=True) >>> x2 = np.linspace(0, 10, N, endpoint=False) >>> plt.plot(x1.asnumpy(), y.asnumpy(), 'o') [<matplotlib.lines.Line2D object at 0x...>] >>> plt.plot(x2.asnumpy(), (y + 0.5).asnumpy(), 'o') [<matplotlib.lines.Line2D object at 0x...>] >>> plt.ylim([-0.5, 1]) (-0.5, 1) >>> plt.show()
This function differs from the original numpy.linspace in the following aspects:
- start and stop do not support list, numpy ndarray and mxnet ndarray
- axis could only be 0
- There could be an additional ctx argument to specify the device, e.g. the i-th GPU.
- d2l.mxnet.load_array(data_arrays, batch_size, is_train=True)[source]¶
Construct a Gluon data iterator.
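A minimal sketch (synthetic data; shapes are illustrative) of wrapping in-memory arrays as a shuffled minibatch iterator:
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

features = np.random.normal(size=(100, 2))
labels = np.random.normal(size=(100, 1))
data_iter = d2l.load_array((features, labels), batch_size=10)
X, y = next(iter(data_iter))   # one shuffled minibatch
print(X.shape, y.shape)        # (10, 2) (10, 1)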
- d2l.mxnet.load_corpus_time_machine(max_tokens=-1)[source]¶
Return token indices and the vocabulary of the time machine dataset.
- d2l.mxnet.load_data_fashion_mnist(batch_size, resize=None)[source]¶
Download the Fashion-MNIST dataset and then load it into memory.
- d2l.mxnet.load_data_nmt(batch_size, num_steps, num_examples=600)[source]¶
Return the iterator and the vocabularies of the translation dataset.
- d2l.mxnet.load_data_snli(batch_size, num_steps=50)[source]¶
Download the SNLI dataset and return data iterators and vocabulary.
- d2l.mxnet.load_data_time_machine(batch_size, num_steps, use_random_iter=False, max_tokens=10000)[source]¶
Return the iterator and the vocabulary of the time machine dataset.
- d2l.mxnet.load_data_voc(batch_size, crop_size)[source]¶
Download and load the VOC2012 semantic dataset.
- d2l.mxnet.log(x, out=None, **kwargs)¶
Natural logarithm, element-wise. The natural logarithm log is the inverse of the exponential function, so that log(exp(x)) = x. The natural logarithm is logarithm in base e.
- x : ndarray
Input value. Elements must be of real value.
- out : ndarray or None, optional
A location into which the result is stored. If provided, it must have the same shape and dtype as the input ndarray. If not provided or None, a freshly-allocated array is returned.
- y : ndarray
The natural logarithm of x, element-wise. This is a scalar if x is a scalar.
Currently only supports data of real values and inf as input. Returns data of real value, inf, -inf and nan according to the input. This function differs from the original numpy.log in the following aspects:
- Does not support complex numbers for now
- Input type does not support Python native iterables (list, tuple, …)
- out param: cannot perform auto broadcasting. The out ndarray's shape must be the same as the expected output
- out param: cannot perform auto type cast. The out ndarray's dtype must be the same as the expected output
- out param does not support the scalar input case
>>> a = np.array([1, np.exp(1), np.exp(2), 0], dtype=np.float64) >>> np.log(a) array([ 0., 1., 2., -inf], dtype=float64) >>> # Using the default float32 dtype leads to slightly different behavior >>> a = np.array([1, np.exp(1), np.exp(2), 0]) >>> np.log(a) array([ 0., 0.99999994, 2., -inf]) >>> np.log(1) 0.0
- d2l.mxnet.masked_softmax(X, valid_lens)[source]¶
Perform softmax operation by masking elements on the last axis.
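A short sketch (as in the attention chapter of the book): entries beyond each row's valid length receive (near-)zero probability.
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

X = np.random.uniform(size=(2, 2, 4))   # (batch_size, num_queries, num_keys)
# Keep the first 2 keys of the first example and 3 of the second.
print(d2l.masked_softmax(X, np.array([2, 3])))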
- d2l.mxnet.match_anchor_to_bbox(ground_truth, anchors, device, iou_threshold=0.5)[source]¶
Assign ground-truth bounding boxes to anchor boxes similar to them.
- d2l.mxnet.matmul(a, b, out=None)¶
Dot product of two arrays. Specifically,
- If both a and b are 1-D arrays, it is an inner product of vectors.
- If both a and b are 2-D arrays, it is matrix multiplication.
- If either a or b is 0-D (scalar), it is equivalent to multiply() and using np.multiply(a, b) or a * b is preferred.
- If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
- If a is an N-D array and b is a 2-D array, it is a sum product over the last axis of a and the second-to-last axis of b: dot(a, b)[i,j,k] = sum(a[i,j,:] * b[:,k])
- a : ndarray
First argument.
- b : ndarray
Second argument.
- out : ndarray, optional
Output argument. It must have the same shape and type as the expected output.
- output : ndarray
Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned. If out is given, then it is returned.
>>> a = np.array(3) >>> b = np.array(4) >>> np.dot(a, b) array(12.)
For 2-D arrays it is the matrix product:
>>> a = np.array([[1, 0], [0, 1]]) >>> b = np.array([[4, 1], [2, 2]]) >>> np.dot(a, b) array([[4., 1.], [2., 2.]])
>>> a = np.arange(3*4*5*6).reshape((3,4,5,6)) >>> b = np.arange(5*6)[::-1].reshape((6,5)) >>> np.dot(a, b)[2,3,2,2] array(29884.) >>> np.sum(a[2,3,2,:] * b[:,2]) array(29884.)
- d2l.mxnet.meshgrid(*xi, **kwargs)[source]¶
Return coordinate matrices from coordinate vectors.
Make N-D coordinate arrays for vectorized evaluations of N-D scalar/vector fields over N-D grids, given one-dimensional coordinate arrays x1, x2,…, xn.
- x1, x2, …, xn : ndarrays
1-D arrays representing the coordinates of a grid.
- indexing : {'xy', 'ij'}, optional
Cartesian ('xy', default) or matrix ('ij') indexing of output. See Notes for more details.
- sparse : bool, optional
If True a sparse grid is returned in order to conserve memory. Default is False. Please note that sparse=True is currently not supported.
- copy : bool, optional
If False, a view into the original arrays are returned in order to conserve memory. Default is True. Please note that copy=False is currently not supported.
- X1, X2, …, XN : ndarray
For vectors x1, x2, …, xn with lengths Ni=len(xi), return (N1, N2, N3,...Nn) shaped arrays if indexing='ij' or (N2, N1, N3,...Nn) shaped arrays if indexing='xy' with the elements of xi repeated to fill the matrix along the first dimension for x1, the second for x2 and so on.
This function supports both indexing conventions through the indexing keyword argument. Giving the string ‘ij’ returns a meshgrid with matrix indexing, while ‘xy’ returns a meshgrid with Cartesian indexing. In the 2-D case with inputs of length M and N, the outputs are of shape (N, M) for ‘xy’ indexing and (M, N) for ‘ij’ indexing. In the 3-D case with inputs of length M, N and P, outputs are of shape (N, M, P) for ‘xy’ indexing and (M, N, P) for ‘ij’ indexing. The difference is illustrated by the following code snippet:
xv, yv = np.meshgrid(x, y, sparse=False, indexing='ij')
for i in range(nx):
    for j in range(ny):
        # treat xv[i,j], yv[i,j]
xv, yv = np.meshgrid(x, y, sparse=False, indexing='xy')
for i in range(nx):
    for j in range(ny):
        # treat xv[j,i], yv[j,i]
In the 1-D and 0-D case, the indexing and sparse keywords have no effect.
- d2l.mxnet.multibox_detection(cls_probs, offset_preds, anchors, nms_threshold=0.5, pos_threshold=0.00999999978)[source]¶
- d2l.mxnet.normal(loc=0.0, scale=1.0, size=None, dtype=None, ctx=None, out=None)[source]¶
Draw random samples from a normal (Gaussian) distribution.
Samples are distributed according to a normal distribution parametrized by loc (mean) and scale (standard deviation).
- loc : float, optional
Mean (centre) of the distribution.
- scale : float, optional
Standard deviation (spread or "width") of the distribution.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a scalar tensor containing a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(low, high).size samples are drawn.
- dtype : {'float16', 'float32', 'float64'}, optional
Data type of output samples. Default is 'float32'.
- ctx : Context, optional
Device context of output, default is current context.
- out : ndarray, optional
Store output to an existing ndarray.
- out : ndarray
Drawn samples from the parameterized normal distribution.
The probability density for the Gaussian distribution is
(19.7.1)¶
\[p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },\]
where \(\mu\) is the mean and \(\sigma\) the standard deviation. The square of the standard deviation, \(\sigma^2\), is called the variance.
The function has its peak at the mean, and its "spread" increases with the standard deviation (the function reaches 0.607 times its maximum at \(x + \sigma\) and \(x - \sigma\) [2]). This implies that numpy.random.normal is more likely to return samples lying close to the mean, rather than those far away.
- [1] Wikipedia, "Normal distribution", https://en.wikipedia.org/wiki/Normal_distribution
- [2] P. R. Peebles Jr., "Central Limit Theorem" in "Probability, Random Variables and Random Signal Principles", 4th ed., 2001, pp. 51, 51, 125.
>>> mu, sigma = 0, 0.1 # mean and standard deviation >>> s = np.random.normal(mu, sigma, 1000)
Verify the mean and the variance:
>>> np.abs(mu - np.mean(s)) < 0.01 array(True)
- d2l.mxnet.numpy(x, *args, **kwargs)¶
- d2l.mxnet.ones(shape, dtype=<class 'numpy.float32'>, order='C', ctx=None)¶
Return a new array of given shape and type, filled with ones. This function currently only supports storing multi-dimensional data in row-major (C-style).
- shape : int or tuple of int
The shape of the empty array.
- dtype : str or numpy.dtype, optional
An optional value type. Default is numpy.float32. Note that this behavior is different from NumPy's ones function where float64 is the default value, because float32 is considered as the default data type in deep learning.
- order : {'C'}, optional, default: 'C'
How to store multi-dimensional data in memory, currently only row-major (C-style) is supported.
- ctx : Context, optional
An optional device context (default is the current default context).
- out : ndarray
Array of ones with the given shape, dtype, and ctx.
>>> np.ones(5) array([1., 1., 1., 1., 1.])
>>> np.ones((5,), dtype=int) array([1, 1, 1, 1, 1], dtype=int64)
>>> np.ones((2, 1)) array([[1.], [1.]])
>>> s = (2,2) >>> np.ones(s) array([[1., 1.], [1., 1.]])
- d2l.mxnet.plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None)[source]¶
Plot data points.
- d2l.mxnet.predict_ch8(prefix, num_preds, net, vocab, device)[source]¶
Generate new characters following the prefix.
- d2l.mxnet.predict_seq2seq(net, src_sentence, src_vocab, tgt_vocab, num_steps, device, save_attention_weights=False)[source]¶
Predict for sequence to sequence.
- d2l.mxnet.rand(*size, **kwargs)[source]¶
Random values in a given shape.
Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).
- d0, d1, …, dn : int, optional
The dimensions of the returned array, should be all positive. If no argument is given a single Python float is returned.
- out : ndarray
Random values.
>>> np.random.rand(3,2) array([[ 0.14022471, 0.96360618], #random [ 0.37601032, 0.25528411], #random [ 0.49313049, 0.94909878]]) #random
- d2l.mxnet.read_snli(data_dir, is_train)[source]¶
Read the SNLI dataset into premises, hypotheses, and labels.
- d2l.mxnet.reduce_sum(x, *args, **kwargs)¶
- d2l.mxnet.reshape(x, *args, **kwargs)¶
- d2l.mxnet.seq_data_iter_random(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using random sampling.
- d2l.mxnet.seq_data_iter_sequential(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using sequential partitioning.
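A minimal sketch on a toy corpus (token indices 0..34), following the language-model data section of the book; each minibatch pairs inputs X with targets Y shifted by one step:
from d2l import mxnet as d2l

my_seq = list(range(35))
for X, Y in d2l.seq_data_iter_sequential(my_seq, batch_size=2, num_steps=5):
    print('X:', X, '\nY:', Y)   # Y is X shifted one token to the right
    break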
- d2l.mxnet.set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)[source]¶
Set the axes for matplotlib.
- d2l.mxnet.show_heatmaps(matrices, xlabel, ylabel, titles=None, figsize=(2.5, 2.5), cmap='Reds')[source]¶
- d2l.mxnet.show_images(imgs, num_rows, num_cols, titles=None, scale=1.5)[source]¶
Plot a list of images.
- d2l.mxnet.sin(x, out=None, **kwargs)¶
Trigonometric sine, element-wise.
- x : ndarray or scalar
Angle, in radians (\(2 \pi\) rad equals 360 degrees).
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. The dtype of the output is the same as that of the input if the input is an ndarray.
- y : ndarray or scalar
The sine of each element of x. This is a scalar if x is a scalar.
This function only supports input type of float.
>>> np.sin(np.pi/2.) 1.0 >>> np.sin(np.array((0., 30., 45., 60., 90.)) * np.pi / 180.) array([0. , 0.5 , 0.70710677, 0.86602545, 1. ])
- d2l.mxnet.sinh(x, out=None, **kwargs)¶
Hyperbolic sine, element-wise. Equivalent to 1/2 * (np.exp(x) - np.exp(-x)) or -1j * np.sin(1j*x).
- x : ndarray or scalar
Input array or scalar.
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. The dtype of the output is the same as that of the input if the input is an ndarray.
- y : ndarray or scalar
The corresponding hyperbolic sine values. This is a scalar if x is a scalar.
This function only supports input type of float.
>>> np.sinh(0) 0.0 >>> # Example of providing the optional output parameter >>> out1 = np.array([0], dtype='f') >>> out2 = np.sinh(np.array([0.1]), out1) >>> out2 is out1 True
- d2l.mxnet.size(a)¶
- d2l.mxnet.split_and_load_ml100k(split_mode='seq-aware', feedback='explicit', test_ratio=0.1, batch_size=256)[source]¶
- d2l.mxnet.split_batch_multi_inputs(X, y, devices)[source]¶
Split multi-input X and y into multiple devices.
- d2l.mxnet.split_data_ml100k(data, num_users, num_items, split_mode='random', test_ratio=0.1)[source]¶
Split the dataset in random mode or seq-aware mode.
- d2l.mxnet.stack(arrays, axis=0, out=None)¶
- Join a sequence of arrays along a new axis.
The axis parameter specifies the index of the new axis in the dimensions of the result. For example, if axis=0 it will be the first dimension and if axis=-1 it will be the last dimension.
- arrays : sequence of array_like
Each array must have the same shape.
- axis : int, optional
The axis in the result array along which the input arrays are stacked.
- out : ndarray, optional
If provided, the destination to place the result. The shape must be correct, matching that of what stack would have returned if no out argument were specified.
- stacked : ndarray
The stacked array has one more dimension than the input arrays.
See also:
- concatenate : Join a sequence of arrays along an existing axis.
- split : Split array into a list of multiple sub-arrays of equal size.
>>> arrays = [np.random.rand(3, 4) for _ in range(10)] >>> np.stack(arrays, axis=0).shape (10, 3, 4)
>>> np.stack(arrays, axis=1).shape (3, 10, 4)
>>> np.stack(arrays, axis=2).shape (3, 4, 10)
>>> a = np.array([1, 2, 3]) >>> b = np.array([2, 3, 4]) >>> np.stack((a, b)) array([[1., 2., 3.], [2., 3., 4.]])
>>> np.stack((a, b), axis=-1) array([[1., 2.], [2., 3.], [3., 4.]])
- d2l.mxnet.tanh(x, out=None, **kwargs)¶
Compute hyperbolic tangent element-wise. Equivalent to np.sinh(x)/np.cosh(x).
- x : ndarray or scalar
Input array.
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs fill into. If not provided or None, a freshly-allocated array is returned. The dtype of the output and input must be the same.
- y : ndarray or scalar
The corresponding hyperbolic tangent values.
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples.)
- Input x does not support complex computation (like imaginary numbers): >>> np.tanh(np.pi*1j) TypeError: type <type 'complex'> not supported
>>> np.tanh(np.array([0, np.pi])) array([0. , 0.9962721]) >>> np.tanh(np.pi) 0.99627207622075 >>> # Example of providing the optional output parameter illustrating >>> # that what is returned is a reference to said parameter >>> out1 = np.array(1) >>> out2 = np.tanh(np.array(0.1), out1) >>> out2 is out1 True
- d2l.mxnet.tensor(object, dtype=None, ctx=None)¶
Create an array.
- object : array_like or numpy.ndarray or mxnet.numpy.ndarray
An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.
- dtype : data-type, optional
The desired data-type for the array. Default is float32.
- ctx : device context, optional
Device context on which the memory is allocated. Default is mxnet.context.current_context().
- out : ndarray
An array object satisfying the specified requirements.
>>> np.array([1, 2, 3]) array([1., 2., 3.])
>>> np.array([[1, 2], [3, 4]]) array([[1., 2.], [3., 4.]])
>>> np.array([[1, 0], [0, 1]], dtype=bool) array([[ True, False], [False, True]])
- d2l.mxnet.to(x, *args, **kwargs)¶
- d2l.mxnet.train_2d(trainer, steps=20)[source]¶
Optimize a 2-dim objective function with a customized trainer.
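A minimal sketch (the objective and learning rate are illustrative, in the style of the optimization chapter): the trainer maps (x1, x2, s1, s2) to their updated values, and train_2d records the trajectory.
from d2l import mxnet as d2l

def gd_2d(x1, x2, s1, s2):
    # Gradient descent on f(x1, x2) = x1**2 + 2 * x2**2 with step size 0.1;
    # the state slots s1, s2 are unused by plain gradient descent.
    eta = 0.1
    return (x1 - eta * 2 * x1, x2 - eta * 4 * x2, 0, 0)

results = d2l.train_2d(gd_2d)   # list of visited (x1, x2) points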
- d2l.mxnet.train_batch_ch13(net, features, labels, loss, trainer, devices, split_f=<function split_batch>)[source]¶
- d2l.mxnet.train_ch11(trainer_fn, states, hyperparams, data_iter, feature_dim, num_epochs=2)[source]¶
- d2l.mxnet.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices=[gpu(0), gpu(1), gpu(2), gpu(3)], split_f=<function split_batch>)[source]¶
- d2l.mxnet.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)[source]¶
Train a model (defined in Chapter 3).
- d2l.mxnet.train_ch6(net, train_iter, test_iter, num_epochs, lr, device)[source]¶
Train a model with a GPU (defined in Chapter 6).
- d2l.mxnet.train_ch8(net, train_iter, vocab, lr, num_epochs, device, use_random_iter=False)[source]¶
Train a model (defined in Chapter 8).
- d2l.mxnet.train_epoch_ch3(net, train_iter, loss, updater)[source]¶
Train a model within one epoch (defined in Chapter 3).
- d2l.mxnet.train_epoch_ch8(net, train_iter, loss, updater, device, use_random_iter)[source]¶
Train a model within one epoch (defined in Chapter 8).
- d2l.mxnet.train_ranking(net, train_iter, test_iter, loss, trainer, test_seq_iter, num_users, num_items, num_epochs, devices, evaluator, candidates, eval_step=1)[source]¶
- d2l.mxnet.train_recsys_rating(net, train_iter, test_iter, loss, trainer, num_epochs, devices=[gpu(0), gpu(1), gpu(2), gpu(3)], evaluator=None, **kwargs)[source]¶
- d2l.mxnet.train_seq2seq(net, data_iter, lr, num_epochs, tgt_vocab, device)[source]¶
Train a model for sequence to sequence.
- d2l.mxnet.transpose(a)¶
- d2l.mxnet.voc_rand_crop(feature, label, height, width)[source]¶
Randomly crop for both feature and label images.
- d2l.mxnet.zeros(shape, dtype=None, order='C', ctx=None)¶
Return a new array of given shape and type, filled with zeros. This function currently only supports storing multi-dimensional data in row-major (C-style).
- shape : int or tuple of int
The shape of the empty array.
- dtype : str or numpy.dtype, optional
An optional value type (default is numpy.float32). Note that this behavior is different from NumPy's zeros function where float64 is the default value, because float32 is considered as the default data type in deep learning.
- order : {'C'}, optional, default: 'C'
How to store multi-dimensional data in memory, currently only row-major (C-style) is supported.
- ctx : Context, optional
An optional device context (default is the current default context).
- out : ndarray
Array of zeros with the given shape, dtype, and ctx.
>>> np.zeros(5) array([0., 0., 0., 0., 0.])
>>> np.zeros((5,), dtype=int) array([0, 0, 0, 0, 0], dtype=int64)
>>> np.zeros((2, 1)) array([[0.], [0.]])
- class d2l.torch.AddNorm(normalized_shape, dropout, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X, Y)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.AdditiveAttention(key_size, query_size, num_hiddens, dropout, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(queries, keys, values, valid_lens)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.Animator(xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), nrows=1, ncols=1, figsize=(3.5, 2.5))[source]¶
Bases:
object
For plotting data in animation.
- class d2l.torch.AttentionDecoder(**kwargs)[source]¶
Bases:
d2l.torch.Decoder
The base attention-based decoder interface.
- property attention_weights¶
- training: bool¶
- class d2l.torch.BERTEncoder(vocab_size, num_hiddens, norm_shape, ffn_num_input, ffn_num_hiddens, num_heads, num_layers, dropout, max_len=1000, key_size=768, query_size=768, value_size=768, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(tokens, segments, valid_lens)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.BERTModel(vocab_size, num_hiddens, norm_shape, ffn_num_input, ffn_num_hiddens, num_heads, num_layers, dropout, max_len=1000, key_size=768, query_size=768, value_size=768, hid_in_features=768, mlm_in_features=768, nsp_in_features=768)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(tokens, segments, valid_lens=None, pred_positions=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.Decoder(**kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
The base decoder interface for the encoder-decoder architecture.
- forward(X, state)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.DotProductAttention(dropout, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
Scaled dot product attention.
- forward(queries, keys, values, valid_lens=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.Encoder(**kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
The base encoder interface for the encoder-decoder architecture.
- forward(X, *args)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.EncoderBlock(key_size, query_size, value_size, num_hiddens, norm_shape, ffn_num_input, ffn_num_hiddens, num_heads, dropout, use_bias=False, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X, valid_lens)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.EncoderDecoder(encoder, decoder, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
The base class for the encoder-decoder architecture.
- forward(enc_X, dec_X, *args)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.MaskLM(vocab_size, num_hiddens, num_inputs=768, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X, pred_positions)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.MaskedSoftmaxCELoss(weight: Optional[torch.Tensor] = None, size_average=None, ignore_index: int = -100, reduce=None, reduction: str = 'mean')[source]¶
Bases:
torch.nn.modules.loss.CrossEntropyLoss
The softmax cross-entropy loss with masks.
- forward(pred, label, valid_len)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- ignore_index: int¶
- class d2l.torch.MultiHeadAttention(key_size, query_size, value_size, num_hiddens, num_heads, dropout, bias=False, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(queries, keys, values, valid_lens)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.NextSentencePred(num_inputs, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.PositionWiseFFN(ffn_num_input, ffn_num_hiddens, ffn_num_outputs, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.PositionalEncoding(num_hiddens, dropout, max_len=1000)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.RNNModel(rnn_layer, vocab_size, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
The RNN model.
- forward(inputs, state)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.RNNModelScratch(vocab_size, num_hiddens, device, get_params, init_state, forward_fn)[source]¶
Bases:
object
An RNN Model implemented from scratch.
- class d2l.torch.RandomGenerator(sampling_weights)[source]¶
Bases:
object
Draw a random int in [0, n] according to n sampling weights.
- class d2l.torch.Residual(input_channels, num_channels, use_1x1conv=False, strides=1)[source]¶
Bases:
torch.nn.modules.module.Module
The Residual block of ResNet.
- forward(X)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
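A minimal shape check (following the ResNet section of the book): with matching channel counts and the default stride, the block preserves the input shape.
import torch
from d2l import torch as d2l

blk = d2l.Residual(input_channels=3, num_channels=3)
X = torch.rand(4, 3, 6, 6)   # (batch, channels, height, width)
print(blk(X).shape)          # torch.Size([4, 3, 6, 6])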
- class d2l.torch.SNLIDataset(dataset, num_steps, vocab=None)[source]¶
Bases:
torch.utils.data.dataset.Dataset
A customized dataset to load the SNLI dataset.
- class d2l.torch.Seq2SeqEncoder(vocab_size, embed_size, num_hiddens, num_layers, dropout=0, **kwargs)[source]¶
Bases:
d2l.torch.Encoder
The RNN encoder for sequence to sequence learning.
- forward(X, *args)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.SeqDataLoader(batch_size, num_steps, use_random_iter, max_tokens)[source]¶
Bases:
object
An iterator to load sequence data.
- class d2l.torch.TransformerEncoder(vocab_size, key_size, query_size, value_size, num_hiddens, norm_shape, ffn_num_input, ffn_num_hiddens, num_heads, num_layers, dropout, use_bias=False, **kwargs)[source]¶
Bases:
d2l.torch.Encoder
- forward(X, valid_lens, *args)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.VOCSegDataset(is_train, crop_size, voc_dir)[source]¶
Bases:
torch.utils.data.dataset.Dataset
A customized dataset to load VOC dataset.
- class d2l.torch.Vocab(tokens=None, min_freq=0, reserved_tokens=None)[source]¶
Bases:
object
Vocabulary for text.
- d2l.torch.abs(input, *, out=None) → Tensor¶
Computes the absolute value of each element in input.
(19.7.2)¶
\[\text{out}_{i} = |\text{input}_{i}|\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> torch.abs(torch.tensor([-1, -2, 3])) tensor([ 1, 2, 3])
- d2l.torch.arange(start=0, end, step=1, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶
Returns a 1-D tensor of size \(\left\lceil \frac{\text{end} - \text{start}}{\text{step}} \right\rceil\) with values from the interval [start, end) taken with common difference step beginning from start.
Note that non-integer step is subject to floating point rounding errors when comparing against end; to avoid inconsistency, we advise adding a small epsilon to end in such cases.
(19.7.3)¶
\[\text{out}_{i+1} = \text{out}_{i} + \text{step}\]
- Args:
start (Number): the starting value for the set of points. Default: 0.
end (Number): the ending value for the set of points.
step (Number): the gap between each pair of adjacent points. Default: 1.
- Keyword args:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()). If dtype is not given, infer the data type from the other input arguments. If any of start, end, or stop are floating-point, the dtype is inferred to be the default dtype, see get_default_dtype(). Otherwise, the dtype is inferred to be torch.int64.
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.arange(5) tensor([ 0, 1, 2, 3, 4]) >>> torch.arange(1, 4) tensor([ 1, 2, 3]) >>> torch.arange(1, 2.5, 0.5) tensor([ 1.0000, 1.5000, 2.0000])
- d2l.torch.argmax(x, *args, **kwargs)¶
- d2l.torch.astype(x, *args, **kwargs)¶
- d2l.torch.box_center_to_corner(boxes)[source]¶
Convert from (center, width, height) to (upper_left, bottom_right)
- d2l.torch.box_corner_to_center(boxes)[source]¶
Convert from (upper_left, bottom_right) to (center, width, height)
- d2l.torch.box_iou(boxes1, boxes2)[source]¶
Compute IOU between two sets of boxes of shape (N,4) and (M,4).
- d2l.torch.build_array_nmt(lines, vocab, num_steps)[source]¶
Transform text sequences of machine translation into minibatches.
- d2l.torch.concat()¶
cat(tensors, dim=0, *, out=None) -> Tensor
Concatenates the given sequence of seq tensors in the given dimension. All tensors must either have the same shape (except in the concatenating dimension) or be empty.
torch.cat() can be seen as an inverse operation for torch.split() and torch.chunk().
torch.cat() can be best understood via examples.
- Args:
tensors (sequence of Tensors): any python sequence of tensors of the same type. Non-empty tensors provided must have the same shape, except in the cat dimension.
dim (int, optional): the dimension over which the tensors are concatenated
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> x = torch.randn(2, 3) >>> x tensor([[ 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497]]) >>> torch.cat((x, x, x), 0) tensor([[ 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497], [ 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497], [ 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497]]) >>> torch.cat((x, x, x), 1) tensor([[ 0.6580, -1.0969, -0.4614, 0.6580, -1.0969, -0.4614, 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497, -0.1034, -0.5790, 0.1497, -0.1034, -0.5790, 0.1497]])
- d2l.torch.cos(input, *, out=None) → Tensor¶
Returns a new tensor with the cosine of the elements of input.
(19.7.4)¶
\[\text{out}_{i} = \cos(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4) >>> a tensor([ 1.4309, 1.2706, -0.8562, 0.9796]) >>> torch.cos(a) tensor([ 0.1395, 0.2957, 0.6553, 0.5574])
- d2l.torch.cosh(input, *, out=None) → Tensor¶
Returns a new tensor with the hyperbolic cosine of the elements of input.
(19.7.5)¶
\[\text{out}_{i} = \cosh(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4) >>> a tensor([ 0.1632, 1.1835, -0.6979, -0.7325]) >>> torch.cosh(a) tensor([ 1.0133, 1.7860, 1.2536, 1.2805])
Note
When input is on the CPU, the implementation of torch.cosh may use the Sleef library, which rounds very large results to infinity or negative infinity. See here for details.
- class d2l.torch.defaultdict¶
Bases:
dict
defaultdict(default_factory[, …]) –> dict with default factory
The default factory is called without arguments to produce a new value when a key is not present, in __getitem__ only. A defaultdict compares equal to a dict with the same items. All remaining arguments are treated the same as if they were passed to the dict constructor, including keyword arguments.
- copy() → a shallow copy of D.¶
- default_factory¶
Factory for default value called by __missing__().
- d2l.torch.download(name, cache_dir='../data')[source]¶
Download a file inserted into DATA_HUB, return the local filename.
- d2l.torch.evaluate_accuracy_gpu(net, data_iter, device=None)[source]¶
Compute the accuracy for a model on a dataset using a GPU.
- d2l.torch.evaluate_loss(net, data_iter, loss)[source]¶
Evaluate the loss of a model on the given dataset.
- d2l.torch.exp(input, *, out=None) → Tensor¶
Returns a new tensor with the exponential of the elements of the input tensor input.
(19.7.6)¶
\[y_{i} = e^{x_{i}}\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> torch.exp(torch.tensor([0, math.log(2.)])) tensor([ 1., 2.])
- d2l.torch.eye(n, m=None, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶
Returns a 2-D tensor with ones on the diagonal and zeros elsewhere.
- Args:
n (int): the number of rows
m (int, optional): the number of columns with default being n
- Keyword arguments:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
- Returns:
Tensor: A 2-D tensor with ones on the diagonal and zeros elsewhere
Example:
>>> torch.eye(3) tensor([[ 1., 0., 0.], [ 0., 1., 0.], [ 0., 0., 1.]])
- d2l.torch.get_fashion_mnist_labels(labels)[source]¶
Return text labels for the Fashion-MNIST dataset.
- d2l.torch.linspace(start, end, steps, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶
Creates a one-dimensional tensor of size steps whose values are evenly spaced from start to end, inclusive. That is, the values are:
(19.7.7)¶
\[(\text{start}, \text{start} + \frac{\text{end} - \text{start}}{\text{steps} - 1}, \ldots, \text{start} + (\text{steps} - 2) * \frac{\text{end} - \text{start}}{\text{steps} - 1}, \text{end})\]
Warning
Not providing a value for steps is deprecated. For backwards compatibility, not providing a value for steps will create a tensor with 100 elements. Note that this behavior is not reflected in the documented function signature and should not be relied on. In a future PyTorch release, failing to provide a value for steps will throw a runtime error.
- Args:
start (float): the starting value for the set of points
end (float): the ending value for the set of points
steps (int): size of the constructed tensor
- Keyword arguments:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.linspace(3, 10, steps=5) tensor([ 3.0000, 4.7500, 6.5000, 8.2500, 10.0000]) >>> torch.linspace(-10, 10, steps=5) tensor([-10., -5., 0., 5., 10.]) >>> torch.linspace(start=-10, end=10, steps=5) tensor([-10., -5., 0., 5., 10.]) >>> torch.linspace(start=-10, end=10, steps=1) tensor([-10.])
- d2l.torch.load_array(data_arrays, batch_size, is_train=True)[source]¶
Construct a PyTorch data iterator.
- d2l.torch.load_corpus_time_machine(max_tokens=-1)[source]¶
Return token indices and the vocabulary of the time machine dataset.
- d2l.torch.load_data_fashion_mnist(batch_size, resize=None)[source]¶
Download the Fashion-MNIST dataset and then load it into memory.
- d2l.torch.load_data_nmt(batch_size, num_steps, num_examples=600)[source]¶
Return the iterator and the vocabularies of the translation dataset.
- d2l.torch.load_data_snli(batch_size, num_steps=50)[source]¶
Download the SNLI dataset and return data iterators and vocabulary.
- d2l.torch.load_data_time_machine(batch_size, num_steps, use_random_iter=False, max_tokens=10000)[source]¶
Return the iterator and the vocabulary of the time machine dataset.
- d2l.torch.load_data_voc(batch_size, crop_size)[source]¶
Download and load the VOC2012 semantic dataset.
- d2l.torch.log(input, *, out=None) → Tensor¶
Returns a new tensor with the natural logarithm of the elements of input.
(19.7.8)¶
\[y_{i} = \log_{e} (x_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(5) >>> a tensor([-0.7168, -0.5471, -0.8933, -1.4428, -0.1190]) >>> torch.log(a) tensor([ nan, nan, nan, nan, nan])
- d2l.torch.masked_softmax(X, valid_lens)[source]¶
Perform softmax operation by masking elements on the last axis.
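A sketch of the masking behavior (shapes follow the attention chapters; values are random):
>>> import torch
>>> from d2l import torch as d2l
>>> X = torch.rand(2, 2, 4)  # (batch size, no. of queries, no. of keys)
>>> valid_lens = torch.tensor([2, 3])
>>> d2l.masked_softmax(X, valid_lens)  # columns beyond each valid length get probability 0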
- d2l.torch.match_anchor_to_bbox(ground_truth, anchors, device, iou_threshold=0.5)[source]¶
Assign ground-truth bounding boxes to anchor boxes similar to them.
- d2l.torch.matmul(input, other, *, out=None) Tensor ¶
Matrix product of two tensors.
The behavior depends on the dimensionality of the tensors as follows:
If both tensors are 1-dimensional, the dot product (scalar) is returned.
If both arguments are 2-dimensional, the matrix-matrix product is returned.
If the first argument is 1-dimensional and the second argument is 2-dimensional, a 1 is prepended to its dimension for the purpose of the matrix multiply. After the matrix multiply, the prepended dimension is removed.
If the first argument is 2-dimensional and the second argument is 1-dimensional, the matrix-vector product is returned.
If both arguments are at least 1-dimensional and at least one argument is N-dimensional (where N > 2), then a batched matrix multiply is returned. If the first argument is 1-dimensional, a 1 is prepended to its dimension for the purpose of the batched matrix multiply and removed after. If the second argument is 1-dimensional, a 1 is appended to its dimension for the purpose of the batched matrix multiply and removed after. The non-matrix (i.e. batch) dimensions are broadcasted (and thus must be broadcastable). For example, if input is a \((j \times 1 \times n \times n)\) tensor and other is a \((k \times n \times n)\) tensor, out will be a \((j \times k \times n \times n)\) tensor.
Note that the broadcasting logic only looks at the batch dimensions when determining if the inputs are broadcastable, and not the matrix dimensions. For example, if input is a \((j \times 1 \times n \times m)\) tensor and other is a \((k \times m \times p)\) tensor, these inputs are valid for broadcasting even though the final two dimensions (i.e. the matrix dimensions) are different. out will be a \((j \times k \times n \times p)\) tensor.
This operator supports TensorFloat32.
Note
The 1-dimensional dot product version of this function does not support an out parameter.
- Arguments:
input (Tensor): the first tensor to be multiplied
other (Tensor): the second tensor to be multiplied
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> # vector x vector
>>> tensor1 = torch.randn(3)
>>> tensor2 = torch.randn(3)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([])
>>> # matrix x vector
>>> tensor1 = torch.randn(3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([3])
>>> # batched matrix x broadcasted vector
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3])
>>> # batched matrix x batched matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(10, 4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
>>> # batched matrix x broadcasted matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
- d2l.torch.meshgrid(*tensors)[source]¶
Take \(N\) tensors, each of which can be either scalar or 1-dimensional vector, and create \(N\) N-dimensional grids, where the \(i\) th grid is defined by expanding the \(i\) th input over dimensions defined by other inputs.
- Args:
- tensors (list of Tensor): list of scalars or 1 dimensional tensors. Scalars will be
treated as tensors of size \((1,)\) automatically
- Returns:
seq (sequence of Tensors): If the input has \(k\) tensors of size \((N_1,), (N_2,), \ldots , (N_k,)\), then the output would also have \(k\) tensors, where all tensors are of size \((N_1, N_2, \ldots , N_k)\).
Example:
>>> x = torch.tensor([1, 2, 3])
>>> y = torch.tensor([4, 5, 6])
>>> grid_x, grid_y = torch.meshgrid(x, y)
>>> grid_x
tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])
>>> grid_y
tensor([[4, 5, 6],
        [4, 5, 6],
        [4, 5, 6]])
- d2l.torch.multibox_detection(cls_probs, offset_preds, anchors, nms_threshold=0.5, pos_threshold=0.00999999978)[source]¶
- d2l.torch.normal(mean, std, *, generator=None, out=None) Tensor ¶
Returns a tensor of random numbers drawn from separate normal distributions whose mean and standard deviation are given.
The mean is a tensor with the mean of each output element’s normal distribution.
The std is a tensor with the standard deviation of each output element’s normal distribution.
The shapes of mean and std don’t need to match, but the total number of elements in each tensor needs to be the same.
Note
When the shapes do not match, the shape of mean is used as the shape for the returned output tensor.
- Args:
mean (Tensor): the tensor of per-element means
std (Tensor): the tensor of per-element standard deviations
- Keyword args:
generator (torch.Generator, optional): a pseudorandom number generator for sampling
out (Tensor, optional): the output tensor.
Example:
>>> torch.normal(mean=torch.arange(1., 11.), std=torch.arange(1, 0, -0.1))
tensor([  1.0425,   3.5672,   2.7969,   4.2925,   4.7229,   6.2134,   8.0505,
          8.1408,   9.0563,  10.0566])
- d2l.torch.normal(mean=0.0, std, *, out=None) Tensor ¶
Similar to the function above, but the means are shared among all drawn elements.
- Args:
mean (float, optional): the mean for all distributions
std (Tensor): the tensor of per-element standard deviations
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> torch.normal(mean=0.5, std=torch.arange(1., 6.))
tensor([-1.2793, -1.0732, -2.0687,  5.1177, -1.2303])
- d2l.torch.normal(mean, std=1.0, *, out=None) Tensor ¶
Similar to the function above, but the standard-deviations are shared among all drawn elements.
- Args:
mean (Tensor): the tensor of per-element means
std (float, optional): the standard deviation for all distributions
- Keyword args:
out (Tensor, optional): the output tensor
Example:
>>> torch.normal(mean=torch.arange(1., 6.))
tensor([ 1.1552,  2.6148,  2.6535,  5.8318,  4.2361])
- d2l.torch.normal(mean, std, size, *, out=None) Tensor ¶
Similar to the function above, but the means and standard deviations are shared among all drawn elements. The resulting tensor has size given by size.
- Args:
mean (float): the mean for all distributions
std (float): the standard deviation for all distributions
size (int…): a sequence of integers defining the shape of the output tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> torch.normal(2, 3, size=(1, 4))
tensor([[-1.3987, -1.9544,  3.6048,  0.7909]])
- d2l.torch.numpy(x, *args, **kwargs)¶
- d2l.torch.ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) Tensor ¶
Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument size.
- Args:
- size (int…): a sequence of integers defining the shape of the output tensor.
Can be a variable number of arguments or a collection like a list or tuple.
- Keyword arguments:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.ones(2, 3)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])
>>> torch.ones(5)
tensor([ 1.,  1.,  1.,  1.,  1.])
- d2l.torch.plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None)[source]¶
Plot data points.
- d2l.torch.predict_ch8(prefix, num_preds, net, vocab, device)[source]¶
Generate new characters following the prefix.
- d2l.torch.predict_seq2seq(net, src_sentence, src_vocab, tgt_vocab, num_steps, device, save_attention_weights=False)[source]¶
Predict for sequence to sequence.
- d2l.torch.rand(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) Tensor ¶
Returns a tensor filled with random numbers from a uniform distribution on the interval \([0, 1)\).
The shape of the tensor is defined by the variable argument size.
- Args:
- size (int…): a sequence of integers defining the shape of the output tensor.
Can be a variable number of arguments or a collection like a list or tuple.
- Keyword args:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.rand(4)
tensor([ 0.5204,  0.2503,  0.3525,  0.5673])
>>> torch.rand(2, 3)
tensor([[ 0.8237,  0.5781,  0.6879],
        [ 0.3816,  0.7249,  0.0998]])
- d2l.torch.read_snli(data_dir, is_train)[source]¶
Read the SNLI dataset into premises, hypotheses, and labels.
- d2l.torch.reduce_sum(x, *args, **kwargs)¶
- d2l.torch.reshape(x, *args, **kwargs)¶
- d2l.torch.seq_data_iter_random(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using random sampling.
- d2l.torch.seq_data_iter_sequential(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using sequential partitioning.
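A sketch on a toy corpus of token indices 0..34 (the same call works for seq_data_iter_random):
>>> from d2l import torch as d2l
>>> my_seq = list(range(35))
>>> for X, Y in d2l.seq_data_iter_sequential(my_seq, batch_size=2, num_steps=5):
...     print(X.shape, Y.shape)  # Y is X shifted one time step ahead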
- d2l.torch.set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)[source]¶
Set the axes for matplotlib.
- d2l.torch.show_heatmaps(matrices, xlabel, ylabel, titles=None, figsize=(2.5, 2.5), cmap='Reds')[source]¶
- d2l.torch.show_images(imgs, num_rows, num_cols, titles=None, scale=1.5)[source]¶
Plot a list of images.
- d2l.torch.sin(input, *, out=None) Tensor ¶
Returns a new tensor with the sine of the elements of input.
(19.7.9)¶\[\text{out}_{i} = \sin(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.5461,  0.1347, -2.7266, -0.2746])
>>> torch.sin(a)
tensor([-0.5194,  0.1343, -0.4032, -0.2711])
- d2l.torch.sinh(input, *, out=None) Tensor ¶
Returns a new tensor with the hyperbolic sine of the elements of input.
(19.7.10)¶\[\text{out}_{i} = \sinh(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.5380, -0.8632, -0.1265,  0.9399])
>>> torch.sinh(a)
tensor([ 0.5644, -0.9744, -0.1268,  1.0845])
Note
When input is on the CPU, the implementation of torch.sinh may use the Sleef library, which rounds very large results to infinity or negative infinity.
- d2l.torch.size(x, *args, **kwargs)¶
- d2l.torch.stack(tensors, dim=0, *, out=None) Tensor ¶
Concatenates a sequence of tensors along a new dimension.
All tensors need to be of the same size.
- Arguments:
tensors (sequence of Tensors): sequence of tensors to concatenate
dim (int): dimension to insert. Has to be between 0 and the number of dimensions of concatenated tensors (inclusive)
- Keyword args:
out (Tensor, optional): the output tensor.
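The entry above has no example; a small sketch of the inserted dimension:
>>> import torch
>>> a, b = torch.randn(3, 4), torch.randn(3, 4)
>>> torch.stack([a, b], dim=0).shape
torch.Size([2, 3, 4])
>>> torch.stack([a, b], dim=1).shape
torch.Size([3, 2, 4])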
- d2l.torch.tanh(input, *, out=None) Tensor ¶
Returns a new tensor with the hyperbolic tangent of the elements of input.
(19.7.11)¶\[\text{out}_{i} = \tanh(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.8986, -0.7279,  1.1745,  0.2611])
>>> torch.tanh(a)
tensor([ 0.7156, -0.6218,  0.8257,  0.2553])
- d2l.torch.tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) Tensor ¶
Constructs a tensor with data.
Warning
torch.tensor() always copies data. If you have a Tensor data and want to avoid a copy, use torch.Tensor.requires_grad_() or torch.Tensor.detach(). If you have a NumPy ndarray and want to avoid a copy, use torch.as_tensor().
Warning
When data is a tensor x, torch.tensor() reads out ‘the data’ from whatever it is passed, and constructs a leaf variable. Therefore torch.tensor(x) is equivalent to x.clone().detach() and torch.tensor(x, requires_grad=True) is equivalent to x.clone().detach().requires_grad_(True). The equivalents using clone() and detach() are recommended.
- Args:
- data (array_like): Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
- Keyword args:
- dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, infers data type from data.
- device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
- requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
- pin_memory (bool, optional): If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
Example:
>>> torch.tensor([[0.1, 1.2], [2.2, 3.1], [4.9, 5.2]])
tensor([[ 0.1000,  1.2000],
        [ 2.2000,  3.1000],
        [ 4.9000,  5.2000]])
>>> torch.tensor([0, 1])  # Type inference on data
tensor([ 0,  1])
>>> torch.tensor([[0.11111, 0.222222, 0.3333333]],
...              dtype=torch.float64,
...              device=torch.device('cuda:0'))  # creates a torch.cuda.DoubleTensor
tensor([[ 0.1111,  0.2222,  0.3333]], dtype=torch.float64, device='cuda:0')
>>> torch.tensor(3.14159)  # Create a scalar (zero-dimensional tensor)
tensor(3.1416)
>>> torch.tensor([])  # Create an empty tensor (of size (0,))
tensor([])
- d2l.torch.to(x, *args, **kwargs)¶
- d2l.torch.train_2d(trainer, steps=20)[source]¶
Optimize a 2-dim objective function with a customized trainer.
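A sketch with a hand-written gradient-descent trainer (gd_2d is illustrative and not part of the package; in the book's implementation optimization starts from (-5, -2)):
>>> from d2l import torch as d2l
>>> def gd_2d(x1, x2, s1, s2):
...     # One gradient step on f(x1, x2) = x1**2 + 2 * x2**2 with learning rate 0.1
...     return x1 - 0.1 * 2 * x1, x2 - 0.1 * 4 * x2, 0, 0
>>> results = d2l.train_2d(gd_2d)  # returns the recorded (x1, x2) trajectory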
- d2l.torch.train_ch11(trainer_fn, states, hyperparams, data_iter, feature_dim, num_epochs=2)[source]¶
- d2l.torch.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices=[device(type='cuda', index=0), device(type='cuda', index=1), device(type='cuda', index=2), device(type='cuda', index=3)])[source]¶
- d2l.torch.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)[source]¶
Train a model (defined in Chapter 3).
- d2l.torch.train_ch6(net, train_iter, test_iter, num_epochs, lr, device)[source]¶
Train a model with a GPU (defined in Chapter 6).
- d2l.torch.train_ch8(net, train_iter, vocab, lr, num_epochs, device, use_random_iter=False)[source]¶
Train a model (defined in Chapter 8).
- d2l.torch.train_epoch_ch3(net, train_iter, loss, updater)[source]¶
The training loop defined in Chapter 3.
- d2l.torch.train_epoch_ch8(net, train_iter, loss, updater, device, use_random_iter)[source]¶
Train a net within one epoch (defined in Chapter 8).
- d2l.torch.train_seq2seq(net, data_iter, lr, num_epochs, tgt_vocab, device)[source]¶
Train a model for sequence to sequence.
- d2l.torch.transpose(x, *args, **kwargs)¶
- d2l.torch.voc_rand_crop(feature, label, height, width)[source]¶
Randomly crop for both feature and label images.
- d2l.torch.zeros(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) Tensor ¶
Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument size.
- Args:
- size (int…): a sequence of integers defining the shape of the output tensor.
Can be a variable number of arguments or a collection like a list or tuple.
- Keyword args:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.zeros(2, 3)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])
>>> torch.zeros(5)
tensor([ 0.,  0.,  0.,  0.,  0.])
- class d2l.tensorflow.Animator(xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), nrows=1, ncols=1, figsize=(3.5, 2.5))[source]¶
Bases:
object
For plotting data in animation.
- class d2l.tensorflow.RNNModelScratch(vocab_size, num_hiddens, init_state, forward_fn)[source]¶
Bases:
object
An RNN Model implemented from scratch.
- class d2l.tensorflow.Residual(*args, **kwargs)[source]¶
Bases:
tensorflow.python.keras.engine.training.Model
The Residual block of ResNet.
- call(X)[source]¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
- Arguments:
inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask: A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns:
A tensor if there is a single output, or a list of tensors if there is more than one output.
- class d2l.tensorflow.SeqDataLoader(batch_size, num_steps, use_random_iter, max_tokens)[source]¶
Bases:
object
An iterator to load sequence data.
- class d2l.tensorflow.TrainCallback(net, train_iter, test_iter, num_epochs, device_name)[source]¶
Bases:
tensorflow.python.keras.callbacks.Callback
A callback to visualize the training progress.
- on_epoch_begin(epoch, logs=None)[source]¶
Called at the start of an epoch.
Subclasses should override for any actions to run. This function should only be called during TRAIN mode.
- Arguments:
epoch: Integer, index of epoch.
logs: Dict. Currently no data is passed to this argument for this method but that may change in the future.
- on_epoch_end(epoch, logs)[source]¶
Called at the end of an epoch.
Subclasses should override for any actions to run. This function should only be called during TRAIN mode.
- Arguments:
epoch: Integer, index of epoch.
logs: Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_.
- class d2l.tensorflow.Updater(params, lr)[source]¶
Bases:
object
For updating parameters using minibatch stochastic gradient descent.
- class d2l.tensorflow.Vocab(tokens=None, min_freq=0, reserved_tokens=None)[source]¶
Bases:
object
Vocabulary for text.
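Usage sketch (index 0 is reserved for the unknown token in the book's implementation; exact indices depend on token frequencies):
>>> from d2l import tensorflow as d2l
>>> tokens = [['the', 'time', 'machine'], ['the', 'time']]
>>> vocab = d2l.Vocab(tokens)
>>> vocab['the'], vocab.to_tokens(vocab['the'])
(1, 'the')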
- d2l.tensorflow.abs(x, name=None)[source]¶
Computes the absolute value of a tensor.
Given a tensor of integer or floating-point values, this operation returns a tensor of the same type, where each element contains the absolute value of the corresponding element in the input.
Given a tensor x of complex numbers, this operation returns a tensor of type float32 or float64 that is the absolute value of each element in x. For a complex number \(a + bj\), its absolute value is computed as \(\sqrt{a^2 + b^2}\). For example:
>>> x = tf.constant([[-2.25 + 4.75j], [-3.25 + 5.75j]])
>>> tf.abs(x)
<tf.Tensor: shape=(2, 1), dtype=float64, numpy=
array([[5.25594901],
       [6.60492241]])>
- Args:
- x: A Tensor or SparseTensor of type float16, float32, float64,
int32, int64, complex64 or complex128.
name: A name for the operation (optional).
- Returns:
- A Tensor or SparseTensor of the same size, type and sparsity as x,
with absolute values. Note, for complex64 or complex128 input, the returned Tensor will be of type float32 or float64, respectively.
If x is a SparseTensor, returns SparseTensor(x.indices, tf.math.abs(x.values, …), x.dense_shape)
- d2l.tensorflow.arange(start, limit=None, delta=1, dtype=None, name='range')¶
Creates a sequence of numbers.
Creates a sequence of numbers that begins at start and extends by increments of delta up to but not including limit.
The dtype of the resulting tensor is inferred from the inputs unless it is provided explicitly.
Like the Python builtin range, start defaults to 0, so that range(n) = range(0, n).
For example:
>>> start = 3
>>> limit = 18
>>> delta = 3
>>> tf.range(start, limit, delta)
<tf.Tensor: shape=(5,), dtype=int32, numpy=array([ 3,  6,  9, 12, 15], dtype=int32)>

>>> start = 3
>>> limit = 1
>>> delta = -0.5
>>> tf.range(start, limit, delta)
<tf.Tensor: shape=(4,), dtype=float32, numpy=array([3. , 2.5, 2. , 1.5], dtype=float32)>

>>> limit = 5
>>> tf.range(limit)
<tf.Tensor: shape=(5,), dtype=int32, numpy=array([0, 1, 2, 3, 4], dtype=int32)>
- Args:
- start: A 0-D Tensor (scalar). Acts as first entry in the range if limit
is not None; otherwise, acts as range limit and first entry defaults to 0.
- limit: A 0-D Tensor (scalar). Upper limit of sequence, exclusive. If None,
defaults to the value of start while the first entry of the range defaults to 0.
- delta: A 0-D Tensor (scalar). Number that increments start. Defaults to 1.
dtype: The type of the elements of the resulting tensor.
name: A name for the operation. Defaults to “range”.
- Returns:
A 1-D Tensor of type dtype.
NumPy compatibility: equivalent to np.arange.
- d2l.tensorflow.argmax(input, axis=None, output_type=tf.int64, name=None)[source]¶
Returns the index with the largest value across axes of a tensor.
In case of ties, it returns the smallest index.
For example:
>>> A = tf.constant([2, 20, 30, 3, 6])
>>> tf.math.argmax(A)  # A[2] is maximum in tensor A
<tf.Tensor: shape=(), dtype=int64, numpy=2>
>>> B = tf.constant([[2, 20, 30, 3, 6], [3, 11, 16, 1, 8],
...                  [14, 45, 23, 5, 27]])
>>> tf.math.argmax(B, 0)
<tf.Tensor: shape=(5,), dtype=int64, numpy=array([2, 2, 0, 2, 2])>
>>> tf.math.argmax(B, 1)
<tf.Tensor: shape=(3,), dtype=int64, numpy=array([2, 2, 1])>
>>> C = tf.constant([0, 0, 0, 0])
>>> tf.math.argmax(C)  # Returns smallest index in case of ties
<tf.Tensor: shape=(), dtype=int64, numpy=0>
- Args:
input: A Tensor.
axis: An integer, the axis to reduce across. Defaults to 0.
output_type: An optional output dtype (tf.int32 or tf.int64). Defaults to tf.int64.
name: An optional name for the operation.
- Returns:
A Tensor of type output_type.
- d2l.tensorflow.astype(x, dtype, name=None)¶
Casts a tensor to a new type.
The operation casts x (in case of Tensor) or x.values (in case of SparseTensor or IndexedSlices) to dtype.
For example:
>>> x = tf.constant([1.8, 2.2], dtype=tf.float32)
>>> tf.dtypes.cast(x, tf.int32)
<tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2], dtype=int32)>
The operation supports data types (for x and dtype) of uint8, uint16, uint32, uint64, int8, int16, int32, int64, float16, float32, float64, complex64, complex128, bfloat16. In case of casting from complex types (complex64, complex128) to real types, only the real part of x is returned. In case of casting from real types to complex types (complex64, complex128), the imaginary part of the returned value is set to 0. The handling of complex types here matches the behavior of numpy.
- Args:
- x: A Tensor or SparseTensor or IndexedSlices of numeric type. It could
be uint8, uint16, uint32, uint64, int8, int16, int32, int64, float16, float32, float64, complex64, complex128, bfloat16.
- dtype: The destination type. The list of supported dtypes is the same as
x.
name: A name for the operation (optional).
- Returns:
- A Tensor or SparseTensor or IndexedSlices with same shape as x and
same type as dtype.
- Raises:
TypeError: If x cannot be cast to the dtype.
- d2l.tensorflow.box_center_to_corner(boxes)[source]¶
Convert from (center, width, height) to (upper_left, bottom_right)
- d2l.tensorflow.box_corner_to_center(boxes)[source]¶
Convert from (upper_left, bottom_right) to (center, width, height)
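A round-trip sketch (the coordinates are the dog bounding box used in the object-detection chapter, given as (upper-left x, upper-left y, lower-right x, lower-right y)):
>>> import tensorflow as tf
>>> from d2l import tensorflow as d2l
>>> boxes = tf.constant([[60.0, 45.0, 378.0, 516.0]])
>>> centers = d2l.box_corner_to_center(boxes)  # (center x, center y, width, height)
>>> d2l.box_center_to_corner(centers)  # recovers the original corner coordinates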
- d2l.tensorflow.build_array_nmt(lines, vocab, num_steps)[source]¶
Transform text sequences of machine translation into minibatches.
- d2l.tensorflow.concat(values, axis, name='concat')[source]¶
Concatenates tensors along one dimension.
See also tf.tile, tf.stack, tf.repeat.
Concatenates the list of tensors values along dimension axis. If values[i].shape = [D0, D1, … Daxis(i), …Dn], the concatenated result has shape
[D0, D1, … Raxis, …Dn]
where
Raxis = sum(Daxis(i))
That is, the data from the input tensors is joined along the axis dimension.
The number of dimensions of the input tensors must match, and all dimensions except axis must be equal.
For example:
>>> t1 = [[1, 2, 3], [4, 5, 6]]
>>> t2 = [[7, 8, 9], [10, 11, 12]]
>>> tf.concat([t1, t2], 0)
<tf.Tensor: shape=(4, 3), dtype=int32, numpy=
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]], dtype=int32)>
>>> tf.concat([t1, t2], 1)
<tf.Tensor: shape=(2, 6), dtype=int32, numpy=
array([[ 1,  2,  3,  7,  8,  9],
       [ 4,  5,  6, 10, 11, 12]], dtype=int32)>
As in Python, the axis could also be a negative number. A negative axis is interpreted as counting from the end of the rank, i.e., the axis + rank(values)-th dimension.
For example:
>>> t1 = [[[1, 2], [2, 3]], [[4, 4], [5, 3]]]
>>> t2 = [[[7, 4], [8, 4]], [[2, 10], [15, 11]]]
>>> tf.concat([t1, t2], -1)
<tf.Tensor: shape=(2, 2, 4), dtype=int32, numpy=
array([[[ 1,  2,  7,  4],
        [ 2,  3,  8,  4]],
       [[ 4,  4,  2, 10],
        [ 5,  3, 15, 11]]], dtype=int32)>
Note: If you are concatenating along a new axis consider using stack. E.g.
tf.concat([tf.expand_dims(t, axis) for t in tensors], axis)
can be rewritten as
tf.stack(tensors, axis=axis)
- Args:
values: A list of Tensor objects or a single Tensor.
axis: 0-D int32 Tensor. Dimension along which to concatenate. Must be in the range [-rank(values), rank(values)). As in Python, indexing for axis is 0-based. A positive axis in the range [0, rank(values)) refers to the axis-th dimension, and a negative axis refers to the axis + rank(values)-th dimension.
name: A name for the operation (optional).
- Returns:
A Tensor resulting from concatenation of the input tensors.
- d2l.tensorflow.cos(x, name=None)[source]¶
Computes cos of x element-wise.
Given an input tensor, this function computes cosine of every element in the tensor. Input range is (-inf, inf) and output range is [-1,1]. If input lies outside the boundary, nan is returned.
x = tf.constant([-float("inf"), -9, -0.5, 1, 1.2, 200, 10000, float("inf")])
tf.math.cos(x)  # ==> [nan -0.91113025 0.87758255 0.5403023 0.36235774 0.48718765 -0.95215535 nan]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
- d2l.tensorflow.cosh(x, name=None)[source]¶
Computes hyperbolic cosine of x element-wise.
Given an input tensor, this function computes hyperbolic cosine of every element in the tensor. Input range is [-inf, inf] and output range is [1, inf].
x = tf.constant([-float("inf"), -9, -0.5, 1, 1.2, 2, 10, float("inf")])
tf.math.cosh(x)  # ==> [inf 4.0515420e+03 1.1276259e+00 1.5430807e+00 1.8106556e+00 3.7621956e+00 1.1013233e+04 inf]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
- class d2l.tensorflow.defaultdict¶
Bases:
dict
defaultdict(default_factory[, …]) -> dict with default factory
The default factory is called without arguments to produce a new value when a key is not present, in __getitem__ only. A defaultdict compares equal to a dict with the same items. All remaining arguments are treated the same as if they were passed to the dict constructor, including keyword arguments.
- copy() -> a shallow copy of D.¶
- default_factory¶
Factory for default value called by __missing__().
- d2l.tensorflow.download(name, cache_dir='../data')[source]¶
Download a file inserted into DATA_HUB, return the local filename.
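Usage sketch (the key must first be registered in d2l.DATA_HUB with a URL and SHA-1 hash, as the book's chapters do; the hash below is the one the book registers for timemachine.txt):
>>> from d2l import tensorflow as d2l
>>> d2l.DATA_HUB['time_machine'] = (d2l.DATA_URL + 'timemachine.txt',
...                                 '090b5e7e70c295757f55df93cb0a180b9691891a')
>>> fname = d2l.download('time_machine')  # cached under ../data after the first call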
- d2l.tensorflow.evaluate_accuracy(net, data_iter)[source]¶
Compute the accuracy for a model on a dataset.
- d2l.tensorflow.evaluate_loss(net, data_iter, loss)[source]¶
Evaluate the loss of a model on the given dataset.
- d2l.tensorflow.exp(x, name=None)[source]¶
Computes exponential of x element-wise. \(y = e^x\).
This function computes the exponential of the input tensor element-wise. i.e. math.exp(x) or \(e^x\), where x is the input tensor. \(e\) denotes Euler’s number and is approximately equal to 2.718281. Output is positive for any real input.
>>> x = tf.constant(2.0)
>>> tf.math.exp(x)
<tf.Tensor: shape=(), dtype=float32, numpy=7.389056>
>>> x = tf.constant([2.0, 8.0])
>>> tf.math.exp(x)
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([   7.389056, 2980.958   ], dtype=float32)>
For complex numbers, the exponential value is calculated as \(e^{x+iy} = e^x e^{iy} = e^x (\cos(y) + i\sin(y))\)
For 1+1j the value would be computed as: \(e^1 (\cos(1) + i\sin(1)) = 2.7182817 \times (0.5403023+0.84147096j)\)
>>> x = tf.constant(1 + 1j)
>>> tf.math.exp(x)
<tf.Tensor: shape=(), dtype=complex128, numpy=(1.4686939399158851+2.2873552871788423j)>
- Args:
- x: A tf.Tensor. Must be one of the following types: bfloat16, half,
float32, float64, complex64, complex128.
name: A name for the operation (optional).
- Returns:
A tf.Tensor. Has the same type as x.
NumPy compatibility: equivalent to np.exp.
- d2l.tensorflow.eye(num_rows, num_columns=None, batch_shape=None, dtype=tf.float32, name=None)[source]¶
Construct an identity matrix, or a batch of matrices.
See also tf.ones, tf.zeros, tf.fill, tf.one_hot.
# Construct one identity matrix.
tf.eye(2) ==> [[1., 0.],
               [0., 1.]]

# Construct a batch of 3 identity matrices, each 2 x 2.
# batch_identity[i, :, :] is a 2 x 2 identity matrix, i = 0, 1, 2.
batch_identity = tf.eye(2, batch_shape=[3])

# Construct one 2 x 3 "identity" matrix
tf.eye(2, num_columns=3) ==> [[ 1., 0., 0.],
                              [ 0., 1., 0.]]
- Args:
- num_rows: Non-negative int32 scalar Tensor giving the number of rows
in each batch matrix.
- num_columns: Optional non-negative int32 scalar Tensor giving the number
of columns in each batch matrix. Defaults to num_rows.
- batch_shape: A list or tuple of Python integers or a 1-D int32 Tensor.
If provided, the returned Tensor will have leading batch dimensions of this shape.
dtype: The type of an element in the resulting Tensor name: A name for this Op. Defaults to “eye”.
- Returns:
A Tensor of shape batch_shape + [num_rows, num_columns]
- d2l.tensorflow.get_fashion_mnist_labels(labels)[source]¶
Return text labels for the Fashion-MNIST dataset.
- d2l.tensorflow.linspace(start, stop, num, name=None, axis=0)¶
Generates evenly-spaced values in an interval along a given axis.
A sequence of num evenly-spaced values is generated beginning at start along a given axis. If num > 1, the values in the sequence increase by (stop - start) / (num - 1), so that the last one is exactly stop. If num <= 0, ValueError is raised.
Matches [np.linspace](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html)’s behaviour except when num == 0.
For example:
tf.linspace(10.0, 12.0, 3, name="linspace") => [ 10.0  11.0  12.0]
Start and stop can be tensors of arbitrary size:
>>> tf.linspace([0., 5.], [10., 40.], 5, axis=0)
<tf.Tensor: shape=(5, 2), dtype=float32, numpy=
array([[ 0.  ,  5.  ],
       [ 2.5 , 13.75],
       [ 5.  , 22.5 ],
       [ 7.5 , 31.25],
       [10.  , 40.  ]], dtype=float32)>
Axis is where the values will be generated (the dimension in the returned tensor which corresponds to the axis will be equal to num)
>>> tf.linspace([0., 5.], [10., 40.], 5, axis=-1)
<tf.Tensor: shape=(2, 5), dtype=float32, numpy=
array([[ 0.  ,  2.5 ,  5.  ,  7.5 , 10.  ],
       [ 5.  , 13.75, 22.5 , 31.25, 40.  ]], dtype=float32)>
- Args:
- start: A Tensor. Must be one of the following types: bfloat16,
float32, float64. N-D tensor. First entry in the range.
- stop: A Tensor. Must have the same type and shape as start. N-D tensor.
Last entry in the range.
- num: A Tensor. Must be one of the following types: int32, int64. 0-D
tensor. Number of values to generate.
name: A name for the operation (optional).
axis: Axis along which the operation is performed (used only when N-D tensors are provided).
- Returns:
A Tensor. Has the same type as start.
- d2l.tensorflow.load_array(data_arrays, batch_size, is_train=True)[source]¶
Construct a TensorFlow data iterator.
- d2l.tensorflow.load_corpus_time_machine(max_tokens=- 1)[source]¶
Return token indices and the vocabulary of the time machine dataset.
- d2l.tensorflow.load_data_fashion_mnist(batch_size, resize=None)[source]¶
Download the Fashion-MNIST dataset and then load it into memory.
- d2l.tensorflow.load_data_nmt(batch_size, num_steps, num_examples=600)[source]¶
Return the iterator and the vocabularies of the translation dataset.
- d2l.tensorflow.load_data_time_machine(batch_size, num_steps, use_random_iter=False, max_tokens=10000)[source]¶
Return the iterator and the vocabulary of the time machine dataset.
- d2l.tensorflow.matmul(a, b, transpose_a=False, transpose_b=False, adjoint_a=False, adjoint_b=False, a_is_sparse=False, b_is_sparse=False, name=None)[source]¶
Multiplies matrix a by matrix b, producing a * b.
The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication dimensions, and any further outer dimensions specify matching batch size.
Both matrices must be of the same type. The supported types are: float16, float32, float64, int32, complex64, complex128.
Either matrix can be transposed or adjointed (conjugated and transposed) on the fly by setting the corresponding flag to True. These are False by default.
If one or both of the matrices contain a lot of zeros, a more efficient multiplication algorithm can be used by setting the corresponding a_is_sparse or b_is_sparse flag to True. These are False by default. This optimization is only available for plain matrices (rank-2 tensors) with datatypes bfloat16 or float32.
A simple 2-D tensor matrix multiplication:
>>> a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])
>>> a  # 2-D tensor
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>
>>> b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2])
>>> b  # 2-D tensor
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 7,  8],
       [ 9, 10],
       [11, 12]], dtype=int32)>
>>> c = tf.matmul(a, b)
>>> c  # `a` * `b`
<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 58,  64],
       [139, 154]], dtype=int32)>
A batch matrix multiplication with batch shape [2]:
>>> a = tf.constant(np.arange(1, 13, dtype=np.int32), shape=[2, 2, 3])
>>> a  # 3-D tensor
<tf.Tensor: shape=(2, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]], dtype=int32)>
>>> b = tf.constant(np.arange(13, 25, dtype=np.int32), shape=[2, 3, 2])
>>> b  # 3-D tensor
<tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
array([[[13, 14],
        [15, 16],
        [17, 18]],
       [[19, 20],
        [21, 22],
        [23, 24]]], dtype=int32)>
>>> c = tf.matmul(a, b)
>>> c  # `a` * `b`
<tf.Tensor: shape=(2, 2, 2), dtype=int32, numpy=
array([[[ 94, 100],
        [229, 244]],
       [[508, 532],
        [697, 730]]], dtype=int32)>
Since Python >= 3.5, the @ operator is supported (see [PEP 465](https://www.python.org/dev/peps/pep-0465/)). In TensorFlow, it simply calls the tf.matmul() function, so the following lines are equivalent:
>>> d = a @ b @ [[10], [11]]
>>> d = tf.matmul(tf.matmul(a, b), [[10], [11]])
- Args:
- a: tf.Tensor of type float16, float32, float64, int32,
complex64, complex128 and rank > 1.
b: tf.Tensor with same type and rank as a.
transpose_a: If True, a is transposed before multiplication.
transpose_b: If True, b is transposed before multiplication.
adjoint_a: If True, a is conjugated and transposed before multiplication.
- adjoint_b: If True, b is conjugated and transposed before
multiplication.
- a_is_sparse: If True, a is treated as a sparse matrix. Notice, this
does not support `tf.sparse.SparseTensor`, it just makes optimizations that assume most values in a are zero. See tf.sparse.sparse_dense_matmul for some support for tf.sparse.SparseTensor multiplication.
- b_is_sparse: If True, b is treated as a sparse matrix. Notice, this
does not support `tf.sparse.SparseTensor`, it just makes optimizations that assume most values in b are zero. See tf.sparse.sparse_dense_matmul for some support for tf.sparse.SparseTensor multiplication.
name: Name for the operation (optional).
- Returns:
A tf.Tensor of the same type as a and b where each inner-most matrix is the product of the corresponding matrices in a and b, e.g. if all transpose or adjoint attributes are False:
output[…, i, j] = sum_k (a[…, i, k] * b[…, k, j]), for all indices i, j.
Note: This is matrix product, not element-wise product.
- Raises:
- ValueError: If transpose_a and adjoint_a, or transpose_b and
adjoint_b are both set to True.
- d2l.tensorflow.meshgrid(*args, **kwargs)[source]¶
Broadcasts parameters for evaluation on an N-D grid.
Given N one-dimensional coordinate arrays *args, returns a list outputs of N-D coordinate arrays for evaluating expressions on an N-D grid.
Notes:
meshgrid supports cartesian (‘xy’) and matrix (‘ij’) indexing conventions. When the indexing argument is set to ‘xy’ (the default), the broadcasting instructions for the first two dimensions are swapped.
Examples:
Calling X, Y = meshgrid(x, y) with the tensors
x = [1, 2, 3]
y = [4, 5, 6]
X, Y = tf.meshgrid(x, y)
# X = [[1, 2, 3],
#      [1, 2, 3],
#      [1, 2, 3]]
# Y = [[4, 4, 4],
#      [5, 5, 5],
#      [6, 6, 6]]
- Args:
*args: `Tensor`s with rank 1.
**kwargs:
indexing: Either ‘xy’ or ‘ij’ (optional, default: ‘xy’).
name: A name for the operation (optional).
- Returns:
outputs: A list of N `Tensor`s with rank N.
- Raises:
TypeError: When no keyword arguments (kwargs) are passed. ValueError: When indexing keyword argument is not one of xy or ij.
- d2l.tensorflow.normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)¶
Outputs random values from a normal distribution.
Example that generates a new set of random values every time:
>>> tf.random.set_seed(5);
>>> tf.random.normal([4], 0, 1, tf.float32)
<tf.Tensor: shape=(4,), dtype=float32, numpy=..., dtype=float32)>
Example that outputs a reproducible result:
>>> tf.random.set_seed(5);
>>> tf.random.normal([2,2], 0, 1, tf.float32, seed=1)
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-1.3768897 , -0.01258316],
       [-0.169515  ,  1.0824056 ]], dtype=float32)>
In this case, we are setting both the global and operation-level seed to ensure this result is reproducible. See tf.random.set_seed for more information.
- Args:
shape: A 1-D integer Tensor or Python array. The shape of the output tensor.
mean: A Tensor or Python value of type dtype, broadcastable with stddev. The mean of the normal distribution.
- stddev: A Tensor or Python value of type dtype, broadcastable with mean.
The standard deviation of the normal distribution.
dtype: The type of the output.
seed: A Python integer. Used to create a random seed for the distribution. See tf.random.set_seed for behavior.
name: A name for the operation (optional).
- Returns:
A tensor of the specified shape filled with random normal values.
- d2l.tensorflow.numpy(x, *args, **kwargs)¶
- d2l.tensorflow.ones(shape, dtype=tf.float32, name=None)[source]¶
Creates a tensor with all elements set to one (1).
See also tf.ones_like, tf.zeros, tf.fill, tf.eye.
This operation returns a tensor of type dtype with shape shape and all elements set to one.
>>> tf.ones([3, 4], tf.int32)
<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]], dtype=int32)>
- Args:
- shape: A list of integers, a tuple of integers, or
a 1-D Tensor of type int32.
- dtype: Optional DType of an element in the resulting Tensor. Default is
tf.float32.
name: Optional string. A name for the operation.
- Returns:
A Tensor with all elements set to one (1).
- d2l.tensorflow.plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None)[source]¶
Plot data points.
- d2l.tensorflow.predict_ch8(prefix, num_preds, net, vocab, params)[source]¶
Generate new characters following the prefix.
- d2l.tensorflow.rand(shape, minval=0, maxval=None, dtype=tf.float32, seed=None, name=None)¶
Outputs random values from a uniform distribution.
The generated values follow a uniform distribution in the range [minval, maxval). The lower bound minval is included in the range, while the upper bound maxval is excluded.
For floats, the default range is [0, 1). For ints, at least maxval must be specified explicitly.
In the integer case, the random integers are slightly biased unless maxval - minval is an exact power of two. The bias is small for values of maxval - minval significantly smaller than the range of the output (either 2**32 or 2**64).
Examples:
>>> tf.random.uniform(shape=[2])
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([..., ...], dtype=float32)>
>>> tf.random.uniform(shape=[], minval=-1., maxval=0.)
<tf.Tensor: shape=(), dtype=float32, numpy=-...>
>>> tf.random.uniform(shape=[], minval=5, maxval=10, dtype=tf.int64)
<tf.Tensor: shape=(), dtype=int64, numpy=...>
The seed argument produces a deterministic sequence of tensors across multiple calls. To repeat that sequence, use tf.random.set_seed:
>>> tf.random.set_seed(5)
>>> tf.random.uniform(shape=[], maxval=3, dtype=tf.int32, seed=10)
<tf.Tensor: shape=(), dtype=int32, numpy=2>
>>> tf.random.uniform(shape=[], maxval=3, dtype=tf.int32, seed=10)
<tf.Tensor: shape=(), dtype=int32, numpy=0>
>>> tf.random.set_seed(5)
>>> tf.random.uniform(shape=[], maxval=3, dtype=tf.int32, seed=10)
<tf.Tensor: shape=(), dtype=int32, numpy=2>
>>> tf.random.uniform(shape=[], maxval=3, dtype=tf.int32, seed=10)
<tf.Tensor: shape=(), dtype=int32, numpy=0>
If tf.random.set_seed is not called but a seed argument is specified, small changes to function graphs or previously executed operations will change the returned value. See tf.random.set_seed for details.
- Args:
shape: A 1-D integer Tensor or Python array. The shape of the output tensor.
minval: A Tensor or Python value of type dtype, broadcastable with shape (for integer types, broadcasting is not supported, so it needs to be a scalar). The lower bound on the range of random values to generate (inclusive). Defaults to 0.
- maxval: A Tensor or Python value of type dtype, broadcastable with
shape (for integer types, broadcasting is not supported, so it needs to be a scalar). The upper bound on the range of random values to generate (exclusive). Defaults to 1 if dtype is floating point.
- dtype: The type of the output: float16, float32, float64, int32,
or int64.
- seed: A Python integer. Used in combination with tf.random.set_seed to
create a reproducible sequence of tensors across multiple calls.
name: A name for the operation (optional).
- Returns:
A tensor of the specified shape filled with random uniform values.
- Raises:
ValueError: If dtype is integral and maxval is not specified.
- d2l.tensorflow.read_time_machine()[source]¶
Load the time machine dataset into a list of text lines.
- d2l.tensorflow.reduce_sum(input_tensor, axis=None, keepdims=False, name=None)[source]¶
Computes the sum of elements across dimensions of a tensor.
Reduces input_tensor along the dimensions given in axis. Unless keepdims is true, the rank of the tensor is reduced by 1 for each entry in axis. If keepdims is true, the reduced dimensions are retained with length 1.
If axis is None, all dimensions are reduced, and a tensor with a single element is returned.
For example:
>>> # x has a shape of (2, 3) (two rows and three columns):
>>> x = tf.constant([[1, 1, 1], [1, 1, 1]])
>>> x.numpy()
array([[1, 1, 1],
       [1, 1, 1]], dtype=int32)
>>> # sum all the elements
>>> # 1 + 1 + 1 + 1 + 1 + 1 = 6
>>> tf.reduce_sum(x).numpy()
6
>>> # reduce along the first dimension
>>> # the result is [1, 1, 1] + [1, 1, 1] = [2, 2, 2]
>>> tf.reduce_sum(x, 0).numpy()
array([2, 2, 2], dtype=int32)
>>> # reduce along the second dimension
>>> # the result is [1, 1] + [1, 1] + [1, 1] = [3, 3]
>>> tf.reduce_sum(x, 1).numpy()
array([3, 3], dtype=int32)
>>> # keep the original dimensions
>>> tf.reduce_sum(x, 1, keepdims=True).numpy()
array([[3],
       [3]], dtype=int32)
>>> # reduce along both dimensions
>>> # the result is 1 + 1 + 1 + 1 + 1 + 1 = 6
>>> # or, equivalently, reduce along rows, then reduce the resultant array
>>> # [1, 1, 1] + [1, 1, 1] = [2, 2, 2]
>>> # 2 + 2 + 2 = 6
>>> tf.reduce_sum(x, [0, 1]).numpy()
6
- Args:
input_tensor: The tensor to reduce. Should have numeric type.
axis: The dimensions to reduce. If None (the default), reduces all dimensions. Must be in the range [-rank(input_tensor), rank(input_tensor)].
keepdims: If true, retains reduced dimensions with length 1.
name: A name for the operation (optional).
- Returns:
The reduced tensor, of the same dtype as the input_tensor.
NumPy compatibility: equivalent to np.sum, apart from the fact that numpy upcasts uint8 and int32 to int64 while tensorflow returns the same dtype as the input.
- d2l.tensorflow.reshape(tensor, shape, name=None)[source]¶
Reshapes a tensor.
Given tensor, this operation returns a new tf.Tensor that has the same values as tensor in the same order, except with a new shape given by shape.
>>> t1 = [[1, 2, 3],
...       [4, 5, 6]]
>>> print(tf.shape(t1).numpy())
[2 3]
>>> t2 = tf.reshape(t1, [6])
>>> t2
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6], dtype=int32)>
>>> tf.reshape(t2, [3, 2])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)>
The tf.reshape does not change the order of or the total number of elements in the tensor, and so it can reuse the underlying data buffer. This makes it a fast operation independent of how big of a tensor it is operating on.
>>> tf.reshape([1, 2, 3], [2, 2])
Traceback (most recent call last):
...
InvalidArgumentError: Input to reshape is a tensor with 3 values, but the requested shape has 4
To instead reorder the data to rearrange the dimensions of a tensor, see tf.transpose.
>>> t = [[1, 2, 3],
...      [4, 5, 6]]
>>> tf.reshape(t, [3, 2]).numpy()
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)
>>> tf.transpose(t, perm=[1, 0]).numpy()
array([[1, 4],
       [2, 5],
       [3, 6]], dtype=int32)
If one component of shape is the special value -1, the size of that dimension is computed so that the total size remains constant. In particular, a shape of [-1] flattens into 1-D. At most one component of shape can be -1.
>>> t = [[1, 2, 3],
...      [4, 5, 6]]
>>> tf.reshape(t, [-1])
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6], dtype=int32)>
>>> tf.reshape(t, [3, -1])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)>
>>> tf.reshape(t, [-1, 2])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)>
tf.reshape(t, []) reshapes a tensor t with one element to a scalar.
>>> tf.reshape([7], []).numpy()
7
More examples:
>>> t = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(tf.shape(t).numpy())
[9]
>>> tf.reshape(t, [3, 3])
<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]], dtype=int32)>

>>> t = [[[1, 1], [2, 2]],
...      [[3, 3], [4, 4]]]
>>> print(tf.shape(t).numpy())
[2 2 2]
>>> tf.reshape(t, [2, 4])
<tf.Tensor: shape=(2, 4), dtype=int32, numpy=
array([[1, 1, 2, 2],
       [3, 3, 4, 4]], dtype=int32)>

>>> t = [[[1, 1, 1],
...       [2, 2, 2]],
...      [[3, 3, 3],
...       [4, 4, 4]],
...      [[5, 5, 5],
...       [6, 6, 6]]]
>>> print(tf.shape(t).numpy())
[3 2 3]
>>> # Pass '[-1]' to flatten 't'.
>>> tf.reshape(t, [-1])
<tf.Tensor: shape=(18,), dtype=int32, numpy=array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6], dtype=int32)>
>>> # -- Using -1 to infer the shape --
>>> # Here -1 is inferred to be 9:
>>> tf.reshape(t, [2, -1])
<tf.Tensor: shape=(2, 9), dtype=int32, numpy=
array([[1, 1, 1, 2, 2, 2, 3, 3, 3],
       [4, 4, 4, 5, 5, 5, 6, 6, 6]], dtype=int32)>
>>> # -1 is inferred to be 2:
>>> tf.reshape(t, [-1, 9])
<tf.Tensor: shape=(2, 9), dtype=int32, numpy=
array([[1, 1, 1, 2, 2, 2, 3, 3, 3],
       [4, 4, 4, 5, 5, 5, 6, 6, 6]], dtype=int32)>
>>> # -1 is inferred to be 3:
>>> tf.reshape(t, [ 2, -1, 3])
<tf.Tensor: shape=(2, 3, 3), dtype=int32, numpy=
array([[[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]],
       [[4, 4, 4],
        [5, 5, 5],
        [6, 6, 6]]], dtype=int32)>
- Args:
tensor: A Tensor.
shape: A Tensor. Must be one of the following types: int32, int64. Defines the shape of the output tensor.
name: Optional string. A name for the operation.
- Returns:
A Tensor. Has the same type as tensor.
- d2l.tensorflow.seq_data_iter_random(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using random sampling.
- d2l.tensorflow.seq_data_iter_sequential(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using sequential partitioning.
- d2l.tensorflow.set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)[source]¶
Set the axes for matplotlib.
- d2l.tensorflow.show_heatmaps(matrices, xlabel, ylabel, titles=None, figsize=(2.5, 2.5), cmap='Reds')[source]¶
- d2l.tensorflow.show_images(imgs, num_rows, num_cols, titles=None, scale=1.5)[source]¶
Plot a list of images.
- d2l.tensorflow.show_trace_2d(f, results)[source]¶
Show the trace of 2D variables during optimization.
- d2l.tensorflow.sin(x, name=None)[source]¶
Computes sine of x element-wise.
Given an input tensor, this function computes sine of every element in the tensor. Input range is (-inf, inf) and output range is [-1,1].
x = tf.constant([-float("inf"), -9, -0.5, 1, 1.2, 200, 10, float("inf")])
tf.math.sin(x)  # ==> [nan -0.4121185 -0.47942555 0.84147096 0.9320391 -0.87329733 -0.54402107 nan]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
- d2l.tensorflow.sinh(x, name=None)[source]¶
Computes hyperbolic sine of x element-wise.
Given an input tensor, this function computes hyperbolic sine of every element in the tensor. Input range is [-inf,inf] and output range is [-inf,inf].
x = tf.constant([-float("inf"), -9, -0.5, 1, 1.2, 2, 10, float("inf")])
tf.math.sinh(x)  # ==> [-inf -4.0515420e+03 -5.2109528e-01 1.1752012e+00 1.5094614e+00 3.6268604e+00 1.1013232e+04 inf]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
- d2l.tensorflow.size(a)¶
- d2l.tensorflow.stack(values, axis=0, name='stack')[source]¶
Stacks a list of rank-R tensors into one rank-(R+1) tensor.
See also tf.concat, tf.tile, tf.repeat.
Packs the list of tensors in values into a tensor with rank one higher than each tensor in values, by packing them along the axis dimension. Given a list of length N of tensors of shape (A, B, C);
if axis == 0 then the output tensor will have the shape (N, A, B, C). if axis == 1 then the output tensor will have the shape (A, N, B, C). Etc.
For example:
>>> x = tf.constant([1, 4])
>>> y = tf.constant([2, 5])
>>> z = tf.constant([3, 6])
>>> tf.stack([x, y, z])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 4],
       [2, 5],
       [3, 6]], dtype=int32)>
>>> tf.stack([x, y, z], axis=1)
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>
This is the opposite of unstack. The numpy equivalent is np.stack
>>> np.array_equal(np.stack([x, y, z]), tf.stack([x, y, z]))
True
- Args:
values: A list of Tensor objects with the same shape and type.
axis: An int. The axis to stack along. Defaults to the first dimension. Negative values wrap around, so the valid range is [-(R+1), R+1).
name: A name for this operation (optional).
- Returns:
output: A stacked Tensor with the same type as values.
- Raises:
ValueError: If axis is out of the range [-(R+1), R+1).
- d2l.tensorflow.tanh(x, name=None)[source]¶
Computes hyperbolic tangent of x element-wise.
Given an input tensor, this function computes hyperbolic tangent of every element in the tensor. Input range is [-inf, inf] and output range is [-1,1].
x = tf.constant([-float("inf"), -5, -0.5, 1, 1.2, 2, 3, float("inf")])
tf.math.tanh(x)  # ==> [-1. -0.99990916 -0.46211717 0.7615942 0.8336547 0.9640276 0.9950547 1.]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
If x is a SparseTensor, returns SparseTensor(x.indices, tf.math.tanh(x.values, …), x.dense_shape)
- d2l.tensorflow.tensor(value, dtype=None, shape=None, name='Const')¶
Creates a constant tensor from a tensor-like object.
Note: All eager tf.Tensor values are immutable (in contrast to tf.Variable). There is nothing especially _constant_ about the value returned from tf.constant. This function is not fundamentally different from tf.convert_to_tensor. The name tf.constant comes from the symbolic APIs (like tf.data or keras functional models) where the value is embedded in a Const node in the tf.Graph. tf.constant is useful for asserting that the value can be embedded that way.
If the argument dtype is not specified, then the type is inferred from the type of value.
>>> # Constant 1-D Tensor from a python list.
>>> tf.constant([1, 2, 3, 4, 5, 6])
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6], dtype=int32)>
>>> # Or a numpy array
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> tf.constant(a)
<tf.Tensor: shape=(2, 3), dtype=int64, numpy=
array([[1, 2, 3],
       [4, 5, 6]])>
If dtype is specified the resulting tensor values are cast to the requested dtype.
>>> tf.constant([1, 2, 3, 4, 5, 6], dtype=tf.float64)
<tf.Tensor: shape=(6,), dtype=float64, numpy=array([1., 2., 3., 4., 5., 6.])>
If shape is set, the value is reshaped to match. Scalars are expanded to fill the shape:
>>> tf.constant(0, shape=(2, 3))
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[0, 0, 0],
       [0, 0, 0]], dtype=int32)>
>>> tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>
tf.constant has no effect if an eager Tensor is passed as the value; it even transmits gradients:
>>> v = tf.Variable([0.0])
>>> with tf.GradientTape() as g:
...     loss = tf.constant(v + v)
>>> g.gradient(loss, v).numpy()
array([2.], dtype=float32)
But, since tf.constant embeds the value in the tf.Graph this fails for symbolic tensors:
>>> i = tf.keras.layers.Input(shape=[None, None])
>>> t = tf.constant(i)
Traceback (most recent call last):
...
NotImplementedError: ...
tf.constant will _always_ create CPU (host) tensors. In order to create tensors on other devices, use tf.identity. (If the value is an eager Tensor, however, the tensor will be returned unmodified as mentioned above.)
Related Ops:
tf.convert_to_tensor is similar but: * It has no shape argument. * Symbolic tensors are allowed to pass through.
>>> i = tf.keras.layers.Input(shape=[None, None])
>>> t = tf.convert_to_tensor(i)
tf.fill differs in a few ways:
* tf.constant supports arbitrary constants, not just uniform scalar Tensors like tf.fill.
* tf.fill creates an Op in the graph that is expanded at runtime, so it can efficiently represent large tensors.
* Since tf.fill does not embed the value, it can produce dynamically sized outputs.
- Args:
value: A constant value (or list) of output type dtype.
dtype: The type of the elements of the resulting tensor.
shape: Optional dimensions of resulting tensor.
name: Optional name for the tensor.
- Returns:
A Constant Tensor.
- Raises:
TypeError: if shape is incorrectly specified or unsupported. ValueError: if called on a symbolic tensor.
- d2l.tensorflow.tokenize(lines, token='word')[source]¶
Split text lines into word or character tokens.
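A sketch of the two token modes:
>>> from d2l import tensorflow as d2l
>>> lines = ['the time machine', 'by h g wells']
>>> d2l.tokenize(lines)
[['the', 'time', 'machine'], ['by', 'h', 'g', 'wells']]
>>> d2l.tokenize(lines, token='char')[0][:5]
['t', 'h', 'e', ' ', 't']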
- d2l.tensorflow.train_2d(trainer, steps=20)[source]¶
Optimize a 2-dim objective function with a customized trainer.
- d2l.tensorflow.train_ch11(trainer_fn, states, hyperparams, data_iter, feature_dim, num_epochs=2)[source]¶
- d2l.tensorflow.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)[source]¶
Train a model (defined in Chapter 3).
- d2l.tensorflow.train_ch6(net_fn, train_iter, test_iter, num_epochs, lr, device)[source]¶
Train a model with a GPU (defined in Chapter 6).
- d2l.tensorflow.train_ch8(net, train_iter, vocab, num_hiddens, lr, num_epochs, strategy, use_random_iter=False)[source]¶
Train a model (defined in Chapter 8).
- d2l.tensorflow.train_epoch_ch3(net, train_iter, loss, updater)[source]¶
The training loop defined in Chapter 3.
- d2l.tensorflow.train_epoch_ch8(net, train_iter, loss, updater, params, use_random_iter)[source]¶
Train a model within one epoch (defined in Chapter 8).
- d2l.tensorflow.transpose(a, perm=None, conjugate=False, name='transpose')[source]¶
Transposes a, where a is a Tensor.
Permutes the dimensions according to the value of perm.
The returned tensor’s dimension i will correspond to the input dimension perm[i]. If perm is not given, it is set to (n-1…0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.
If conjugate is True and a.dtype is either complex64 or complex128 then the values of a are conjugated and transposed.
NumPy compatibility: in numpy, transposes are memory-efficient constant time operations, as they simply return a new view of the same data with adjusted strides. TensorFlow does not support strides, so transpose returns a new tensor with the items permuted.
For example:
>>> x = tf.constant([[1, 2, 3], [4, 5, 6]])
>>> tf.transpose(x)
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 4],
       [2, 5],
       [3, 6]], dtype=int32)>
Equivalently, you could call tf.transpose(x, perm=[1, 0]).
If x is complex, setting conjugate=True gives the conjugate transpose:
>>> x = tf.constant([[1 + 1j, 2 + 2j, 3 + 3j],
...                  [4 + 4j, 5 + 5j, 6 + 6j]])
>>> tf.transpose(x, conjugate=True)
<tf.Tensor: shape=(3, 2), dtype=complex128, numpy=
array([[1.-1.j, 4.-4.j],
       [2.-2.j, 5.-5.j],
       [3.-3.j, 6.-6.j]])>
‘perm’ is more useful for n-dimensional tensors where n > 2:
>>> x = tf.constant([[[ 1,  2,  3],
...                   [ 4,  5,  6]],
...                  [[ 7,  8,  9],
...                   [10, 11, 12]]])
As above, simply calling tf.transpose will default to perm=[2,1,0].
To take the transpose of the matrices in dimension-0 (such as when you are transposing matrices where 0 is the batch dimension), you would set perm=[0,2,1].
>>> tf.transpose(x, perm=[0, 2, 1])
<tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
array([[[ 1,  4],
        [ 2,  5],
        [ 3,  6]],
       [[ 7, 10],
        [ 8, 11],
        [ 9, 12]]], dtype=int32)>
Note: This has a shorthand, tf.linalg.matrix_transpose.
- Args:
a: A Tensor.
perm: A permutation of the dimensions of a. This should be a vector.
conjugate: Optional bool. Setting it to True is mathematically equivalent to tf.math.conj(tf.transpose(input)).
name: A name for the operation (optional).
- Returns:
A transposed Tensor.
- d2l.tensorflow.zeros(shape, dtype=tf.float32, name=None)[source]¶
Creates a tensor with all elements set to zero.
See also tf.zeros_like, tf.ones, tf.fill, tf.eye.
This operation returns a tensor of type dtype with shape shape and all elements set to zero.
>>> tf.zeros([3, 4], tf.int32)
<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]], dtype=int32)>
- Args:
- shape: A list of integers, a tuple of integers, or
a 1-D Tensor of type int32.
dtype: The DType of an element in the resulting Tensor. name: Optional string. A name for the operation.
- Returns:
A Tensor with all elements set to zero.