19.7. d2l API Document
The implementations of the following members of the d2l package, and the sections where they are defined and explained, can be found in the source file.
- class d2l.mxnet.AdditiveAttention(num_hiddens, dropout, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
Additive attention.
- class d2l.mxnet.Animator(xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), nrows=1, ncols=1, figsize=(3.5, 2.5))[source]¶
Bases:
object
For plotting data in animation.
- class d2l.mxnet.AttentionDecoder(**kwargs)[source]¶
Bases:
d2l.mxnet.Decoder
The base attention-based decoder interface.
- property attention_weights¶
- class d2l.mxnet.BERTEncoder(vocab_size, num_hiddens, ffn_num_hiddens, num_heads, num_layers, dropout, max_len=1000, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.BERTModel(vocab_size, num_hiddens, ffn_num_hiddens, num_heads, num_layers, dropout, max_len=1000)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.CTRDataset(data_path, feat_mapper=None, defaults=None, min_threshold=4, num_feat=34)[source]¶
Bases:
mxnet.gluon.data.dataset.Dataset
- class d2l.mxnet.Decoder(**kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The base decoder interface for the encoder-decoder architecture.
- class d2l.mxnet.DotProductAttention(dropout, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
Scaled dot product attention.
- class d2l.mxnet.Encoder(**kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The base encoder interface for the encoder-decoder architecture.
- class d2l.mxnet.EncoderBlock(num_hiddens, ffn_num_hiddens, num_heads, dropout, use_bias=False, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.EncoderDecoder(encoder, decoder, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The base class for the encoder-decoder architecture.
- class d2l.mxnet.HingeLossbRec(weight=None, batch_axis=0, **kwargs)[source]¶
Bases:
mxnet.gluon.loss.Loss
- class d2l.mxnet.MaskedSoftmaxCELoss(axis=-1, sparse_label=True, from_logits=False, weight=None, batch_axis=0, **kwargs)[source]¶
Bases:
mxnet.gluon.loss.SoftmaxCrossEntropyLoss
The softmax cross-entropy loss with masks.
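A minimal usage sketch (shapes follow the sequence-to-sequence section of the book): valid_len marks how many steps of each sequence are real, and padded steps contribute zero loss.
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

loss = d2l.MaskedSoftmaxCELoss()
pred = np.ones((3, 4, 10))        # (batch_size, num_steps, vocab_size)
label = np.ones((3, 4))           # (batch_size, num_steps)
valid_len = np.array([4, 2, 0])   # number of non-padding steps per example
print(loss(pred, label, valid_len))  # one loss value per example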
- class d2l.mxnet.MultiHeadAttention(num_hiddens, num_heads, dropout, use_bias=False, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.PositionWiseFFN(ffn_num_hiddens, ffn_num_outputs, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.PositionalEncoding(num_hiddens, dropout, max_len=1000)[source]¶
Bases:
mxnet.gluon.block.Block
- class d2l.mxnet.RNNModel(rnn_layer, vocab_size, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The RNN model.
- class d2l.mxnet.RNNModelScratch(vocab_size, num_hiddens, device, get_params, init_state, forward_fn)[source]¶
Bases:
object
An RNN Model implemented from scratch.
- class d2l.mxnet.RandomGenerator(sampling_weights)[source]¶
Bases:
object
Draw a random int in [0, n] according to n sampling weights.
- class d2l.mxnet.Residual(num_channels, use_1x1conv=False, strides=1, **kwargs)[source]¶
Bases:
mxnet.gluon.block.Block
The Residual block of ResNet.
- class d2l.mxnet.SNLIDataset(dataset, num_steps, vocab=None)[source]¶
Bases:
mxnet.gluon.data.dataset.Dataset
A customized dataset to load the SNLI dataset.
- class d2l.mxnet.Seq2SeqEncoder(vocab_size, embed_size, num_hiddens, num_layers, dropout=0, **kwargs)[source]¶
Bases:
d2l.mxnet.Encoder
The RNN encoder for sequence to sequence learning.
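A minimal shape check (mirroring the sequence-to-sequence chapter of the book; the sizes are illustrative): the encoder consumes a (batch_size, num_steps) minibatch of token indices and returns per-step hidden states of shape (num_steps, batch_size, num_hiddens) together with the final recurrent state.
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

encoder = d2l.Seq2SeqEncoder(vocab_size=10, embed_size=8,
                             num_hiddens=16, num_layers=2)
encoder.initialize()
X = np.zeros((4, 7))     # (batch_size, num_steps) of token indices
output, state = encoder(X)
print(output.shape)      # (7, 4, 16): (num_steps, batch_size, num_hiddens)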
- class d2l.mxnet.SeqDataLoader(batch_size, num_steps, use_random_iter, max_tokens)[source]¶
Bases:
object
An iterator to load sequence data.
- class d2l.mxnet.TransformerEncoder(vocab_size, num_hiddens, ffn_num_hiddens, num_heads, num_layers, dropout, use_bias=False, **kwargs)[source]¶
Bases:
d2l.mxnet.Encoder
- class d2l.mxnet.VOCSegDataset(is_train, crop_size, voc_dir)[source]¶
Bases:
mxnet.gluon.data.dataset.Dataset
A customized dataset to load VOC dataset.
- filter(imgs)[source]¶
Returns a new dataset with samples filtered by the filter function fn.
Note that if the Dataset is the result of a lazily transformed one with transform(lazy=False), the filter is eagerly applied to the transformed samples without materializing the transformed result. That is, the transformation will be applied again whenever a sample is retrieved after filter().
- fn : callable
A filter function that takes a sample as input and returns a boolean. Samples that return False are discarded.
- Returns : Dataset
The filtered dataset.
- class d2l.mxnet.Vocab(tokens=None, min_freq=0, reserved_tokens=None)[source]¶
Bases:
object
Vocabulary for text.
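A short sketch of the typical round trip (assuming the token-to-index API from the book's text preprocessing section):
from d2l import mxnet as d2l

tokens = [['the', 'time', 'machine'], ['the', 'time', 'traveller']]
vocab = d2l.Vocab(tokens)
print(vocab['the'])             # token -> index
print(vocab.to_tokens([0, 1]))  # indices -> tokens; index 0 is '<unk>'
print(len(vocab))               # vocabulary size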
- d2l.mxnet.abs(x, out=None, **kwargs)¶
Calculate the absolute value element-wise.
- x : ndarray or scalar
Input array.
- out : ndarray or None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned.
- absolute : ndarray
An ndarray containing the absolute value of each element in x. This is a scalar if x is a scalar.
>>> x = np.array([-1.2, 1.2]) >>> np.abs(x) array([1.2, 1.2])
- d2l.mxnet.arange(start, stop=None, step=1, dtype=None, ctx=None)¶
Return evenly spaced values within a given interval.
Values are generated within the half-open interval [start, stop) (in other words, the interval including start but excluding stop). For integer arguments the function is equivalent to the Python built-in range function, but returns an ndarray rather than a list.
- start : number, optional
Start of interval. The interval includes this value. The default start value is 0.
- stop : number
End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out.
- step : number, optional
Spacing between values. For any output out, this is the distance between two adjacent values, out[i+1] - out[i]. The default step size is 1. If step is specified as a positional argument, start must also be given.
- dtype : dtype
The type of the output array. The default is float32.
- arange : ndarray
Array of evenly spaced values.
For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating point overflow, this rule may result in the last element of out being greater than stop.
>>> np.arange(3) array([0., 1., 2.])
>>> np.arange(3.0) array([0., 1., 2.])
>>> np.arange(3,7) array([3., 4., 5., 6.])
>>> np.arange(3,7,2) array([3., 5.])
- d2l.mxnet.argmax(x, *args, **kwargs)¶
- d2l.mxnet.astype(x, *args, **kwargs)¶
- d2l.mxnet.box_center_to_corner(boxes)[source]¶
Convert from (center, width, height) to (upper_left, bottom_right)
- d2l.mxnet.box_corner_to_center(boxes)[source]¶
Convert from (upper_left, bottom_right) to (center, width, height)
- d2l.mxnet.box_iou(boxes1, boxes2)[source]¶
Compute IOU between two sets of boxes of shape (N,4) and (M,4).
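A small sketch tying the three box utilities above together (coordinates are illustrative; boxes are given as (upper-left x, upper-left y, lower-right x, lower-right y)):
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

boxes = np.array([[0.0, 0.0, 2.0, 2.0], [1.0, 1.0, 3.0, 3.0]])
centers = d2l.box_corner_to_center(boxes)    # -> (center x, center y, width, height)
corners = d2l.box_center_to_corner(centers)  # round-trips back to `boxes`
print(d2l.box_iou(boxes, boxes))             # (2, 2) matrix of pairwise IoU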
- d2l.mxnet.build_array_nmt(lines, vocab, num_steps)[source]¶
Transform text sequences of machine translation into minibatches.
- d2l.mxnet.concat(seq, axis=0, out=None)¶
Join a sequence of arrays along an existing axis.
- a1, a2, … : sequence of array_like
The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
- axis : int, optional
The axis along which the arrays will be joined. If axis is None, arrays are flattened before use. Default is 0.
- out : ndarray, optional
If provided, the destination to place the result. The shape must be correct, matching that of what concatenate would have returned if no out argument were specified.
- res : ndarray
The concatenated array.
See also:
- split : Split array into a list of multiple sub-arrays of equal size.
- hsplit : Split array into multiple sub-arrays horizontally (column wise).
- vsplit : Split array into multiple sub-arrays vertically (row wise).
- dsplit : Split array into multiple sub-arrays along the 3rd axis (depth).
- stack : Stack a sequence of arrays along a new axis.
- hstack : Stack arrays in sequence horizontally (column wise).
- vstack : Stack arrays in sequence vertically (row wise).
- dstack : Stack arrays in sequence depth wise (along third dimension).
>>> a = np.array([[1, 2], [3, 4]]) >>> b = np.array([[5, 6]]) >>> np.concatenate((a, b), axis=0) array([[1., 2.], [3., 4.], [5., 6.]])
>>> np.concatenate((a, b.T), axis=1) array([[1., 2., 5.], [3., 4., 6.]])
>>> np.concatenate((a, b), axis=None) array([1., 2., 3., 4., 5., 6.])
- d2l.mxnet.cos(x, out=None, **kwargs)¶
Cosine, element-wise.
- x : ndarray or scalar
Angle, in radians (\(2 \pi\) rad equals 360 degrees).
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. The dtype of the output is the same as that of the input if the input is an ndarray.
- y : ndarray or scalar
The corresponding cosine values. This is a scalar if x is a scalar.
This function only supports input type of float.
>>> np.cos(np.array([0, np.pi/2, np.pi])) array([ 1.000000e+00, -4.371139e-08, -1.000000e+00]) >>> # Example of providing the optional output parameter >>> out1 = np.array([0], dtype='f') >>> out2 = np.cos(np.array([0.1]), out1) >>> out2 is out1 True
- d2l.mxnet.cosh(x, out=None, **kwargs)¶
Hyperbolic cosine, element-wise. Equivalent to 1/2 * (np.exp(x) + np.exp(-x)) and np.cos(1j*x).
- x : ndarray or scalar
Input array or scalar.
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. The dtype of the output is the same as that of the input if the input is an ndarray.
- y : ndarray or scalar
The corresponding hyperbolic cosine values. This is a scalar if x is a scalar.
This function only supports input type of float.
>>> np.cosh(0) 1.0
- class d2l.mxnet.defaultdict¶
Bases:
dict
defaultdict(default_factory[, …]) –> dict with default factory
The default factory is called without arguments to produce a new value when a key is not present, in __getitem__ only. A defaultdict compares equal to a dict with the same items. All remaining arguments are treated the same as if they were passed to the dict constructor, including keyword arguments.
- copy() → a shallow copy of D.¶
- default_factory¶
Factory for default value called by __missing__().
- d2l.mxnet.download(name, cache_dir='../data')[source]¶
Download a file inserted into DATA_HUB, return the local filename.
- d2l.mxnet.evaluate_accuracy_gpu(net, data_iter, device=None)[source]¶
Compute the accuracy for a model on a dataset using a GPU.
- d2l.mxnet.evaluate_loss(net, data_iter, loss)[source]¶
Evaluate the loss of a model on the given dataset.
- d2l.mxnet.evaluate_ranking(net, test_input, seq, candidates, num_users, num_items, devices)[source]¶
- d2l.mxnet.exp(x, out=None, **kwargs)¶
Calculate the exponential of all elements in the input array.
- x : ndarray or scalar
Input values.
- out : ndarray or None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned.
- out : ndarray or scalar
Output array, element-wise exponential of x. This is a scalar if x is a scalar.
>>> np.exp(1) 2.718281828459045 >>> x = np.array([-1, 1, -2, 2]) >>> np.exp(x) array([0.36787945, 2.7182817 , 0.13533528, 7.389056 ])
- d2l.mxnet.eye(N, M=None, k=0, dtype=<class 'numpy.float32'>, **kwargs)¶
Return a 2-D array with ones on the diagonal and zeros elsewhere.
- N : int
Number of rows in the output.
- M : int, optional
Number of columns in the output. If None, defaults to N.
- k : int, optional
Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.
- dtype : data-type, optional
Data-type of the returned array.
- I : ndarray of shape (N, M)
An array where all elements are equal to zero, except for the k-th diagonal, whose values are equal to one.
>>> np.eye(2, dtype=int) array([[1, 0], [0, 1]], dtype=int64) >>> np.eye(3, k=1) array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
- class d2l.mxnet.float32¶
Bases:
numpy.floating
Single-precision floating-point number type, compatible with C float. Character code: 'f'. Canonical name: np.single. Alias on this platform: np.float32: 32-bit-precision floating-point number type: sign bit, 8 bits exponent, 23 bits mantissa.
- as_integer_ratio()¶
Return a pair of integers, whose ratio is exactly equal to the original floating point number, and with a positive denominator. Raise OverflowError on infinities and a ValueError on NaNs.
>>> np.single(10.0).as_integer_ratio() (10, 1) >>> np.single(0.0).as_integer_ratio() (0, 1) >>> np.single(-.25).as_integer_ratio() (-1, 4)
- d2l.mxnet.get_fashion_mnist_labels(labels)[source]¶
Return text labels for the Fashion-MNIST dataset.
- class d2l.mxnet.int32¶
Bases:
numpy.signedinteger
Signed integer type, compatible with C int. Character code: 'i'. Canonical name: np.intc. Alias on this platform: np.int32: 32-bit signed integer (-2147483648 to 2147483647).
- d2l.mxnet.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0, ctx=None)¶
Return evenly spaced numbers over a specified interval.
Returns num evenly spaced samples, calculated over the interval [start, stop]. The endpoint of the interval can optionally be excluded.
- start : real number
The starting value of the sequence.
- stop : real number
The end value of the sequence, unless endpoint is set to False. In that case, the sequence consists of all but the last of num + 1 evenly spaced samples, so that stop is excluded. Note that the step size changes when endpoint is False.
- num : int, optional
Number of samples to generate. Default is 50. Must be non-negative.
- endpoint : bool, optional
If True, stop is the last sample. Otherwise, it is not included. Default is True.
- retstep : bool, optional
If True, return (samples, step), where step is the spacing between samples.
- dtype : dtype, optional
The type of the output array. If dtype is not given, infer the data type from the other input arguments.
- axis : int, optional
The axis in the result to store the samples. Relevant only if start or stop are array-like. By default (0), the samples will be along a new axis inserted at the beginning. Use -1 to get an axis at the end.
- samples : ndarray
There are num equally spaced samples in the closed interval [start, stop] or the half-open interval [start, stop) (depending on whether endpoint is True or False).
- step : float, optional
Only returned if retstep is True. Size of spacing between samples.
- arange : Similar to linspace, but uses a step size (instead of the number of samples).
>>> np.linspace(2.0, 3.0, num=5) array([2. , 2.25, 2.5 , 2.75, 3. ]) >>> np.linspace(2.0, 3.0, num=5, endpoint=False) array([2. , 2.2, 2.4, 2.6, 2.8]) >>> np.linspace(2.0, 3.0, num=5, retstep=True) (array([2. , 2.25, 2.5 , 2.75, 3. ]), 0.25)
Graphical illustration:
>>> import matplotlib.pyplot as plt >>> N = 8 >>> y = np.zeros(N) >>> x1 = np.linspace(0, 10, N, endpoint=True) >>> x2 = np.linspace(0, 10, N, endpoint=False) >>> plt.plot(x1.asnumpy(), y.asnumpy(), 'o') [<matplotlib.lines.Line2D object at 0x...>] >>> plt.plot(x2.asnumpy(), (y + 0.5).asnumpy(), 'o') [<matplotlib.lines.Line2D object at 0x...>] >>> plt.ylim([-0.5, 1]) (-0.5, 1) >>> plt.show()
This function differs from the original numpy.linspace in the following aspects:
- start and stop do not support list, numpy ndarray and mxnet ndarray
- axis could only be 0
- There could be an additional ctx argument to specify the device, e.g. the i-th GPU.
- d2l.mxnet.load_array(data_arrays, batch_size, is_train=True)[source]¶
Construct a Gluon data iterator.
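A minimal sketch (synthetic data; shapes are illustrative) of wrapping in-memory arrays as a shuffled minibatch iterator:
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

features = np.random.normal(size=(100, 2))
labels = np.random.normal(size=(100, 1))
data_iter = d2l.load_array((features, labels), batch_size=10)
X, y = next(iter(data_iter))   # one shuffled minibatch
print(X.shape, y.shape)        # (10, 2) (10, 1)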
- d2l.mxnet.load_corpus_time_machine(max_tokens=-1)[source]¶
Return token indices and the vocabulary of the time machine dataset.
- d2l.mxnet.load_data_fashion_mnist(batch_size, resize=None)[source]¶
Download the Fashion-MNIST dataset and then load it into memory.
- d2l.mxnet.load_data_nmt(batch_size, num_steps, num_examples=600)[source]¶
Return the iterator and the vocabularies of the translation dataset.
- d2l.mxnet.load_data_snli(batch_size, num_steps=50)[source]¶
Download the SNLI dataset and return data iterators and vocabulary.
- d2l.mxnet.load_data_time_machine(batch_size, num_steps, use_random_iter=False, max_tokens=10000)[source]¶
Return the iterator and the vocabulary of the time machine dataset.
- d2l.mxnet.load_data_voc(batch_size, crop_size)[source]¶
Download and load the VOC2012 semantic dataset.
- d2l.mxnet.log(x, out=None, **kwargs)¶
Natural logarithm, element-wise. The natural logarithm log is the inverse of the exponential function, so that log(exp(x)) = x. The natural logarithm is logarithm in base e.
- x : ndarray
Input value. Elements must be of real value.
- out : ndarray or None, optional
A location into which the result is stored. If provided, it must have the same shape and dtype as the input ndarray. If not provided or None, a freshly-allocated array is returned.
- y : ndarray
The natural logarithm of x, element-wise. This is a scalar if x is a scalar.
Currently only supports data of real values and inf as input. Returns data of real value, inf, -inf and nan according to the input. This function differs from the original numpy.log in the following aspects:
- Does not support complex numbers for now
- Input type does not support Python native iterables (list, tuple, …)
- out param: cannot perform auto broadcasting. The out ndarray's shape must be the same as the expected output
- out param: cannot perform auto type cast. The out ndarray's dtype must be the same as the expected output
- out param does not support the scalar input case
>>> a = np.array([1, np.exp(1), np.exp(2), 0], dtype=np.float64) >>> np.log(a) array([ 0., 1., 2., -inf], dtype=float64) >>> # Using the default float32 dtype leads to slightly different behavior >>> a = np.array([1, np.exp(1), np.exp(2), 0]) >>> np.log(a) array([ 0., 0.99999994, 2., -inf]) >>> np.log(1) 0.0
- d2l.mxnet.masked_softmax(X, valid_lens)[source]¶
Perform softmax operation by masking elements on the last axis.
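A short sketch (as in the attention chapter of the book): entries beyond each row's valid length receive (near-)zero probability.
from mxnet import np, npx
from d2l import mxnet as d2l
npx.set_np()

X = np.random.uniform(size=(2, 2, 4))   # (batch_size, num_queries, num_keys)
# Keep the first 2 keys of the first example and 3 of the second.
print(d2l.masked_softmax(X, np.array([2, 3])))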
- d2l.mxnet.match_anchor_to_bbox(ground_truth, anchors, device, iou_threshold=0.5)[source]¶
Assign ground-truth bounding boxes to anchor boxes similar to them.
- d2l.mxnet.matmul(a, b, out=None)¶
Dot product of two arrays. Specifically,
- If both a and b are 1-D arrays, it is an inner product of vectors.
- If both a and b are 2-D arrays, it is matrix multiplication.
- If either a or b is 0-D (scalar), it is equivalent to multiply() and using np.multiply(a, b) or a * b is preferred.
- If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
- If a is an N-D array and b is a 2-D array, it is a sum product over the last axis of a and the second-to-last axis of b: dot(a, b)[i,j,k] = sum(a[i,j,:] * b[:,k])
- a : ndarray
First argument.
- b : ndarray
Second argument.
- out : ndarray, optional
Output argument. It must have the same shape and type as the expected output.
- output : ndarray
Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned. If out is given, then it is returned.
>>> a = np.array(3) >>> b = np.array(4) >>> np.dot(a, b) array(12.)
For 2-D arrays it is the matrix product:
>>> a = np.array([[1, 0], [0, 1]]) >>> b = np.array([[4, 1], [2, 2]]) >>> np.dot(a, b) array([[4., 1.], [2., 2.]])
>>> a = np.arange(3*4*5*6).reshape((3,4,5,6)) >>> b = np.arange(5*6)[::-1].reshape((6,5)) >>> np.dot(a, b)[2,3,2,2] array(29884.) >>> np.sum(a[2,3,2,:] * b[:,2]) array(29884.)
- d2l.mxnet.meshgrid(*xi, **kwargs)[source]¶
Return coordinate matrices from coordinate vectors.
Make N-D coordinate arrays for vectorized evaluations of N-D scalar/vector fields over N-D grids, given one-dimensional coordinate arrays x1, x2,…, xn.
- x1, x2, …, xn : ndarrays
1-D arrays representing the coordinates of a grid.
- indexing : {'xy', 'ij'}, optional
Cartesian ('xy', default) or matrix ('ij') indexing of output. See Notes for more details.
- sparse : bool, optional
If True a sparse grid is returned in order to conserve memory. Default is False. Please note that sparse=True is currently not supported.
- copy : bool, optional
If False, a view into the original arrays are returned in order to conserve memory. Default is True. Please note that copy=False is currently not supported.
- X1, X2, …, XN : ndarray
For vectors x1, x2, …, xn with lengths Ni=len(xi), return (N1, N2, N3,...Nn) shaped arrays if indexing='ij' or (N2, N1, N3,...Nn) shaped arrays if indexing='xy' with the elements of xi repeated to fill the matrix along the first dimension for x1, the second for x2 and so on.
This function supports both indexing conventions through the indexing keyword argument. Giving the string ‘ij’ returns a meshgrid with matrix indexing, while ‘xy’ returns a meshgrid with Cartesian indexing. In the 2-D case with inputs of length M and N, the outputs are of shape (N, M) for ‘xy’ indexing and (M, N) for ‘ij’ indexing. In the 3-D case with inputs of length M, N and P, outputs are of shape (N, M, P) for ‘xy’ indexing and (M, N, P) for ‘ij’ indexing. The difference is illustrated by the following code snippet:
xv, yv = np.meshgrid(x, y, sparse=False, indexing='ij')
for i in range(nx):
    for j in range(ny):
        # treat xv[i,j], yv[i,j]
xv, yv = np.meshgrid(x, y, sparse=False, indexing='xy')
for i in range(nx):
    for j in range(ny):
        # treat xv[j,i], yv[j,i]
In the 1-D and 0-D case, the indexing and sparse keywords have no effect.
- d2l.mxnet.multibox_detection(cls_probs, offset_preds, anchors, nms_threshold=0.5, pos_threshold=0.00999999978)[source]¶
- d2l.mxnet.normal(loc=0.0, scale=1.0, size=None, dtype=None, ctx=None, out=None)[source]¶
Draw random samples from a normal (Gaussian) distribution.
Samples are distributed according to a normal distribution parametrized by loc (mean) and scale (standard deviation).
- loc : float, optional
Mean (centre) of the distribution.
- scale : float, optional
Standard deviation (spread or "width") of the distribution.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a scalar tensor containing a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(low, high).size samples are drawn.
- dtype : {'float16', 'float32', 'float64'}, optional
Data type of output samples. Default is 'float32'.
- ctx : Context, optional
Device context of output, default is current context.
- out : ndarray, optional
Store output to an existing ndarray.
- out : ndarray
Drawn samples from the parameterized normal distribution.
The probability density for the Gaussian distribution is
(19.7.1)¶
\[p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },\]
where \(\mu\) is the mean and \(\sigma\) the standard deviation. The square of the standard deviation, \(\sigma^2\), is called the variance.
The function has its peak at the mean, and its "spread" increases with the standard deviation (the function reaches 0.607 times its maximum at \(x + \sigma\) and \(x - \sigma\) [2]). This implies that numpy.random.normal is more likely to return samples lying close to the mean, rather than those far away.
- [1] Wikipedia, "Normal distribution", https://en.wikipedia.org/wiki/Normal_distribution
- [2] P. R. Peebles Jr., "Central Limit Theorem" in "Probability, Random Variables and Random Signal Principles", 4th ed., 2001, pp. 51, 51, 125.
>>> mu, sigma = 0, 0.1 # mean and standard deviation >>> s = np.random.normal(mu, sigma, 1000)
Verify the mean and the variance:
>>> np.abs(mu - np.mean(s)) < 0.01 array(True)
- d2l.mxnet.numpy(x, *args, **kwargs)¶
- d2l.mxnet.ones(shape, dtype=<class 'numpy.float32'>, order='C', ctx=None)¶
Return a new array of given shape and type, filled with ones. This function currently only supports storing multi-dimensional data in row-major (C-style).
- shape : int or tuple of int
The shape of the empty array.
- dtype : str or numpy.dtype, optional
An optional value type. Default is numpy.float32. Note that this behavior is different from NumPy's ones function where float64 is the default value, because float32 is considered as the default data type in deep learning.
- order : {'C'}, optional, default: 'C'
How to store multi-dimensional data in memory, currently only row-major (C-style) is supported.
- ctx : Context, optional
An optional device context (default is the current default context).
- out : ndarray
Array of ones with the given shape, dtype, and ctx.
>>> np.ones(5) array([1., 1., 1., 1., 1.])
>>> np.ones((5,), dtype=int) array([1, 1, 1, 1, 1], dtype=int64)
>>> np.ones((2, 1)) array([[1.], [1.]])
>>> s = (2,2) >>> np.ones(s) array([[1., 1.], [1., 1.]])
- d2l.mxnet.plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None)[source]¶
Plot data points.
- d2l.mxnet.predict_ch8(prefix, num_preds, net, vocab, device)[source]¶
Generate new characters following the prefix.
- d2l.mxnet.predict_seq2seq(net, src_sentence, src_vocab, tgt_vocab, num_steps, device, save_attention_weights=False)[source]¶
Predict for sequence to sequence.
- d2l.mxnet.rand(*size, **kwargs)[source]¶
Random values in a given shape.
Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).
- d0, d1, …, dn : int, optional
The dimensions of the returned array, should be all positive. If no argument is given a single Python float is returned.
- out : ndarray
Random values.
>>> np.random.rand(3,2) array([[ 0.14022471, 0.96360618], #random [ 0.37601032, 0.25528411], #random [ 0.49313049, 0.94909878]]) #random
- d2l.mxnet.read_snli(data_dir, is_train)[source]¶
Read the SNLI dataset into premises, hypotheses, and labels.
- d2l.mxnet.reduce_sum(x, *args, **kwargs)¶
- d2l.mxnet.reshape(x, *args, **kwargs)¶
- d2l.mxnet.seq_data_iter_random(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using random sampling.
- d2l.mxnet.seq_data_iter_sequential(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using sequential partitioning.
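A minimal sketch on a toy corpus (token indices 0..34), following the language-model data section of the book; each minibatch pairs inputs X with targets Y shifted by one step:
from d2l import mxnet as d2l

my_seq = list(range(35))
for X, Y in d2l.seq_data_iter_sequential(my_seq, batch_size=2, num_steps=5):
    print('X:', X, '\nY:', Y)   # Y is X shifted one token to the right
    break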
- d2l.mxnet.set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)[source]¶
Set the axes for matplotlib.
- d2l.mxnet.show_heatmaps(matrices, xlabel, ylabel, titles=None, figsize=(2.5, 2.5), cmap='Reds')[source]¶
- d2l.mxnet.show_images(imgs, num_rows, num_cols, titles=None, scale=1.5)[source]¶
Plot a list of images.
- d2l.mxnet.sin(x, out=None, **kwargs)¶
Trigonometric sine, element-wise.
- x : ndarray or scalar
Angle, in radians (\(2 \pi\) rad equals 360 degrees).
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. The dtype of the output is the same as that of the input if the input is an ndarray.
- y : ndarray or scalar
The sine of each element of x. This is a scalar if x is a scalar.
This function only supports input type of float.
>>> np.sin(np.pi/2.) 1.0 >>> np.sin(np.array((0., 30., 45., 60., 90.)) * np.pi / 180.) array([0. , 0.5 , 0.70710677, 0.86602545, 1. ])
- d2l.mxnet.sinh(x, out=None, **kwargs)¶
Hyperbolic sine, element-wise. Equivalent to 1/2 * (np.exp(x) - np.exp(-x)) or -1j * np.sin(1j*x).
- x : ndarray or scalar
Input array or scalar.
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. The dtype of the output is the same as that of the input if the input is an ndarray.
- y : ndarray or scalar
The corresponding hyperbolic sine values. This is a scalar if x is a scalar.
This function only supports input type of float.
>>> np.sinh(0) 0.0 >>> # Example of providing the optional output parameter >>> out1 = np.array([0], dtype='f') >>> out2 = np.sinh(np.array([0.1]), out1) >>> out2 is out1 True
- d2l.mxnet.size(a)¶
- d2l.mxnet.split_and_load_ml100k(split_mode='seq-aware', feedback='explicit', test_ratio=0.1, batch_size=256)[source]¶
- d2l.mxnet.split_batch_multi_inputs(X, y, devices)[source]¶
Split multi-input X and y into multiple devices.
- d2l.mxnet.split_data_ml100k(data, num_users, num_items, split_mode='random', test_ratio=0.1)[source]¶
Split the dataset in random mode or seq-aware mode.
- d2l.mxnet.stack(arrays, axis=0, out=None)¶
- Join a sequence of arrays along a new axis.
The axis parameter specifies the index of the new axis in the dimensions of the result. For example, if axis=0 it will be the first dimension and if axis=-1 it will be the last dimension.
- arrays : sequence of array_like
Each array must have the same shape.
- axis : int, optional
The axis in the result array along which the input arrays are stacked.
- out : ndarray, optional
If provided, the destination to place the result. The shape must be correct, matching that of what stack would have returned if no out argument were specified.
- stacked : ndarray
The stacked array has one more dimension than the input arrays.
See also:
- concatenate : Join a sequence of arrays along an existing axis.
- split : Split array into a list of multiple sub-arrays of equal size.
>>> arrays = [np.random.rand(3, 4) for _ in range(10)] >>> np.stack(arrays, axis=0).shape (10, 3, 4)
>>> np.stack(arrays, axis=1).shape (3, 10, 4)
>>> np.stack(arrays, axis=2).shape (3, 4, 10)
>>> a = np.array([1, 2, 3]) >>> b = np.array([2, 3, 4]) >>> np.stack((a, b)) array([[1., 2., 3.], [2., 3., 4.]])
>>> np.stack((a, b), axis=-1) array([[1., 2.], [2., 3.], [3., 4.]])
- d2l.mxnet.tanh(x, out=None, **kwargs)¶
Compute hyperbolic tangent element-wise. Equivalent to np.sinh(x)/np.cosh(x).
- x : ndarray or scalar
Input array.
- out : ndarray or None
A location into which the result is stored. If provided, it must have a shape that the inputs fill into. If not provided or None, a freshly-allocated array is returned. The dtype of the output and input must be the same.
- y : ndarray or scalar
The corresponding hyperbolic tangent values.
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples.)
- Input x does not support complex computation (like imaginary numbers): >>> np.tanh(np.pi*1j) TypeError: type <type 'complex'> not supported
>>> np.tanh(np.array([0, np.pi])) array([0. , 0.9962721]) >>> np.tanh(np.pi) 0.99627207622075 >>> # Example of providing the optional output parameter illustrating >>> # that what is returned is a reference to said parameter >>> out1 = np.array(1) >>> out2 = np.tanh(np.array(0.1), out1) >>> out2 is out1 True
- d2l.mxnet.tensor(object, dtype=None, ctx=None)¶
Create an array.
- object : array_like or numpy.ndarray or mxnet.numpy.ndarray
An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.
- dtype : data-type, optional
The desired data-type for the array. Default is float32.
- ctx : device context, optional
Device context on which the memory is allocated. Default is mxnet.context.current_context().
- out : ndarray
An array object satisfying the specified requirements.
>>> np.array([1, 2, 3]) array([1., 2., 3.])
>>> np.array([[1, 2], [3, 4]]) array([[1., 2.], [3., 4.]])
>>> np.array([[1, 0], [0, 1]], dtype=bool) array([[ True, False], [False, True]])
- d2l.mxnet.to(x, *args, **kwargs)¶
- d2l.mxnet.train_2d(trainer, steps=20)[source]¶
Optimize a 2-dim objective function with a customized trainer.
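A minimal sketch (the objective and learning rate are illustrative, in the style of the optimization chapter): the trainer maps (x1, x2, s1, s2) to their updated values, and train_2d records the trajectory.
from d2l import mxnet as d2l

def gd_2d(x1, x2, s1, s2):
    # Gradient descent on f(x1, x2) = x1**2 + 2 * x2**2 with step size 0.1;
    # the state slots s1, s2 are unused by plain gradient descent.
    eta = 0.1
    return (x1 - eta * 2 * x1, x2 - eta * 4 * x2, 0, 0)

results = d2l.train_2d(gd_2d)   # list of visited (x1, x2) points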
- d2l.mxnet.train_batch_ch13(net, features, labels, loss, trainer, devices, split_f=<function split_batch>)[source]¶
- d2l.mxnet.train_ch11(trainer_fn, states, hyperparams, data_iter, feature_dim, num_epochs=2)[source]¶
- d2l.mxnet.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices=[gpu(0), gpu(1), gpu(2), gpu(3)], split_f=<function split_batch>)[source]¶
- d2l.mxnet.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)[source]¶
Train a model (defined in Chapter 3).
- d2l.mxnet.train_ch6(net, train_iter, test_iter, num_epochs, lr, device)[source]¶
Train a model with a GPU (defined in Chapter 6).
- d2l.mxnet.train_ch8(net, train_iter, vocab, lr, num_epochs, device, use_random_iter=False)[source]¶
Train a model (defined in Chapter 8).
- d2l.mxnet.train_epoch_ch3(net, train_iter, loss, updater)[source]¶
Train a model within one epoch (defined in Chapter 3).
- d2l.mxnet.train_epoch_ch8(net, train_iter, loss, updater, device, use_random_iter)[source]¶
Train a model within one epoch (defined in Chapter 8).
- d2l.mxnet.train_ranking(net, train_iter, test_iter, loss, trainer, test_seq_iter, num_users, num_items, num_epochs, devices, evaluator, candidates, eval_step=1)[source]¶
- d2l.mxnet.train_recsys_rating(net, train_iter, test_iter, loss, trainer, num_epochs, devices=[gpu(0), gpu(1), gpu(2), gpu(3)], evaluator=None, **kwargs)[source]¶
- d2l.mxnet.train_seq2seq(net, data_iter, lr, num_epochs, tgt_vocab, device)[source]¶
Train a model for sequence to sequence.
- d2l.mxnet.transpose(a)¶
- d2l.mxnet.voc_rand_crop(feature, label, height, width)[source]¶
Randomly crop for both feature and label images.
- d2l.mxnet.zeros(shape, dtype=None, order='C', ctx=None)¶
Return a new array of given shape and type, filled with zeros. This function currently only supports storing multi-dimensional data in row-major (C-style).
- shape : int or tuple of int
The shape of the empty array.
- dtype : str or numpy.dtype, optional
An optional value type (default is numpy.float32). Note that this behavior is different from NumPy's zeros function where float64 is the default value, because float32 is considered as the default data type in deep learning.
- order : {'C'}, optional, default: 'C'
How to store multi-dimensional data in memory, currently only row-major (C-style) is supported.
- ctx : Context, optional
An optional device context (default is the current default context).
- out : ndarray
Array of zeros with the given shape, dtype, and ctx.
>>> np.zeros(5) array([0., 0., 0., 0., 0.])
>>> np.zeros((5,), dtype=int) array([0, 0, 0, 0, 0], dtype=int64)
>>> np.zeros((2, 1)) array([[0.], [0.]])
- class d2l.torch.AddNorm(normalized_shape, dropout, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X, Y)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.AdditiveAttention(key_size, query_size, num_hiddens, dropout, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(queries, keys, values, valid_lens)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.Animator(xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), nrows=1, ncols=1, figsize=(3.5, 2.5))[source]¶
Bases:
object
For plotting data in animation.
- class d2l.torch.AttentionDecoder(**kwargs)[source]¶
Bases:
d2l.torch.Decoder
The base attention-based decoder interface.
- property attention_weights¶
- training: bool¶
- class d2l.torch.BERTEncoder(vocab_size, num_hiddens, norm_shape, ffn_num_input, ffn_num_hiddens, num_heads, num_layers, dropout, max_len=1000, key_size=768, query_size=768, value_size=768, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(tokens, segments, valid_lens)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.BERTModel(vocab_size, num_hiddens, norm_shape, ffn_num_input, ffn_num_hiddens, num_heads, num_layers, dropout, max_len=1000, key_size=768, query_size=768, value_size=768, hid_in_features=768, mlm_in_features=768, nsp_in_features=768)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(tokens, segments, valid_lens=None, pred_positions=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.Decoder(**kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
The base decoder interface for the encoder-decoder architecture.
- forward(X, state)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.DotProductAttention(dropout, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
Scaled dot product attention.
- forward(queries, keys, values, valid_lens=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.Encoder(**kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
The base encoder interface for the encoder-decoder architecture.
- forward(X, *args)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.EncoderBlock(key_size, query_size, value_size, num_hiddens, norm_shape, ffn_num_input, ffn_num_hiddens, num_heads, dropout, use_bias=False, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X, valid_lens)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.EncoderDecoder(encoder, decoder, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
The base class for the encoder-decoder architecture.
- forward(enc_X, dec_X, *args)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.MaskLM(vocab_size, num_hiddens, num_inputs=768, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X, pred_positions)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.MaskedSoftmaxCELoss(weight: Optional[torch.Tensor] = None, size_average=None, ignore_index: int = -100, reduce=None, reduction: str = 'mean')[source]¶
Bases:
torch.nn.modules.loss.CrossEntropyLoss
The softmax cross-entropy loss with masks.
- forward(pred, label, valid_len)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- ignore_index: int¶
- class d2l.torch.MultiHeadAttention(key_size, query_size, value_size, num_hiddens, num_heads, dropout, bias=False, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(queries, keys, values, valid_lens)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.NextSentencePred(num_inputs, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.PositionWiseFFN(ffn_num_input, ffn_num_hiddens, ffn_num_outputs, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.PositionalEncoding(num_hiddens, dropout, max_len=1000)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(X)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.RNNModel(rnn_layer, vocab_size, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
The RNN model.
- forward(inputs, state)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.RNNModelScratch(vocab_size, num_hiddens, device, get_params, init_state, forward_fn)[source]¶
Bases:
object
An RNN Model implemented from scratch.
- class d2l.torch.RandomGenerator(sampling_weights)[source]¶
Bases:
object
Draw a random int in [0, n] according to n sampling weights.
- class d2l.torch.Residual(input_channels, num_channels, use_1x1conv=False, strides=1)[source]¶
Bases:
torch.nn.modules.module.Module
The Residual block of ResNet.
- forward(X)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
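A minimal shape check (following the ResNet section of the book): with matching channel counts and the default stride, the block preserves the input shape.
import torch
from d2l import torch as d2l

blk = d2l.Residual(input_channels=3, num_channels=3)
X = torch.rand(4, 3, 6, 6)   # (batch, channels, height, width)
print(blk(X).shape)          # torch.Size([4, 3, 6, 6])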
- class d2l.torch.SNLIDataset(dataset, num_steps, vocab=None)[source]¶
Bases:
torch.utils.data.dataset.Dataset
A customized dataset to load the SNLI dataset.
- class d2l.torch.Seq2SeqEncoder(vocab_size, embed_size, num_hiddens, num_layers, dropout=0, **kwargs)[source]¶
Bases:
d2l.torch.Encoder
The RNN encoder for sequence to sequence learning.
- forward(X, *args)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.SeqDataLoader(batch_size, num_steps, use_random_iter, max_tokens)[source]¶
Bases:
object
An iterator to load sequence data.
- class d2l.torch.TransformerEncoder(vocab_size, key_size, query_size, value_size, num_hiddens, norm_shape, ffn_num_input, ffn_num_hiddens, num_heads, num_layers, dropout, use_bias=False, **kwargs)[source]¶
Bases:
d2l.torch.Encoder
- forward(X, valid_lens, *args)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class d2l.torch.VOCSegDataset(is_train, crop_size, voc_dir)[source]¶
Bases:
torch.utils.data.dataset.Dataset
A customized dataset to load VOC dataset.
- class d2l.torch.Vocab(tokens=None, min_freq=0, reserved_tokens=None)[source]¶
Bases:
object
Vocabulary for text.
- d2l.torch.abs(input, *, out=None) → Tensor¶
Computes the absolute value of each element in input.
(19.7.2)¶
\[\text{out}_{i} = |\text{input}_{i}|\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> torch.abs(torch.tensor([-1, -2, 3])) tensor([ 1, 2, 3])
- d2l.torch.arange(start=0, end, step=1, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶
Returns a 1-D tensor of size \(\left\lceil \frac{\text{end} - \text{start}}{\text{step}} \right\rceil\) with values from the interval [start, end) taken with common difference step beginning from start.
Note that non-integer step is subject to floating point rounding errors when comparing against end; to avoid inconsistency, we advise adding a small epsilon to end in such cases.
(19.7.3)¶
\[\text{out}_{i+1} = \text{out}_{i} + \text{step}\]
- Args:
start (Number): the starting value for the set of points. Default: 0.
end (Number): the ending value for the set of points.
step (Number): the gap between each pair of adjacent points. Default: 1.
- Keyword args:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()). If dtype is not given, infer the data type from the other input arguments. If any of start, end, or stop are floating-point, the dtype is inferred to be the default dtype, see get_default_dtype(). Otherwise, the dtype is inferred to be torch.int64.
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.arange(5) tensor([ 0, 1, 2, 3, 4]) >>> torch.arange(1, 4) tensor([ 1, 2, 3]) >>> torch.arange(1, 2.5, 0.5) tensor([ 1.0000, 1.5000, 2.0000])
- d2l.torch.argmax(x, *args, **kwargs)¶
- d2l.torch.astype(x, *args, **kwargs)¶
- d2l.torch.box_center_to_corner(boxes)[source]¶
Convert from (center, width, height) to (upper_left, bottom_right)
- d2l.torch.box_corner_to_center(boxes)[source]¶
Convert from (upper_left, bottom_right) to (center, width, height)
- d2l.torch.box_iou(boxes1, boxes2)[source]¶
Compute IOU between two sets of boxes of shape (N,4) and (M,4).
- d2l.torch.build_array_nmt(lines, vocab, num_steps)[source]¶
Transform text sequences of machine translation into minibatches.
- d2l.torch.concat()¶
cat(tensors, dim=0, *, out=None) -> Tensor
Concatenates the given sequence of seq tensors in the given dimension. All tensors must either have the same shape (except in the concatenating dimension) or be empty.
torch.cat() can be seen as an inverse operation for torch.split() and torch.chunk().
torch.cat() can be best understood via examples.
- Args:
tensors (sequence of Tensors): any python sequence of tensors of the same type. Non-empty tensors provided must have the same shape, except in the cat dimension.
dim (int, optional): the dimension over which the tensors are concatenated
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> x = torch.randn(2, 3) >>> x tensor([[ 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497]]) >>> torch.cat((x, x, x), 0) tensor([[ 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497], [ 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497], [ 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497]]) >>> torch.cat((x, x, x), 1) tensor([[ 0.6580, -1.0969, -0.4614, 0.6580, -1.0969, -0.4614, 0.6580, -1.0969, -0.4614], [-0.1034, -0.5790, 0.1497, -0.1034, -0.5790, 0.1497, -0.1034, -0.5790, 0.1497]])
- d2l.torch.cos(input, *, out=None) → Tensor¶
Returns a new tensor with the cosine of the elements of input.
(19.7.4)¶
\[\text{out}_{i} = \cos(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4) >>> a tensor([ 1.4309, 1.2706, -0.8562, 0.9796]) >>> torch.cos(a) tensor([ 0.1395, 0.2957, 0.6553, 0.5574])
- d2l.torch.cosh(input, *, out=None) → Tensor¶
Returns a new tensor with the hyperbolic cosine of the elements of input.
(19.7.5)¶
\[\text{out}_{i} = \cosh(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4) >>> a tensor([ 0.1632, 1.1835, -0.6979, -0.7325]) >>> torch.cosh(a) tensor([ 1.0133, 1.7860, 1.2536, 1.2805])
Note
When input is on the CPU, the implementation of torch.cosh may use the Sleef library, which rounds very large results to infinity or negative infinity. See here for details.
- class d2l.torch.defaultdict¶
Bases:
dict
defaultdict(default_factory[, …]) –> dict with default factory
The default factory is called without arguments to produce a new value when a key is not present, in __getitem__ only. A defaultdict compares equal to a dict with the same items. All remaining arguments are treated the same as if they were passed to the dict constructor, including keyword arguments.
- copy() → a shallow copy of D.¶
- default_factory¶
Factory for default value called by __missing__().
- d2l.torch.download(name, cache_dir='../data')[source]¶
Download a file inserted into DATA_HUB, return the local filename.
- d2l.torch.evaluate_accuracy_gpu(net, data_iter, device=None)[source]¶
Compute the accuracy for a model on a dataset using a GPU.
- d2l.torch.evaluate_loss(net, data_iter, loss)[source]¶
Evaluate the loss of a model on the given dataset.
- d2l.torch.exp(input, *, out=None) → Tensor¶
Returns a new tensor with the exponential of the elements of the input tensor input.
(19.7.6)¶
\[y_{i} = e^{x_{i}}\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> torch.exp(torch.tensor([0, math.log(2.)])) tensor([ 1., 2.])
- d2l.torch.eye(n, m=None, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶
Returns a 2-D tensor with ones on the diagonal and zeros elsewhere.
- Args:
n (int): the number of rows
m (int, optional): the number of columns with default being n
- Keyword arguments:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
- Returns:
Tensor: A 2-D tensor with ones on the diagonal and zeros elsewhere
Example:
>>> torch.eye(3) tensor([[ 1., 0., 0.], [ 0., 1., 0.], [ 0., 0., 1.]])
- d2l.torch.get_fashion_mnist_labels(labels)[source]¶
Return text labels for the Fashion-MNIST dataset.
- d2l.torch.linspace(start, end, steps, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶
Creates a one-dimensional tensor of size steps whose values are evenly spaced from start to end, inclusive. That is, the values are:
(19.7.7)¶
\[(\text{start}, \text{start} + \frac{\text{end} - \text{start}}{\text{steps} - 1}, \ldots, \text{start} + (\text{steps} - 2) * \frac{\text{end} - \text{start}}{\text{steps} - 1}, \text{end})\]
Warning
Not providing a value for steps is deprecated. For backwards compatibility, not providing a value for steps will create a tensor with 100 elements. Note that this behavior is not reflected in the documented function signature and should not be relied on. In a future PyTorch release, failing to provide a value for steps will throw a runtime error.
- Args:
start (float): the starting value for the set of points
end (float): the ending value for the set of points
steps (int): size of the constructed tensor
- Keyword arguments:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.linspace(3, 10, steps=5) tensor([ 3.0000, 4.7500, 6.5000, 8.2500, 10.0000]) >>> torch.linspace(-10, 10, steps=5) tensor([-10., -5., 0., 5., 10.]) >>> torch.linspace(start=-10, end=10, steps=5) tensor([-10., -5., 0., 5., 10.]) >>> torch.linspace(start=-10, end=10, steps=1) tensor([-10.])
- d2l.torch.load_array(data_arrays, batch_size, is_train=True)[source]¶
Construct a PyTorch data iterator.
- d2l.torch.load_corpus_time_machine(max_tokens=-1)[source]¶
Return token indices and the vocabulary of the time machine dataset.
- d2l.torch.load_data_fashion_mnist(batch_size, resize=None)[source]¶
Download the Fashion-MNIST dataset and then load it into memory.
- d2l.torch.load_data_nmt(batch_size, num_steps, num_examples=600)[source]¶
Return the iterator and the vocabularies of the translation dataset.
- d2l.torch.load_data_snli(batch_size, num_steps=50)[source]¶
Download the SNLI dataset and return data iterators and vocabulary.
- d2l.torch.load_data_time_machine(batch_size, num_steps, use_random_iter=False, max_tokens=10000)[source]¶
Return the iterator and the vocabulary of the time machine dataset.
- d2l.torch.load_data_voc(batch_size, crop_size)[source]¶
Download and load the VOC2012 semantic dataset.
- d2l.torch.log(input, *, out=None) → Tensor¶
Returns a new tensor with the natural logarithm of the elements of input.
(19.7.8)¶
\[y_{i} = \log_{e} (x_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(5) >>> a tensor([-0.7168, -0.5471, -0.8933, -1.4428, -0.1190]) >>> torch.log(a) tensor([ nan, nan, nan, nan, nan])
- d2l.torch.masked_softmax(X, valid_lens)[source]¶
Perform softmax operation by masking elements on the last axis.
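A sketch of the masking behavior (shapes follow the attention chapters; values are random):
>>> import torch
>>> from d2l import torch as d2l
>>> X = torch.rand(2, 2, 4)  # (batch size, no. of queries, no. of keys)
>>> valid_lens = torch.tensor([2, 3])
>>> d2l.masked_softmax(X, valid_lens)  # columns beyond each valid length get probability 0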
- d2l.torch.match_anchor_to_bbox(ground_truth, anchors, device, iou_threshold=0.5)[source]¶
Assign ground-truth bounding boxes to anchor boxes similar to them.
- d2l.torch.matmul(input, other, *, out=None) Tensor ¶
Matrix product of two tensors.
The behavior depends on the dimensionality of the tensors as follows:
If both tensors are 1-dimensional, the dot product (scalar) is returned.
If both arguments are 2-dimensional, the matrix-matrix product is returned.
If the first argument is 1-dimensional and the second argument is 2-dimensional, a 1 is prepended to its dimension for the purpose of the matrix multiply. After the matrix multiply, the prepended dimension is removed.
If the first argument is 2-dimensional and the second argument is 1-dimensional, the matrix-vector product is returned.
If both arguments are at least 1-dimensional and at least one argument is N-dimensional (where N > 2), then a batched matrix multiply is returned. If the first argument is 1-dimensional, a 1 is prepended to its dimension for the purpose of the batched matrix multiply and removed after. If the second argument is 1-dimensional, a 1 is appended to its dimension for the purpose of the batched matrix multiply and removed after. The non-matrix (i.e. batch) dimensions are broadcasted (and thus must be broadcastable). For example, if input is a \((j \times 1 \times n \times n)\) tensor and other is a \((k \times n \times n)\) tensor, out will be a \((j \times k \times n \times n)\) tensor.
Note that the broadcasting logic only looks at the batch dimensions when determining if the inputs are broadcastable, and not the matrix dimensions. For example, if input is a \((j \times 1 \times n \times m)\) tensor and other is a \((k \times m \times p)\) tensor, these inputs are valid for broadcasting even though the final two dimensions (i.e. the matrix dimensions) are different. out will be a \((j \times k \times n \times p)\) tensor.
This operator supports TensorFloat32.
Note
The 1-dimensional dot product version of this function does not support an out parameter.
- Arguments:
input (Tensor): the first tensor to be multiplied
other (Tensor): the second tensor to be multiplied
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> # vector x vector
>>> tensor1 = torch.randn(3)
>>> tensor2 = torch.randn(3)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([])
>>> # matrix x vector
>>> tensor1 = torch.randn(3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([3])
>>> # batched matrix x broadcasted vector
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3])
>>> # batched matrix x batched matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(10, 4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
>>> # batched matrix x broadcasted matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
- d2l.torch.meshgrid(*tensors)[source]¶
Take \(N\) tensors, each of which can be either scalar or 1-dimensional vector, and create \(N\) N-dimensional grids, where the \(i\) th grid is defined by expanding the \(i\) th input over dimensions defined by other inputs.
- Args:
- tensors (list of Tensor): list of scalars or 1 dimensional tensors. Scalars will be
treated as tensors of size \((1,)\) automatically
- Returns:
seq (sequence of Tensors): If the input has \(k\) tensors of size \((N_1,), (N_2,), \ldots , (N_k,)\), then the output would also have \(k\) tensors, where all tensors are of size \((N_1, N_2, \ldots , N_k)\).
Example:
>>> x = torch.tensor([1, 2, 3])
>>> y = torch.tensor([4, 5, 6])
>>> grid_x, grid_y = torch.meshgrid(x, y)
>>> grid_x
tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])
>>> grid_y
tensor([[4, 5, 6],
        [4, 5, 6],
        [4, 5, 6]])
- d2l.torch.multibox_detection(cls_probs, offset_preds, anchors, nms_threshold=0.5, pos_threshold=0.00999999978)[source]¶
- d2l.torch.normal(mean, std, *, generator=None, out=None) Tensor ¶
Returns a tensor of random numbers drawn from separate normal distributions whose mean and standard deviation are given.
The mean is a tensor with the mean of each output element’s normal distribution.
The std is a tensor with the standard deviation of each output element’s normal distribution.
The shapes of mean and std don’t need to match, but the total number of elements in each tensor needs to be the same.
Note
When the shapes do not match, the shape of mean is used as the shape for the returned output tensor.
- Args:
mean (Tensor): the tensor of per-element means
std (Tensor): the tensor of per-element standard deviations
- Keyword args:
generator (torch.Generator, optional): a pseudorandom number generator for sampling
out (Tensor, optional): the output tensor.
Example:
>>> torch.normal(mean=torch.arange(1., 11.), std=torch.arange(1, 0, -0.1))
tensor([  1.0425,   3.5672,   2.7969,   4.2925,   4.7229,   6.2134,   8.0505,
          8.1408,   9.0563,  10.0566])
- d2l.torch.normal(mean=0.0, std, *, out=None) Tensor ¶
Similar to the function above, but the means are shared among all drawn elements.
- Args:
mean (float, optional): the mean for all distributions
std (Tensor): the tensor of per-element standard deviations
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> torch.normal(mean=0.5, std=torch.arange(1., 6.))
tensor([-1.2793, -1.0732, -2.0687,  5.1177, -1.2303])
- d2l.torch.normal(mean, std=1.0, *, out=None) Tensor ¶
Similar to the function above, but the standard-deviations are shared among all drawn elements.
- Args:
mean (Tensor): the tensor of per-element means
std (float, optional): the standard deviation for all distributions
- Keyword args:
out (Tensor, optional): the output tensor
Example:
>>> torch.normal(mean=torch.arange(1., 6.))
tensor([ 1.1552,  2.6148,  2.6535,  5.8318,  4.2361])
- d2l.torch.normal(mean, std, size, *, out=None) Tensor ¶
Similar to the function above, but the means and standard deviations are shared among all drawn elements. The resulting tensor has size given by size.
- Args:
mean (float): the mean for all distributions
std (float): the standard deviation for all distributions
size (int…): a sequence of integers defining the shape of the output tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> torch.normal(2, 3, size=(1, 4))
tensor([[-1.3987, -1.9544,  3.6048,  0.7909]])
- d2l.torch.numpy(x, *args, **kwargs)¶
- d2l.torch.ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) Tensor ¶
Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument size.
- Args:
- size (int…): a sequence of integers defining the shape of the output tensor.
Can be a variable number of arguments or a collection like a list or tuple.
- Keyword arguments:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.ones(2, 3)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])
>>> torch.ones(5)
tensor([ 1.,  1.,  1.,  1.,  1.])
- d2l.torch.plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None)[source]¶
Plot data points.
- d2l.torch.predict_ch8(prefix, num_preds, net, vocab, device)[source]¶
Generate new characters following the prefix.
- d2l.torch.predict_seq2seq(net, src_sentence, src_vocab, tgt_vocab, num_steps, device, save_attention_weights=False)[source]¶
Predict for sequence to sequence.
- d2l.torch.rand(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) Tensor ¶
Returns a tensor filled with random numbers from a uniform distribution on the interval \([0, 1)\).
The shape of the tensor is defined by the variable argument size.
- Args:
- size (int…): a sequence of integers defining the shape of the output tensor.
Can be a variable number of arguments or a collection like a list or tuple.
- Keyword args:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.rand(4)
tensor([ 0.5204,  0.2503,  0.3525,  0.5673])
>>> torch.rand(2, 3)
tensor([[ 0.8237,  0.5781,  0.6879],
        [ 0.3816,  0.7249,  0.0998]])
- d2l.torch.read_snli(data_dir, is_train)[source]¶
Read the SNLI dataset into premises, hypotheses, and labels.
- d2l.torch.reduce_sum(x, *args, **kwargs)¶
- d2l.torch.reshape(x, *args, **kwargs)¶
- d2l.torch.seq_data_iter_random(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using random sampling.
- d2l.torch.seq_data_iter_sequential(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using sequential partitioning.
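A sketch on a toy corpus of token indices 0..34 (the same call works for seq_data_iter_random):
>>> from d2l import torch as d2l
>>> my_seq = list(range(35))
>>> for X, Y in d2l.seq_data_iter_sequential(my_seq, batch_size=2, num_steps=5):
...     print(X.shape, Y.shape)  # Y is X shifted one time step ahead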
- d2l.torch.set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)[source]¶
Set the axes for matplotlib.
- d2l.torch.show_heatmaps(matrices, xlabel, ylabel, titles=None, figsize=(2.5, 2.5), cmap='Reds')[source]¶
- d2l.torch.show_images(imgs, num_rows, num_cols, titles=None, scale=1.5)[source]¶
Plot a list of images.
- d2l.torch.sin(input, *, out=None) Tensor ¶
Returns a new tensor with the sine of the elements of input.
(19.7.9)¶\[\text{out}_{i} = \sin(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.5461,  0.1347, -2.7266, -0.2746])
>>> torch.sin(a)
tensor([-0.5194,  0.1343, -0.4032, -0.2711])
- d2l.torch.sinh(input, *, out=None) Tensor ¶
Returns a new tensor with the hyperbolic sine of the elements of input.
(19.7.10)¶\[\text{out}_{i} = \sinh(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.5380, -0.8632, -0.1265,  0.9399])
>>> torch.sinh(a)
tensor([ 0.5644, -0.9744, -0.1268,  1.0845])
Note
When input is on the CPU, the implementation of torch.sinh may use the Sleef library, which rounds very large results to infinity or negative infinity.
- d2l.torch.size(x, *args, **kwargs)¶
- d2l.torch.stack(tensors, dim=0, *, out=None) Tensor ¶
Concatenates a sequence of tensors along a new dimension.
All tensors need to be of the same size.
- Arguments:
tensors (sequence of Tensors): sequence of tensors to concatenate
dim (int): dimension to insert. Has to be between 0 and the number of dimensions of concatenated tensors (inclusive)
- Keyword args:
out (Tensor, optional): the output tensor.
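The entry above has no example; a small sketch of the inserted dimension:
>>> import torch
>>> a, b = torch.randn(3, 4), torch.randn(3, 4)
>>> torch.stack([a, b], dim=0).shape
torch.Size([2, 3, 4])
>>> torch.stack([a, b], dim=1).shape
torch.Size([3, 2, 4])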
- d2l.torch.tanh(input, *, out=None) Tensor ¶
Returns a new tensor with the hyperbolic tangent of the elements of input.
(19.7.11)¶\[\text{out}_{i} = \tanh(\text{input}_{i})\]
- Args:
input (Tensor): the input tensor.
- Keyword args:
out (Tensor, optional): the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.8986, -0.7279,  1.1745,  0.2611])
>>> torch.tanh(a)
tensor([ 0.7156, -0.6218,  0.8257,  0.2553])
- d2l.torch.tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) Tensor ¶
Constructs a tensor with data.
Warning
torch.tensor() always copies data. If you have a Tensor data and want to avoid a copy, use torch.Tensor.requires_grad_() or torch.Tensor.detach(). If you have a NumPy ndarray and want to avoid a copy, use torch.as_tensor().
Warning
When data is a tensor x, torch.tensor() reads out ‘the data’ from whatever it is passed, and constructs a leaf variable. Therefore torch.tensor(x) is equivalent to x.clone().detach() and torch.tensor(x, requires_grad=True) is equivalent to x.clone().detach().requires_grad_(True). The equivalents using clone() and detach() are recommended.
- Args:
- data (array_like): Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
- Keyword args:
- dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, infers data type from data.
- device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
- requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
- pin_memory (bool, optional): If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
Example:
>>> torch.tensor([[0.1, 1.2], [2.2, 3.1], [4.9, 5.2]])
tensor([[ 0.1000,  1.2000],
        [ 2.2000,  3.1000],
        [ 4.9000,  5.2000]])
>>> torch.tensor([0, 1])  # Type inference on data
tensor([ 0,  1])
>>> torch.tensor([[0.11111, 0.222222, 0.3333333]],
...              dtype=torch.float64,
...              device=torch.device('cuda:0'))  # creates a torch.cuda.DoubleTensor
tensor([[ 0.1111,  0.2222,  0.3333]], dtype=torch.float64, device='cuda:0')
>>> torch.tensor(3.14159)  # Create a scalar (zero-dimensional tensor)
tensor(3.1416)
>>> torch.tensor([])  # Create an empty tensor (of size (0,))
tensor([])
- d2l.torch.to(x, *args, **kwargs)¶
- d2l.torch.train_2d(trainer, steps=20)[source]¶
Optimize a 2-dim objective function with a customized trainer.
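A sketch with a hand-written gradient-descent trainer (gd_2d is illustrative and not part of the package; in the book's implementation optimization starts from (-5, -2)):
>>> from d2l import torch as d2l
>>> def gd_2d(x1, x2, s1, s2):
...     # One gradient step on f(x1, x2) = x1**2 + 2 * x2**2 with learning rate 0.1
...     return x1 - 0.1 * 2 * x1, x2 - 0.1 * 4 * x2, 0, 0
>>> results = d2l.train_2d(gd_2d)  # returns the recorded (x1, x2) trajectory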
- d2l.torch.train_ch11(trainer_fn, states, hyperparams, data_iter, feature_dim, num_epochs=2)[source]¶
- d2l.torch.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices=[device(type='cuda', index=0), device(type='cuda', index=1), device(type='cuda', index=2), device(type='cuda', index=3)])[source]¶
- d2l.torch.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)[source]¶
Train a model (defined in Chapter 3).
- d2l.torch.train_ch6(net, train_iter, test_iter, num_epochs, lr, device)[source]¶
Train a model with a GPU (defined in Chapter 6).
- d2l.torch.train_ch8(net, train_iter, vocab, lr, num_epochs, device, use_random_iter=False)[source]¶
Train a model (defined in Chapter 8).
- d2l.torch.train_epoch_ch3(net, train_iter, loss, updater)[source]¶
The training loop defined in Chapter 3.
- d2l.torch.train_epoch_ch8(net, train_iter, loss, updater, device, use_random_iter)[source]¶
Train a net within one epoch (defined in Chapter 8).
- d2l.torch.train_seq2seq(net, data_iter, lr, num_epochs, tgt_vocab, device)[source]¶
Train a model for sequence to sequence.
- d2l.torch.transpose(x, *args, **kwargs)¶
- d2l.torch.voc_rand_crop(feature, label, height, width)[source]¶
Randomly crop for both feature and label images.
- d2l.torch.zeros(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) Tensor ¶
Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument size.
- Args:
- size (int…): a sequence of integers defining the shape of the output tensor.
Can be a variable number of arguments or a collection like a list or tuple.
- Keyword args:
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.zeros(2, 3)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])
>>> torch.zeros(5)
tensor([ 0.,  0.,  0.,  0.,  0.])
- class d2l.tensorflow.Animator(xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), nrows=1, ncols=1, figsize=(3.5, 2.5))[source]¶
Bases:
object
For plotting data in animation.
- class d2l.tensorflow.RNNModelScratch(vocab_size, num_hiddens, init_state, forward_fn)[source]¶
Bases:
object
An RNN Model implemented from scratch.
- class d2l.tensorflow.Residual(*args, **kwargs)[source]¶
Bases:
tensorflow.python.keras.engine.training.Model
The Residual block of ResNet.
- call(X)[source]¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
- Arguments:
inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask: A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns:
A tensor if there is a single output, or a list of tensors if there is more than one output.
- class d2l.tensorflow.SeqDataLoader(batch_size, num_steps, use_random_iter, max_tokens)[source]¶
Bases:
object
An iterator to load sequence data.
- class d2l.tensorflow.TrainCallback(net, train_iter, test_iter, num_epochs, device_name)[source]¶
Bases:
tensorflow.python.keras.callbacks.Callback
A callback to visualize the training progress.
- on_epoch_begin(epoch, logs=None)[source]¶
Called at the start of an epoch.
Subclasses should override for any actions to run. This function should only be called during TRAIN mode.
- Arguments:
epoch: Integer, index of epoch.
logs: Dict. Currently no data is passed to this argument for this method but that may change in the future.
- on_epoch_end(epoch, logs)[source]¶
Called at the end of an epoch.
Subclasses should override for any actions to run. This function should only be called during TRAIN mode.
- Arguments:
epoch: Integer, index of epoch.
logs: Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_.
- class d2l.tensorflow.Updater(params, lr)[source]¶
Bases:
object
For updating parameters using minibatch stochastic gradient descent.
- class d2l.tensorflow.Vocab(tokens=None, min_freq=0, reserved_tokens=None)[source]¶
Bases:
object
Vocabulary for text.
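Usage sketch (index 0 is reserved for the unknown token in the book's implementation; exact indices depend on token frequencies):
>>> from d2l import tensorflow as d2l
>>> tokens = [['the', 'time', 'machine'], ['the', 'time']]
>>> vocab = d2l.Vocab(tokens)
>>> vocab['the'], vocab.to_tokens(vocab['the'])
(1, 'the')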
- d2l.tensorflow.abs(x, name=None)[source]¶
Computes the absolute value of a tensor.
Given a tensor of integer or floating-point values, this operation returns a tensor of the same type, where each element contains the absolute value of the corresponding element in the input.
Given a tensor x of complex numbers, this operation returns a tensor of type float32 or float64 that is the absolute value of each element in x. For a complex number \(a + bj\), its absolute value is computed as \(\sqrt{a^2 + b^2}\). For example:
>>> x = tf.constant([[-2.25 + 4.75j], [-3.25 + 5.75j]])
>>> tf.abs(x)
<tf.Tensor: shape=(2, 1), dtype=float64, numpy=
array([[5.25594901],
       [6.60492241]])>
- Args:
- x: A Tensor or SparseTensor of type float16, float32, float64,
int32, int64, complex64 or complex128.
name: A name for the operation (optional).
- Returns:
- A Tensor or SparseTensor of the same size, type and sparsity as x,
with absolute values. Note, for complex64 or complex128 input, the returned Tensor will be of type float32 or float64, respectively.
If x is a SparseTensor, returns SparseTensor(x.indices, tf.math.abs(x.values, …), x.dense_shape)
- d2l.tensorflow.arange(start, limit=None, delta=1, dtype=None, name='range')¶
Creates a sequence of numbers.
Creates a sequence of numbers that begins at start and extends by increments of delta up to but not including limit.
The dtype of the resulting tensor is inferred from the inputs unless it is provided explicitly.
Like the Python builtin range, start defaults to 0, so that range(n) = range(0, n).
For example:
>>> start = 3
>>> limit = 18
>>> delta = 3
>>> tf.range(start, limit, delta)
<tf.Tensor: shape=(5,), dtype=int32, numpy=array([ 3,  6,  9, 12, 15], dtype=int32)>

>>> start = 3
>>> limit = 1
>>> delta = -0.5
>>> tf.range(start, limit, delta)
<tf.Tensor: shape=(4,), dtype=float32, numpy=array([3. , 2.5, 2. , 1.5], dtype=float32)>

>>> limit = 5
>>> tf.range(limit)
<tf.Tensor: shape=(5,), dtype=int32, numpy=array([0, 1, 2, 3, 4], dtype=int32)>
- Args:
- start: A 0-D Tensor (scalar). Acts as first entry in the range if limit
is not None; otherwise, acts as range limit and first entry defaults to 0.
- limit: A 0-D Tensor (scalar). Upper limit of sequence, exclusive. If None,
defaults to the value of start while the first entry of the range defaults to 0.
- delta: A 0-D Tensor (scalar). Number that increments start. Defaults to 1.
dtype: The type of the elements of the resulting tensor.
name: A name for the operation. Defaults to “range”.
- Returns:
A 1-D Tensor of type dtype.
NumPy compatibility: equivalent to np.arange.
- d2l.tensorflow.argmax(input, axis=None, output_type=tf.int64, name=None)[source]¶
Returns the index with the largest value across axes of a tensor.
In case of ties, it returns the smallest index.
For example:
>>> A = tf.constant([2, 20, 30, 3, 6])
>>> tf.math.argmax(A)  # A[2] is maximum in tensor A
<tf.Tensor: shape=(), dtype=int64, numpy=2>
>>> B = tf.constant([[2, 20, 30, 3, 6], [3, 11, 16, 1, 8],
...                  [14, 45, 23, 5, 27]])
>>> tf.math.argmax(B, 0)
<tf.Tensor: shape=(5,), dtype=int64, numpy=array([2, 2, 0, 2, 2])>
>>> tf.math.argmax(B, 1)
<tf.Tensor: shape=(3,), dtype=int64, numpy=array([2, 2, 1])>
>>> C = tf.constant([0, 0, 0, 0])
>>> tf.math.argmax(C)  # Returns smallest index in case of ties
<tf.Tensor: shape=(), dtype=int64, numpy=0>
- Args:
input: A Tensor.
axis: An integer, the axis to reduce across. Defaults to 0.
output_type: An optional output dtype (tf.int32 or tf.int64). Defaults to tf.int64.
name: An optional name for the operation.
- Returns:
A Tensor of type output_type.
- d2l.tensorflow.astype(x, dtype, name=None)¶
Casts a tensor to a new type.
The operation casts x (in case of Tensor) or x.values (in case of SparseTensor or IndexedSlices) to dtype.
For example:
>>> x = tf.constant([1.8, 2.2], dtype=tf.float32)
>>> tf.dtypes.cast(x, tf.int32)
<tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2], dtype=int32)>
The operation supports data types (for x and dtype) of uint8, uint16, uint32, uint64, int8, int16, int32, int64, float16, float32, float64, complex64, complex128, bfloat16. In case of casting from complex types (complex64, complex128) to real types, only the real part of x is returned. In case of casting from real types to complex types (complex64, complex128), the imaginary part of the returned value is set to 0. The handling of complex types here matches the behavior of numpy.
- Args:
- x: A Tensor or SparseTensor or IndexedSlices of numeric type. It could
be uint8, uint16, uint32, uint64, int8, int16, int32, int64, float16, float32, float64, complex64, complex128, bfloat16.
- dtype: The destination type. The list of supported dtypes is the same as
x.
name: A name for the operation (optional).
- Returns:
- A Tensor or SparseTensor or IndexedSlices with same shape as x and
same type as dtype.
- Raises:
TypeError: If x cannot be cast to the dtype.
- d2l.tensorflow.box_center_to_corner(boxes)[source]¶
Convert from (center, width, height) to (upper_left, bottom_right)
- d2l.tensorflow.box_corner_to_center(boxes)[source]¶
Convert from (upper_left, bottom_right) to (center, width, height)
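A round-trip sketch (the coordinates are the dog bounding box used in the object-detection chapter, given as (upper-left x, upper-left y, lower-right x, lower-right y)):
>>> import tensorflow as tf
>>> from d2l import tensorflow as d2l
>>> boxes = tf.constant([[60.0, 45.0, 378.0, 516.0]])
>>> centers = d2l.box_corner_to_center(boxes)  # (center x, center y, width, height)
>>> d2l.box_center_to_corner(centers)  # recovers the original corner coordinates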
- d2l.tensorflow.build_array_nmt(lines, vocab, num_steps)[source]¶
Transform text sequences of machine translation into minibatches.
- d2l.tensorflow.concat(values, axis, name='concat')[source]¶
Concatenates tensors along one dimension.
See also tf.tile, tf.stack, tf.repeat.
Concatenates the list of tensors values along dimension axis. If values[i].shape = [D0, D1, … Daxis(i), …Dn], the concatenated result has shape
[D0, D1, … Raxis, …Dn]
where
Raxis = sum(Daxis(i))
That is, the data from the input tensors is joined along the axis dimension.
The number of dimensions of the input tensors must match, and all dimensions except axis must be equal.
For example:
>>> t1 = [[1, 2, 3], [4, 5, 6]]
>>> t2 = [[7, 8, 9], [10, 11, 12]]
>>> tf.concat([t1, t2], 0)
<tf.Tensor: shape=(4, 3), dtype=int32, numpy=
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]], dtype=int32)>
>>> tf.concat([t1, t2], 1)
<tf.Tensor: shape=(2, 6), dtype=int32, numpy=
array([[ 1,  2,  3,  7,  8,  9],
       [ 4,  5,  6, 10, 11, 12]], dtype=int32)>
As in Python, the axis could also be a negative number. A negative axis is interpreted as counting from the end of the rank, i.e., the axis + rank(values)-th dimension.
For example:
>>> t1 = [[[1, 2], [2, 3]], [[4, 4], [5, 3]]]
>>> t2 = [[[7, 4], [8, 4]], [[2, 10], [15, 11]]]
>>> tf.concat([t1, t2], -1)
<tf.Tensor: shape=(2, 2, 4), dtype=int32, numpy=
array([[[ 1,  2,  7,  4],
        [ 2,  3,  8,  4]],
       [[ 4,  4,  2, 10],
        [ 5,  3, 15, 11]]], dtype=int32)>
Note: If you are concatenating along a new axis consider using stack. E.g.
tf.concat([tf.expand_dims(t, axis) for t in tensors], axis)
can be rewritten as
tf.stack(tensors, axis=axis)
- Args:
values: A list of Tensor objects or a single Tensor.
axis: 0-D int32 Tensor. Dimension along which to concatenate. Must be in the range [-rank(values), rank(values)). As in Python, indexing for axis is 0-based. A positive axis in the range [0, rank(values)) refers to the axis-th dimension, and a negative axis refers to the axis + rank(values)-th dimension.
name: A name for the operation (optional).
- Returns:
A Tensor resulting from concatenation of the input tensors.
- d2l.tensorflow.cos(x, name=None)[source]¶
Computes cos of x element-wise.
Given an input tensor, this function computes cosine of every element in the tensor. Input range is (-inf, inf) and output range is [-1,1]. If input lies outside the boundary, nan is returned.
x = tf.constant([-float("inf"), -9, -0.5, 1, 1.2, 200, 10000, float("inf")])
tf.math.cos(x)  # ==> [nan -0.91113025 0.87758255 0.5403023 0.36235774 0.48718765 -0.95215535 nan]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
- d2l.tensorflow.cosh(x, name=None)[source]¶
Computes hyperbolic cosine of x element-wise.
Given an input tensor, this function computes hyperbolic cosine of every element in the tensor. Input range is [-inf, inf] and output range is [1, inf].
x = tf.constant([-float("inf"), -9, -0.5, 1, 1.2, 2, 10, float("inf")])
tf.math.cosh(x)  # ==> [inf 4.0515420e+03 1.1276259e+00 1.5430807e+00 1.8106556e+00 3.7621956e+00 1.1013233e+04 inf]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
- class d2l.tensorflow.defaultdict¶
Bases:
dict
defaultdict(default_factory[, …]) -> dict with default factory
The default factory is called without arguments to produce a new value when a key is not present, in __getitem__ only. A defaultdict compares equal to a dict with the same items. All remaining arguments are treated the same as if they were passed to the dict constructor, including keyword arguments.
- copy() -> a shallow copy of D.¶
- default_factory¶
Factory for default value called by __missing__().
- d2l.tensorflow.download(name, cache_dir='../data')[source]¶
Download a file inserted into DATA_HUB, return the local filename.
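Usage sketch (the key must first be registered in d2l.DATA_HUB with a URL and SHA-1 hash, as the book's chapters do; the hash below is the one the book registers for timemachine.txt):
>>> from d2l import tensorflow as d2l
>>> d2l.DATA_HUB['time_machine'] = (d2l.DATA_URL + 'timemachine.txt',
...                                 '090b5e7e70c295757f55df93cb0a180b9691891a')
>>> fname = d2l.download('time_machine')  # cached under ../data after the first call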
- d2l.tensorflow.evaluate_accuracy(net, data_iter)[source]¶
Compute the accuracy for a model on a dataset.
- d2l.tensorflow.evaluate_loss(net, data_iter, loss)[source]¶
Evaluate the loss of a model on the given dataset.
- d2l.tensorflow.exp(x, name=None)[source]¶
Computes exponential of x element-wise. \(y = e^x\).
This function computes the exponential of the input tensor element-wise. i.e. math.exp(x) or \(e^x\), where x is the input tensor. \(e\) denotes Euler’s number and is approximately equal to 2.718281. Output is positive for any real input.
>>> x = tf.constant(2.0)
>>> tf.math.exp(x)
<tf.Tensor: shape=(), dtype=float32, numpy=7.389056>
>>> x = tf.constant([2.0, 8.0])
>>> tf.math.exp(x)
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([   7.389056, 2980.958   ], dtype=float32)>
For complex numbers, the exponential value is calculated as \(e^{x+iy} = e^x e^{iy} = e^x (\cos(y) + i\sin(y))\)
For 1+1j the value would be computed as: \(e^1 (\cos(1) + i\sin(1)) = 2.7182817 \times (0.5403023+0.84147096j)\)
>>> x = tf.constant(1 + 1j)
>>> tf.math.exp(x)
<tf.Tensor: shape=(), dtype=complex128, numpy=(1.4686939399158851+2.2873552871788423j)>
- Args:
- x: A tf.Tensor. Must be one of the following types: bfloat16, half,
float32, float64, complex64, complex128.
name: A name for the operation (optional).
- Returns:
A tf.Tensor. Has the same type as x.
NumPy compatibility: equivalent to np.exp.
- d2l.tensorflow.eye(num_rows, num_columns=None, batch_shape=None, dtype=tf.float32, name=None)[source]¶
Construct an identity matrix, or a batch of matrices.
See also tf.ones, tf.zeros, tf.fill, tf.one_hot.
# Construct one identity matrix.
tf.eye(2) ==> [[1., 0.],
               [0., 1.]]

# Construct a batch of 3 identity matrices, each 2 x 2.
# batch_identity[i, :, :] is a 2 x 2 identity matrix, i = 0, 1, 2.
batch_identity = tf.eye(2, batch_shape=[3])

# Construct one 2 x 3 "identity" matrix
tf.eye(2, num_columns=3) ==> [[ 1., 0., 0.],
                              [ 0., 1., 0.]]
- Args:
- num_rows: Non-negative int32 scalar Tensor giving the number of rows
in each batch matrix.
- num_columns: Optional non-negative int32 scalar Tensor giving the number
of columns in each batch matrix. Defaults to num_rows.
- batch_shape: A list or tuple of Python integers or a 1-D int32 Tensor.
If provided, the returned Tensor will have leading batch dimensions of this shape.
dtype: The type of an element in the resulting Tensor name: A name for this Op. Defaults to “eye”.
- Returns:
A Tensor of shape batch_shape + [num_rows, num_columns]
- d2l.tensorflow.get_fashion_mnist_labels(labels)[source]¶
Return text labels for the Fashion-MNIST dataset.
- d2l.tensorflow.linspace(start, stop, num, name=None, axis=0)¶
Generates evenly-spaced values in an interval along a given axis.
A sequence of num evenly-spaced values is generated beginning at start along a given axis. If num > 1, the values in the sequence increase by (stop - start) / (num - 1), so that the last one is exactly stop. If num <= 0, ValueError is raised.
Matches [np.linspace](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html)’s behaviour except when num == 0.
For example:
tf.linspace(10.0, 12.0, 3, name="linspace") => [ 10.0  11.0  12.0]
Start and stop can be tensors of arbitrary size:
>>> tf.linspace([0., 5.], [10., 40.], 5, axis=0)
<tf.Tensor: shape=(5, 2), dtype=float32, numpy=
array([[ 0.  ,  5.  ],
       [ 2.5 , 13.75],
       [ 5.  , 22.5 ],
       [ 7.5 , 31.25],
       [10.  , 40.  ]], dtype=float32)>
Axis is where the values will be generated (the dimension in the returned tensor which corresponds to the axis will be equal to num)
>>> tf.linspace([0., 5.], [10., 40.], 5, axis=-1)
<tf.Tensor: shape=(2, 5), dtype=float32, numpy=
array([[ 0.  ,  2.5 ,  5.  ,  7.5 , 10.  ],
       [ 5.  , 13.75, 22.5 , 31.25, 40.  ]], dtype=float32)>
- Args:
- start: A Tensor. Must be one of the following types: bfloat16,
float32, float64. N-D tensor. First entry in the range.
- stop: A Tensor. Must have the same type and shape as start. N-D tensor.
Last entry in the range.
- num: A Tensor. Must be one of the following types: int32, int64. 0-D
tensor. Number of values to generate.
name: A name for the operation (optional).
axis: Axis along which the operation is performed (used only when N-D tensors are provided).
- Returns:
A Tensor. Has the same type as start.
- d2l.tensorflow.load_array(data_arrays, batch_size, is_train=True)[source]¶
Construct a TensorFlow data iterator.
- d2l.tensorflow.load_corpus_time_machine(max_tokens=- 1)[source]¶
Return token indices and the vocabulary of the time machine dataset.
- d2l.tensorflow.load_data_fashion_mnist(batch_size, resize=None)[source]¶
Download the Fashion-MNIST dataset and then load it into memory.
- d2l.tensorflow.load_data_nmt(batch_size, num_steps, num_examples=600)[source]¶
Return the iterator and the vocabularies of the translation dataset.
- d2l.tensorflow.load_data_time_machine(batch_size, num_steps, use_random_iter=False, max_tokens=10000)[source]¶
Return the iterator and the vocabulary of the time machine dataset.
- d2l.tensorflow.matmul(a, b, transpose_a=False, transpose_b=False, adjoint_a=False, adjoint_b=False, a_is_sparse=False, b_is_sparse=False, name=None)[source]¶
Multiplies matrix a by matrix b, producing a * b.
The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication dimensions, and any further outer dimensions specify matching batch size.
Both matrices must be of the same type. The supported types are: float16, float32, float64, int32, complex64, complex128.
Either matrix can be transposed or adjointed (conjugated and transposed) on the fly by setting the corresponding flag to True. These are False by default.
If one or both of the matrices contain a lot of zeros, a more efficient multiplication algorithm can be used by setting the corresponding a_is_sparse or b_is_sparse flag to True. These are False by default. This optimization is only available for plain matrices (rank-2 tensors) with datatypes bfloat16 or float32.
A simple 2-D tensor matrix multiplication:
>>> a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])
>>> a  # 2-D tensor
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>
>>> b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2])
>>> b  # 2-D tensor
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 7,  8],
       [ 9, 10],
       [11, 12]], dtype=int32)>
>>> c = tf.matmul(a, b)
>>> c  # `a` * `b`
<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 58,  64],
       [139, 154]], dtype=int32)>
A batch matrix multiplication with batch shape [2]:
>>> a = tf.constant(np.arange(1, 13, dtype=np.int32), shape=[2, 2, 3])
>>> a  # 3-D tensor
<tf.Tensor: shape=(2, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]], dtype=int32)>
>>> b = tf.constant(np.arange(13, 25, dtype=np.int32), shape=[2, 3, 2])
>>> b  # 3-D tensor
<tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
array([[[13, 14],
        [15, 16],
        [17, 18]],
       [[19, 20],
        [21, 22],
        [23, 24]]], dtype=int32)>
>>> c = tf.matmul(a, b)
>>> c  # `a` * `b`
<tf.Tensor: shape=(2, 2, 2), dtype=int32, numpy=
array([[[ 94, 100],
        [229, 244]],
       [[508, 532],
        [697, 730]]], dtype=int32)>
Since Python >= 3.5, the @ operator is supported (see [PEP 465](https://www.python.org/dev/peps/pep-0465/)). In TensorFlow, it simply calls the tf.matmul() function, so the following lines are equivalent:
>>> d = a @ b @ [[10], [11]]
>>> d = tf.matmul(tf.matmul(a, b), [[10], [11]])
- Args:
- a: tf.Tensor of type float16, float32, float64, int32,
complex64, complex128 and rank > 1.
b: tf.Tensor with same type and rank as a.
transpose_a: If True, a is transposed before multiplication.
transpose_b: If True, b is transposed before multiplication.
adjoint_a: If True, a is conjugated and transposed before multiplication.
- adjoint_b: If True, b is conjugated and transposed before
multiplication.
- a_is_sparse: If True, a is treated as a sparse matrix. Notice, this
does not support `tf.sparse.SparseTensor`, it just makes optimizations that assume most values in a are zero. See tf.sparse.sparse_dense_matmul for some support for tf.sparse.SparseTensor multiplication.
- b_is_sparse: If True, b is treated as a sparse matrix. Notice, this
does not support `tf.sparse.SparseTensor`, it just makes optimizations that assume most values in b are zero. See tf.sparse.sparse_dense_matmul for some support for tf.sparse.SparseTensor multiplication.
name: Name for the operation (optional).
- Returns:
A tf.Tensor of the same type as a and b where each inner-most matrix is the product of the corresponding matrices in a and b, e.g. if all transpose or adjoint attributes are False:
output[…, i, j] = sum_k (a[…, i, k] * b[…, k, j]), for all indices i, j.
Note: This is matrix product, not element-wise product.
- Raises:
- ValueError: If transpose_a and adjoint_a, or transpose_b and
adjoint_b are both set to True.
- d2l.tensorflow.meshgrid(*args, **kwargs)[source]¶
Broadcasts parameters for evaluation on an N-D grid.
Given N one-dimensional coordinate arrays *args, returns a list outputs of N-D coordinate arrays for evaluating expressions on an N-D grid.
Notes:
meshgrid supports cartesian (‘xy’) and matrix (‘ij’) indexing conventions. When the indexing argument is set to ‘xy’ (the default), the broadcasting instructions for the first two dimensions are swapped.
Examples:
Calling X, Y = meshgrid(x, y) with the tensors
x = [1, 2, 3]
y = [4, 5, 6]
X, Y = tf.meshgrid(x, y)
# X = [[1, 2, 3],
#      [1, 2, 3],
#      [1, 2, 3]]
# Y = [[4, 4, 4],
#      [5, 5, 5],
#      [6, 6, 6]]
- Args:
*args: `Tensor`s with rank 1.
**kwargs:
indexing: Either ‘xy’ or ‘ij’ (optional, default: ‘xy’).
name: A name for the operation (optional).
- Returns:
outputs: A list of N `Tensor`s with rank N.
- Raises:
TypeError: When no keyword arguments (kwargs) are passed. ValueError: When indexing keyword argument is not one of xy or ij.
- d2l.tensorflow.normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)¶
Outputs random values from a normal distribution.
Example that generates a new set of random values every time:
>>> tf.random.set_seed(5);
>>> tf.random.normal([4], 0, 1, tf.float32)
<tf.Tensor: shape=(4,), dtype=float32, numpy=..., dtype=float32)>
Example that outputs a reproducible result:
>>> tf.random.set_seed(5);
>>> tf.random.normal([2,2], 0, 1, tf.float32, seed=1)
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-1.3768897 , -0.01258316],
       [-0.169515  ,  1.0824056 ]], dtype=float32)>
In this case, we are setting both the global and operation-level seed to ensure this result is reproducible. See tf.random.set_seed for more information.
- Args:
shape: A 1-D integer Tensor or Python array. The shape of the output tensor.
mean: A Tensor or Python value of type dtype, broadcastable with stddev. The mean of the normal distribution.
- stddev: A Tensor or Python value of type dtype, broadcastable with mean.
The standard deviation of the normal distribution.
dtype: The type of the output.
seed: A Python integer. Used to create a random seed for the distribution. See tf.random.set_seed for behavior.
name: A name for the operation (optional).
- Returns:
A tensor of the specified shape filled with random normal values.
- d2l.tensorflow.numpy(x, *args, **kwargs)¶
- d2l.tensorflow.ones(shape, dtype=tf.float32, name=None)[source]¶
Creates a tensor with all elements set to one (1).
See also tf.ones_like, tf.zeros, tf.fill, tf.eye.
This operation returns a tensor of type dtype with shape shape and all elements set to one.
>>> tf.ones([3, 4], tf.int32)
<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]], dtype=int32)>
- Args:
- shape: A list of integers, a tuple of integers, or
a 1-D Tensor of type int32.
- dtype: Optional DType of an element in the resulting Tensor. Default is
tf.float32.
name: Optional string. A name for the operation.
- Returns:
A Tensor with all elements set to one (1).
- d2l.tensorflow.plot(X, Y=None, xlabel=None, ylabel=None, legend=None, xlim=None, ylim=None, xscale='linear', yscale='linear', fmts=('-', 'm--', 'g-.', 'r:'), figsize=(3.5, 2.5), axes=None)[source]¶
Plot data points.
- d2l.tensorflow.predict_ch8(prefix, num_preds, net, vocab, params)[source]¶
Generate new characters following the prefix.
- d2l.tensorflow.rand(shape, minval=0, maxval=None, dtype=tf.float32, seed=None, name=None)¶
Outputs random values from a uniform distribution.
The generated values follow a uniform distribution in the range [minval, maxval). The lower bound minval is included in the range, while the upper bound maxval is excluded.
For floats, the default range is [0, 1). For ints, at least maxval must be specified explicitly.
In the integer case, the random integers are slightly biased unless maxval - minval is an exact power of two. The bias is small for values of maxval - minval significantly smaller than the range of the output (either 2**32 or 2**64).
Examples:
>>> tf.random.uniform(shape=[2])
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([..., ...], dtype=float32)>
>>> tf.random.uniform(shape=[], minval=-1., maxval=0.)
<tf.Tensor: shape=(), dtype=float32, numpy=-...>
>>> tf.random.uniform(shape=[], minval=5, maxval=10, dtype=tf.int64)
<tf.Tensor: shape=(), dtype=int64, numpy=...>
The seed argument produces a deterministic sequence of tensors across multiple calls. To repeat that sequence, use tf.random.set_seed:
>>> tf.random.set_seed(5)
>>> tf.random.uniform(shape=[], maxval=3, dtype=tf.int32, seed=10)
<tf.Tensor: shape=(), dtype=int32, numpy=2>
>>> tf.random.uniform(shape=[], maxval=3, dtype=tf.int32, seed=10)
<tf.Tensor: shape=(), dtype=int32, numpy=0>
>>> tf.random.set_seed(5)
>>> tf.random.uniform(shape=[], maxval=3, dtype=tf.int32, seed=10)
<tf.Tensor: shape=(), dtype=int32, numpy=2>
>>> tf.random.uniform(shape=[], maxval=3, dtype=tf.int32, seed=10)
<tf.Tensor: shape=(), dtype=int32, numpy=0>
If tf.random.set_seed is not called but a seed argument is specified, small changes to function graphs or previously executed operations will change the returned value. See tf.random.set_seed for details.
- Args:
shape: A 1-D integer Tensor or Python array. The shape of the output tensor.
minval: A Tensor or Python value of type dtype, broadcastable with shape (for integer types, broadcasting is not supported, so it needs to be a scalar). The lower bound on the range of random values to generate (inclusive). Defaults to 0.
- maxval: A Tensor or Python value of type dtype, broadcastable with
shape (for integer types, broadcasting is not supported, so it needs to be a scalar). The upper bound on the range of random values to generate (exclusive). Defaults to 1 if dtype is floating point.
- dtype: The type of the output: float16, float32, float64, int32,
or int64.
- seed: A Python integer. Used in combination with tf.random.set_seed to
create a reproducible sequence of tensors across multiple calls.
name: A name for the operation (optional).
- Returns:
A tensor of the specified shape filled with random uniform values.
- Raises:
ValueError: If dtype is integral and maxval is not specified.
- d2l.tensorflow.read_time_machine()[source]¶
Load the time machine dataset into a list of text lines.
- d2l.tensorflow.reduce_sum(input_tensor, axis=None, keepdims=False, name=None)[source]¶
Computes the sum of elements across dimensions of a tensor.
Reduces input_tensor along the dimensions given in axis. Unless keepdims is true, the rank of the tensor is reduced by 1 for each entry in axis. If keepdims is true, the reduced dimensions are retained with length 1.
If axis is None, all dimensions are reduced, and a tensor with a single element is returned.
For example:
>>> # x has a shape of (2, 3) (two rows and three columns):
>>> x = tf.constant([[1, 1, 1], [1, 1, 1]])
>>> x.numpy()
array([[1, 1, 1],
       [1, 1, 1]], dtype=int32)
>>> # sum all the elements
>>> # 1 + 1 + 1 + 1 + 1 + 1 = 6
>>> tf.reduce_sum(x).numpy()
6
>>> # reduce along the first dimension
>>> # the result is [1, 1, 1] + [1, 1, 1] = [2, 2, 2]
>>> tf.reduce_sum(x, 0).numpy()
array([2, 2, 2], dtype=int32)
>>> # reduce along the second dimension
>>> # the result is [1, 1] + [1, 1] + [1, 1] = [3, 3]
>>> tf.reduce_sum(x, 1).numpy()
array([3, 3], dtype=int32)
>>> # keep the original dimensions
>>> tf.reduce_sum(x, 1, keepdims=True).numpy()
array([[3],
       [3]], dtype=int32)
>>> # reduce along both dimensions
>>> # the result is 1 + 1 + 1 + 1 + 1 + 1 = 6
>>> # or, equivalently, reduce along rows, then reduce the resultant array
>>> # [1, 1, 1] + [1, 1, 1] = [2, 2, 2]
>>> # 2 + 2 + 2 = 6
>>> tf.reduce_sum(x, [0, 1]).numpy()
6
- Args:
input_tensor: The tensor to reduce. Should have numeric type.
axis: The dimensions to reduce. If None (the default), reduces all dimensions. Must be in the range [-rank(input_tensor), rank(input_tensor)].
keepdims: If true, retains reduced dimensions with length 1.
name: A name for the operation (optional).
- Returns:
The reduced tensor, of the same dtype as the input_tensor.
NumPy compatibility: equivalent to np.sum, apart from the fact that numpy upcasts uint8 and int32 to int64 while tensorflow returns the same dtype as the input.
- d2l.tensorflow.reshape(tensor, shape, name=None)[source]¶
Reshapes a tensor.
Given tensor, this operation returns a new tf.Tensor that has the same values as tensor in the same order, except with a new shape given by shape.
>>> t1 = [[1, 2, 3],
...       [4, 5, 6]]
>>> print(tf.shape(t1).numpy())
[2 3]
>>> t2 = tf.reshape(t1, [6])
>>> t2
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6], dtype=int32)>
>>> tf.reshape(t2, [3, 2])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)>
The tf.reshape does not change the order of or the total number of elements in the tensor, and so it can reuse the underlying data buffer. This makes it a fast operation independent of how big of a tensor it is operating on.
>>> tf.reshape([1, 2, 3], [2, 2])
Traceback (most recent call last):
...
InvalidArgumentError: Input to reshape is a tensor with 3 values, but the requested shape has 4
To instead reorder the data to rearrange the dimensions of a tensor, see tf.transpose.
>>> t = [[1, 2, 3],
...      [4, 5, 6]]
>>> tf.reshape(t, [3, 2]).numpy()
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)
>>> tf.transpose(t, perm=[1, 0]).numpy()
array([[1, 4],
       [2, 5],
       [3, 6]], dtype=int32)
If one component of shape is the special value -1, the size of that dimension is computed so that the total size remains constant. In particular, a shape of [-1] flattens into 1-D. At most one component of shape can be -1.
>>> t = [[1, 2, 3],
...      [4, 5, 6]]
>>> tf.reshape(t, [-1])
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6], dtype=int32)>
>>> tf.reshape(t, [3, -1])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)>
>>> tf.reshape(t, [-1, 2])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)>
tf.reshape(t, []) reshapes a tensor t with one element to a scalar.
>>> tf.reshape([7], []).numpy()
7
More examples:
>>> t = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(tf.shape(t).numpy())
[9]
>>> tf.reshape(t, [3, 3])
<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]], dtype=int32)>

>>> t = [[[1, 1], [2, 2]],
...      [[3, 3], [4, 4]]]
>>> print(tf.shape(t).numpy())
[2 2 2]
>>> tf.reshape(t, [2, 4])
<tf.Tensor: shape=(2, 4), dtype=int32, numpy=
array([[1, 1, 2, 2],
       [3, 3, 4, 4]], dtype=int32)>

>>> t = [[[1, 1, 1],
...       [2, 2, 2]],
...      [[3, 3, 3],
...       [4, 4, 4]],
...      [[5, 5, 5],
...       [6, 6, 6]]]
>>> print(tf.shape(t).numpy())
[3 2 3]
>>> # Pass '[-1]' to flatten 't'.
>>> tf.reshape(t, [-1])
<tf.Tensor: shape=(18,), dtype=int32, numpy=array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6], dtype=int32)>
>>> # -- Using -1 to infer the shape --
>>> # Here -1 is inferred to be 9:
>>> tf.reshape(t, [2, -1])
<tf.Tensor: shape=(2, 9), dtype=int32, numpy=
array([[1, 1, 1, 2, 2, 2, 3, 3, 3],
       [4, 4, 4, 5, 5, 5, 6, 6, 6]], dtype=int32)>
>>> # -1 is inferred to be 2:
>>> tf.reshape(t, [-1, 9])
<tf.Tensor: shape=(2, 9), dtype=int32, numpy=
array([[1, 1, 1, 2, 2, 2, 3, 3, 3],
       [4, 4, 4, 5, 5, 5, 6, 6, 6]], dtype=int32)>
>>> # -1 is inferred to be 3:
>>> tf.reshape(t, [ 2, -1, 3])
<tf.Tensor: shape=(2, 3, 3), dtype=int32, numpy=
array([[[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]],
       [[4, 4, 4],
        [5, 5, 5],
        [6, 6, 6]]], dtype=int32)>
- Args:
tensor: A Tensor.
shape: A Tensor. Must be one of the following types: int32, int64. Defines the shape of the output tensor.
name: Optional string. A name for the operation.
- Returns:
A Tensor. Has the same type as tensor.
- d2l.tensorflow.seq_data_iter_random(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using random sampling.
- d2l.tensorflow.seq_data_iter_sequential(corpus, batch_size, num_steps)[source]¶
Generate a minibatch of subsequences using sequential partitioning.
- d2l.tensorflow.set_axes(axes, xlabel, ylabel, xlim, ylim, xscale, yscale, legend)[source]¶
Set the axes for matplotlib.
- d2l.tensorflow.show_heatmaps(matrices, xlabel, ylabel, titles=None, figsize=(2.5, 2.5), cmap='Reds')[source]¶
- d2l.tensorflow.show_images(imgs, num_rows, num_cols, titles=None, scale=1.5)[source]¶
Plot a list of images.
- d2l.tensorflow.show_trace_2d(f, results)[source]¶
Show the trace of 2D variables during optimization.
- d2l.tensorflow.sin(x, name=None)[source]¶
Computes sine of x element-wise.
Given an input tensor, this function computes sine of every element in the tensor. Input range is (-inf, inf) and output range is [-1,1].
x = tf.constant([-float("inf"), -9, -0.5, 1, 1.2, 200, 10, float("inf")])
tf.math.sin(x)  # ==> [nan -0.4121185 -0.47942555 0.84147096 0.9320391 -0.87329733 -0.54402107 nan]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
- d2l.tensorflow.sinh(x, name=None)[source]¶
Computes hyperbolic sine of x element-wise.
Given an input tensor, this function computes hyperbolic sine of every element in the tensor. Input range is [-inf,inf] and output range is [-inf,inf].
x = tf.constant([-float("inf"), -9, -0.5, 1, 1.2, 2, 10, float("inf")])
tf.math.sinh(x)  # ==> [-inf -4.0515420e+03 -5.2109528e-01 1.1752012e+00 1.5094614e+00 3.6268604e+00 1.1013232e+04 inf]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
- d2l.tensorflow.size(a)¶
- d2l.tensorflow.stack(values, axis=0, name='stack')[source]¶
Stacks a list of rank-R tensors into one rank-(R+1) tensor.
See also tf.concat, tf.tile, tf.repeat.
Packs the list of tensors in values into a tensor with rank one higher than each tensor in values, by packing them along the axis dimension. Given a list of length N of tensors of shape (A, B, C);
if axis == 0 then the output tensor will have the shape (N, A, B, C). if axis == 1 then the output tensor will have the shape (A, N, B, C). Etc.
For example:
>>> x = tf.constant([1, 4])
>>> y = tf.constant([2, 5])
>>> z = tf.constant([3, 6])
>>> tf.stack([x, y, z])
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 4],
       [2, 5],
       [3, 6]], dtype=int32)>
>>> tf.stack([x, y, z], axis=1)
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>
This is the opposite of unstack. The numpy equivalent is np.stack
>>> np.array_equal(np.stack([x, y, z]), tf.stack([x, y, z]))
True
- Args:
values: A list of Tensor objects with the same shape and type.
axis: An int. The axis to stack along. Defaults to the first dimension. Negative values wrap around, so the valid range is [-(R+1), R+1).
name: A name for this operation (optional).
- Returns:
output: A stacked Tensor with the same type as values.
- Raises:
ValueError: If axis is out of the range [-(R+1), R+1).
- d2l.tensorflow.tanh(x, name=None)[source]¶
Computes hyperbolic tangent of x element-wise.
Given an input tensor, this function computes hyperbolic tangent of every element in the tensor. Input range is [-inf, inf] and output range is [-1,1].
x = tf.constant([-float("inf"), -5, -0.5, 1, 1.2, 2, 3, float("inf")])
tf.math.tanh(x)  # ==> [-1. -0.99990916 -0.46211717 0.7615942 0.8336547 0.9640276 0.9950547 1.]
- Args:
x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, complex64, complex128. name: A name for the operation (optional).
- Returns:
A Tensor. Has the same type as x.
If x is a SparseTensor, returns SparseTensor(x.indices, tf.math.tanh(x.values, …), x.dense_shape)
- d2l.tensorflow.tensor(value, dtype=None, shape=None, name='Const')¶
Creates a constant tensor from a tensor-like object.
Note: All eager tf.Tensor values are immutable (in contrast to tf.Variable). There is nothing especially _constant_ about the value returned from tf.constant. This function is not fundamentally different from tf.convert_to_tensor. The name tf.constant comes from the symbolic APIs (like tf.data or keras functional models) where the value is embedded in a Const node in the tf.Graph. tf.constant is useful for asserting that the value can be embedded that way.
If the argument dtype is not specified, then the type is inferred from the type of value.
>>> # Constant 1-D Tensor from a python list.
>>> tf.constant([1, 2, 3, 4, 5, 6])
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6], dtype=int32)>
>>> # Or a numpy array
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> tf.constant(a)
<tf.Tensor: shape=(2, 3), dtype=int64, numpy=
array([[1, 2, 3],
       [4, 5, 6]])>
If dtype is specified the resulting tensor values are cast to the requested dtype.
>>> tf.constant([1, 2, 3, 4, 5, 6], dtype=tf.float64)
<tf.Tensor: shape=(6,), dtype=float64, numpy=array([1., 2., 3., 4., 5., 6.])>
If shape is set, the value is reshaped to match. Scalars are expanded to fill the shape:
>>> tf.constant(0, shape=(2, 3))
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[0, 0, 0],
       [0, 0, 0]], dtype=int32)>
>>> tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>
tf.constant has no effect if an eager Tensor is passed as the value; it even transmits gradients:
>>> v = tf.Variable([0.0])
>>> with tf.GradientTape() as g:
...     loss = tf.constant(v + v)
>>> g.gradient(loss, v).numpy()
array([2.], dtype=float32)
But, since tf.constant embeds the value in the tf.Graph this fails for symbolic tensors:
>>> i = tf.keras.layers.Input(shape=[None, None])
>>> t = tf.constant(i)
Traceback (most recent call last):
...
NotImplementedError: ...
tf.constant will _always_ create CPU (host) tensors. In order to create tensors on other devices, use tf.identity. (If the value is an eager Tensor, however, the tensor will be returned unmodified as mentioned above.)
Related Ops:
tf.convert_to_tensor is similar but: * It has no shape argument. * Symbolic tensors are allowed to pass through.
>>> i = tf.keras.layers.Input(shape=[None, None])
>>> t = tf.convert_to_tensor(i)
tf.fill differs in a few ways:
* tf.constant supports arbitrary constants, not just uniform scalar Tensors like tf.fill.
* tf.fill creates an Op in the graph that is expanded at runtime, so it can efficiently represent large tensors.
* Since tf.fill does not embed the value, it can produce dynamically sized outputs.
- Args:
value: A constant value (or list) of output type dtype.
dtype: The type of the elements of the resulting tensor.
shape: Optional dimensions of resulting tensor.
name: Optional name for the tensor.
- Returns:
A Constant Tensor.
- Raises:
TypeError: if shape is incorrectly specified or unsupported. ValueError: if called on a symbolic tensor.
- d2l.tensorflow.tokenize(lines, token='word')[source]¶
Split text lines into word or character tokens.
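A sketch of the two token modes:
>>> from d2l import tensorflow as d2l
>>> lines = ['the time machine', 'by h g wells']
>>> d2l.tokenize(lines)
[['the', 'time', 'machine'], ['by', 'h', 'g', 'wells']]
>>> d2l.tokenize(lines, token='char')[0][:5]
['t', 'h', 'e', ' ', 't']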
- d2l.tensorflow.train_2d(trainer, steps=20)[source]¶
Optimize a 2-dim objective function with a customized trainer.
- d2l.tensorflow.train_ch11(trainer_fn, states, hyperparams, data_iter, feature_dim, num_epochs=2)[source]¶
- d2l.tensorflow.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)[source]¶
Train a model (defined in Chapter 3).
- d2l.tensorflow.train_ch6(net_fn, train_iter, test_iter, num_epochs, lr, device)[source]¶
Train a model with a GPU (defined in Chapter 6).
- d2l.tensorflow.train_ch8(net, train_iter, vocab, num_hiddens, lr, num_epochs, strategy, use_random_iter=False)[source]¶
Train a model (defined in Chapter 8).
- d2l.tensorflow.train_epoch_ch3(net, train_iter, loss, updater)[source]¶
The training loop defined in Chapter 3.
- d2l.tensorflow.train_epoch_ch8(net, train_iter, loss, updater, params, use_random_iter)[source]¶
Train a model within one epoch (defined in Chapter 8).
- d2l.tensorflow.transpose(a, perm=None, conjugate=False, name='transpose')[source]¶
Transposes a, where a is a Tensor.
Permutes the dimensions according to the value of perm.
The returned tensor’s dimension i will correspond to the input dimension perm[i]. If perm is not given, it is set to (n-1…0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.
If conjugate is True and a.dtype is either complex64 or complex128 then the values of a are conjugated and transposed.
NumPy compatibility: in numpy, transposes are memory-efficient constant time operations, as they simply return a new view of the same data with adjusted strides. TensorFlow does not support strides, so transpose returns a new tensor with the items permuted.
For example:
>>> x = tf.constant([[1, 2, 3], [4, 5, 6]])
>>> tf.transpose(x)
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 4],
       [2, 5],
       [3, 6]], dtype=int32)>
Equivalently, you could call tf.transpose(x, perm=[1, 0]).
If x is complex, setting conjugate=True gives the conjugate transpose:
>>> x = tf.constant([[1 + 1j, 2 + 2j, 3 + 3j],
...                  [4 + 4j, 5 + 5j, 6 + 6j]])
>>> tf.transpose(x, conjugate=True)
<tf.Tensor: shape=(3, 2), dtype=complex128, numpy=
array([[1.-1.j, 4.-4.j],
       [2.-2.j, 5.-5.j],
       [3.-3.j, 6.-6.j]])>
‘perm’ is more useful for n-dimensional tensors where n > 2:
>>> x = tf.constant([[[ 1,  2,  3],
...                   [ 4,  5,  6]],
...                  [[ 7,  8,  9],
...                   [10, 11, 12]]])
As above, simply calling tf.transpose will default to perm=[2,1,0].
To take the transpose of the matrices in dimension-0 (such as when you are transposing matrices where 0 is the batch dimension), you would set perm=[0,2,1].
>>> tf.transpose(x, perm=[0, 2, 1])
<tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
array([[[ 1,  4],
        [ 2,  5],
        [ 3,  6]],
       [[ 7, 10],
        [ 8, 11],
        [ 9, 12]]], dtype=int32)>
Note: This has a shorthand, tf.linalg.matrix_transpose.
- Args:
a: A Tensor.
perm: A permutation of the dimensions of a. This should be a vector.
conjugate: Optional bool. Setting it to True is mathematically equivalent to tf.math.conj(tf.transpose(input)).
name: A name for the operation (optional).
- Returns:
A transposed Tensor.
- d2l.tensorflow.zeros(shape, dtype=tf.float32, name=None)[source]¶
Creates a tensor with all elements set to zero.
See also tf.zeros_like, tf.ones, tf.fill, tf.eye.
This operation returns a tensor of type dtype with shape shape and all elements set to zero.
>>> tf.zeros([3, 4], tf.int32)
<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]], dtype=int32)>
- Args:
- shape: A list of integers, a tuple of integers, or
a 1-D Tensor of type int32.
dtype: The DType of an element in the resulting Tensor. name: Optional string. A name for the operation.
- Returns:
A Tensor with all elements set to zero.