API - Data Pre-processing

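We provide a large collection of data augmentation and processing methods. If you need to apply an affine transformation to images, refer to the Python Can Be Fast approach and use it together with `tf.py_function`.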

threading_data([data, fn, thread_count])

Process a batch of data with a given function using threads.

rotation(x[, rg, is_random, row_index, ...])

Rotate an image randomly or non-randomly.

rotation_multi(x[, rg, is_random, ...])

Rotate multiple images with the same arguments, randomly or non-randomly.

crop(x, wrg, hrg[, is_random, row_index, ...])

Randomly or centrally crop an image.

crop_multi(x, wrg, hrg[, is_random, ...])

Randomly or centrally crop multiple images.

flip_axis(x[, axis, is_random])

Flip an image along an axis, such as left-right or up-down, randomly or non-randomly.

flip_axis_multi(x, axis[, is_random])

Flip the axes of multiple images together, such as left-right or up-down, randomly or non-randomly.

shift(x[, wrg, hrg, is_random, row_index, ...])

Shift an image randomly or non-randomly.

shift_multi(x[, wrg, hrg, is_random, ...])

Shift images with the same arguments, randomly or non-randomly.

shear(x[, intensity, is_random, row_index, ...])

Shear an image randomly or non-randomly.

shear_multi(x[, intensity, is_random, ...])

Shear images with the same arguments, randomly or non-randomly.

shear2(x[, shear, is_random, row_index, ...])

Shear an image randomly or non-randomly.

shear_multi2(x[, shear, is_random, ...])

Shear images with the same arguments, randomly or non-randomly.

swirl(x[, center, strength, radius, ...])

Swirl an image randomly or non-randomly, see scikit-image swirl API and example.

swirl_multi(x[, center, strength, radius, ...])

Swirl multiple images with the same arguments, randomly or non-randomly.

elastic_transform(x, alpha, sigma[, mode, ...])

Elastic transformation for image as described in [Simard2003].

elastic_transform_multi(x, alpha, sigma[, ...])

Elastic transformation for images as described in [Simard2003].

zoom(x[, zoom_range, flags, border_mode])

Zoom/scale a single image; height and width are changed together.

zoom_multi(x[, zoom_range, flags, border_mode])

Zoom in and out of images with the same arguments, randomly or non-randomly.

brightness(x[, gamma, gain, is_random])

Change the brightness of a single image, randomly or non-randomly.

brightness_multi(x[, gamma, gain, is_random])

Change the brightness of multiple images, randomly or non-randomly.

illumination(x[, gamma, contrast, ...])

Perform illumination augmentation for a single image, randomly or non-randomly.

rgb_to_hsv(rgb)

Take an RGB image with values in [0~255] and return an HSV image with values in [0~1].

hsv_to_rgb(hsv)

Take an HSV image with values in [0~1] and return an RGB image with values in [0~255].

adjust_hue(im[, hout, is_offset, is_clip, ...])

Adjust hue of an RGB image.

imresize(x[, size, interp, mode])

Resize an image by given output size and method.

pixel_value_scale(im[, val, clip, is_random])

Scales each value in the pixels of the image.

samplewise_norm(x[, rescale, ...])

Normalize an image by rescaling, samplewise centering and samplewise std normalization, in that order.

featurewise_norm(x[, mean, std, epsilon])

Normalize every pixel by the same given mean and std, which are usually computed from all examples.

channel_shift(x, intensity[, is_random, ...])

Shift the channels of an image, randomly or non-randomly, see numpy.rollaxis.

channel_shift_multi(x, intensity[, ...])

Shift the channels of images with the same arguments, randomly or non-randomly, see numpy.rollaxis.

drop(x[, keep])

Randomly set some pixels to zero by a given keeping probability.

transform_matrix_offset_center(matrix, y, x)

Convert the matrix from Cartesian coordinates (the origin in the middle of image) to Image coordinates (the origin on the top-left of image).

apply_transform(x, transform_matrix[, ...])

Return the image transformed by a given affine matrix in SciPy format (x is height).

projective_transform_by_points(x, src, dst)

Projective transform by given coordinates, usually 4 coordinates.

array_to_img(x[, dim_ordering, scale])

Converts a numpy array to PIL image object (uint8 format).

find_contours(x[, level, fully_connected, ...])

Find iso-valued contours in a 2D array for a given level value and return a list of (n, 2)-ndarrays. See skimage.measure.find_contours.

pt2map([list_points, size, val])

Take a list of points and return a 2D image.

binary_dilation(x[, radius])

Return fast binary morphological dilation of an image.

dilation(x[, radius])

Return greyscale morphological dilation of an image, see skimage.morphology.dilation.

binary_erosion(x[, radius])

Return binary morphological erosion of an image, see skimage.morphology.binary_erosion.

erosion(x[, radius])

Return greyscale morphological erosion of an image, see skimage.morphology.erosion.

obj_box_coord_rescale([coord, shape])

Scale down one coordinate from pixel units to a ratio of the image size.

obj_box_coords_rescale([coords, shape])

Scale down a list of coordinates from pixel units to ratios of the image size.

obj_box_coord_scale_to_pixelunit(coord[, shape])

Convert one coordinate [x, y, w (or x2), h (or y2)] in ratio format to image coordinate format.

obj_box_coord_centroid_to_upleft_butright(coord)

Convert one coordinate [x_center, y_center, w, h] to [x1, y1, x2, y2] in up-left and bottom-right format.

obj_box_coord_upleft_butright_to_centroid(coord)

Convert one coordinate [x1, y1, x2, y2] to [x_center, y_center, w, h].

obj_box_coord_centroid_to_upleft(coord)

Convert one coordinate [x_center, y_center, w, h] to [x, y, w, h].

obj_box_coord_upleft_to_centroid(coord)

Convert one coordinate [x, y, w, h] to [x_center, y_center, w, h].

parse_darknet_ann_str_to_list(annotations)

Take annotations in the string format class, x, y, w, h and return them as a list of lists.

parse_darknet_ann_list_to_cls_box(annotations)

Parse darknet annotation format into two lists for class and bounding box.

obj_box_horizontal_flip(im[, coords, ...])

Left-right flip the image and coordinates for object detection.

obj_box_imresize(im[, coords, size, interp, ...])

Resize an image, and compute the new bounding box coordinates.

obj_box_crop(im[, classes, coords, wrg, ...])

Randomly or centrally crop an image, and compute the new bounding box coordinates.

obj_box_shift(im[, classes, coords, wrg, ...])

Shift an image randomly or non-randomly, and compute the new bounding box coordinates.

obj_box_zoom(im[, classes, coords, ...])

Zoom in and out of a single image, randomly or non-randomly, and compute the new bounding box coordinates.

keypoint_random_crop(image, annos[, mask, size])

Randomly crop an image and its corresponding keypoints without changing the scale given by keypoint_random_resize_shortestedge.

keypoint_random_crop2

keypoint_random_rotate(image, annos[, mask, rg])

Rotate an image and corresponding keypoints.

keypoint_random_flip(image, annos[, mask, ...])

Flip an image and corresponding keypoints.

keypoint_random_resize(image, annos[, mask, ...])

Randomly resize an image and corresponding keypoints.

keypoint_random_resize_shortestedge(image, annos)

Randomly resize an image and its corresponding keypoints based on the shorter edge.

pad_sequences(sequences[, maxlen, dtype, ...])

Pads each sequence to the same length: the length of the longest sequence.

remove_pad_sequences(sequences[, pad_id])

Remove padding.

process_sequences(sequences[, end_id, ...])

Set all tokens (ids) after the END token to the padding value, and then optionally shorten the sequences to the maximum sequence length in this batch.

sequences_add_start_id(sequences[, ...])

Add special start token(id) in the beginning of each sequence.

sequences_add_end_id(sequences[, end_id])

Add special end token(id) in the end of each sequence.

sequences_add_end_id_after_pad(sequences[, ...])

Add special end token(id) in the end of each sequence.

sequences_get_mask(sequences[, pad_val])

Return mask for sequences.

Threading

For the current version, we recommend using `tf.data` and `tf.py_function` to implement training data processing. `threading_data` is only suitable for data that needs to be processed once; see the CIFAR10 example in our GitHub repository.

tensorlayer.prepro.threading_data(data=None, fn=None, thread_count=None, **kwargs)[source]

Process a batch of data with a given function using threads.

Usually used for data augmentation.

Parameters
  • data (numpy.array or others) -- The data to be processed.

  • thread_count (int) -- The number of threads to use.

  • fn (function) -- The function for data processing.

  • kwargs -- Additional arguments passed to fn; see the examples below.

Examples

Process images.

>>> images, _, _, _ = tl.files.load_cifar10_dataset(shape=(-1, 32, 32, 3))
>>> images = tl.prepro.threading_data(images[0:32], tl.prepro.zoom, zoom_range=[0.5, 1])

Customized image preprocessing function.

>>> def distort_img(x):
...     x = tl.prepro.flip_axis(x, axis=0, is_random=True)
...     x = tl.prepro.flip_axis(x, axis=1, is_random=True)
...     x = tl.prepro.crop(x, 100, 100, is_random=True)
...     return x
>>> images = tl.prepro.threading_data(images, distort_img)

Process images and masks together (usually used for image segmentation).

>>> # X, Y: [batch_size, row, col, 1]
>>> data = tl.prepro.threading_data([_ for _ in zip(X, Y)], tl.prepro.zoom_multi, zoom_range=[0.5, 1])
>>> # data: [batch_size, 2, row, col, 1]
>>> X_, Y_ = data.transpose((1,0,2,3,4))
>>> # X_, Y_: [batch_size, row, col, 1]
>>> tl.vis.save_image(X_, 'images.png')
>>> tl.vis.save_image(Y_, 'masks.png')

Process images and masks together by using thread_count.

>>> # X, Y: [batch_size, row, col, 1]
>>> data = tl.prepro.threading_data(X, tl.prepro.zoom_multi, 8, zoom_range=[0.5, 1])
>>> # data: [batch_size, 2, row, col, 1]
>>> X_, Y_ = data.transpose((1,0,2,3,4))
>>> # X_, Y_: [batch_size, row, col, 1]
>>> tl.vis.save_image(X_, 'after.png')
>>> tl.vis.save_image(Y_, 'before.png')

Customized function for processing images and masks together.

>>> def distort_img(data):
...     x, y = data
...     x, y = tl.prepro.flip_axis_multi([x, y], axis=0, is_random=True)
...     x, y = tl.prepro.flip_axis_multi([x, y], axis=1, is_random=True)
...     x, y = tl.prepro.crop_multi([x, y], 100, 100, is_random=True)
...     return x, y
>>> # X, Y: [batch_size, row, col, channel]
>>> data = tl.prepro.threading_data([_ for _ in zip(X, Y)], distort_img)
>>> X_, Y_ = data.transpose((1,0,2,3,4))

Returns

The processed results.

Return type

list or numpy.array
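
The general idea of applying one function to every item of a batch in parallel threads can be sketched with the standard library. This is only an illustration under stated assumptions, not TensorLayer's implementation: `threaded_map` and `flip_left_right` are hypothetical names introduced here, and the thread pool comes from `concurrent.futures`.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_map(data, fn, thread_count=4, **kwargs):
    """Apply fn(item, **kwargs) to each item of data using a thread pool."""
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(lambda item: fn(item, **kwargs), data))
    return np.asarray(results)


def flip_left_right(image, enabled=True):
    """Hypothetical per-image function: mirror the column axis."""
    return image[:, ::-1, :] if enabled else image


# A batch of two 4x4 RGB images: [batch_size, row, col, channel].
batch = np.arange(2 * 4 * 4 * 3, dtype=np.float32).reshape(2, 4, 4, 3)
flipped = threaded_map(batch, flip_left_right, thread_count=2, enabled=True)
print(flipped.shape)
```

Extra keyword arguments are forwarded to the per-item function, which mirrors how `zoom_range` is passed through in the examples above.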

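Images

  • These functions each process a single image. Use the threading_data function for multi-threaded processing; see tutorial_image_preprocess.py.

  • All functions have an is_random argument.

  • All functions whose names end in multi are typically used for image segmentation, because the input and output images must stay matched.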
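
Why the multi variants matter for segmentation can be shown with a small NumPy sketch: the random decision is drawn once and then applied to every image in the list, so an image and its mask stay aligned. `random_flip_pair` is a hypothetical helper written for this illustration and is not part of TensorLayer.

```python
import numpy as np


def random_flip_pair(images, axis=1, rng=None):
    """Flip all images along axis, or none of them, using one shared draw."""
    rng = np.random.default_rng() if rng is None else rng
    do_flip = rng.random() < 0.5  # single random decision for the whole list
    return [np.flip(im, axis=axis) if do_flip else im for im in images]


image = np.arange(16, dtype=np.float32).reshape(4, 4, 1)
mask = (image > 7).astype(np.uint8)  # toy segmentation mask derived from image

image_out, mask_out = random_flip_pair(
    [image, mask], axis=1, rng=np.random.default_rng(0)
)
```

Calling a single-image function twice with is_random=True would draw two independent decisions, and the mask could end up flipped while the image is not.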

Rotation

tensorlayer.prepro.rotation(x, rg=20, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Rotate an image randomly or non-randomly.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • rg (int or float) -- Degree to rotate, usually 0 ~ 180.

  • is_random (boolean) -- If True, randomly rotate. Default is False.

  • row_index, col_index and channel_index (int) -- Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

  • fill_mode (str) -- Method to fill missing pixels. Default is nearest; other options are constant, reflect or wrap. See scipy ndimage affine_transform.

  • cval (float) -- Value used for points outside the boundaries of the input if mode=`constant`. Default is 0.0

  • order (int) -- The order of interpolation. The order has to be in the range 0-5. See tl.prepro.affine_transform and scipy ndimage affine_transform

Returns

A processed image.

Return type

numpy.array

Examples

>>> # x: [row, col, 1]
>>> x = tl.prepro.rotation(x, rg=40, is_random=False)
>>> tl.vis.save_image(x, 'im.png')

tensorlayer.prepro.rotation_multi(x, rg=20, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Rotate multiple images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y must be matched.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.rotation.

Returns

A list of processed images.

Return type

numpy.array

Examples

>>> # x, y: [row, col, 1] greyscale
>>> x, y = tl.prepro.rotation_multi([x, y], rg=90, is_random=False)

Cropping

tensorlayer.prepro.crop(x, wrg, hrg, is_random=False, row_index=0, col_index=1)[source]

Randomly or centrally crop an image.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • wrg (int) -- Size of width.

  • hrg (int) -- Size of height.

  • is_random (boolean) -- If True, randomly crop, else central crop. Default is False.

  • row_index (int) -- index of row.

  • col_index (int) -- index of column.

Returns

A processed image.

Return type

numpy.array
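
The difference between a central and a random crop comes down to how the top-left corner is chosen. The following NumPy sketch illustrates that choice. `crop_sketch` is a hypothetical stand-in that assumes the default [row, col, channel] layout; it is not the library's code.

```python
import numpy as np


def crop_sketch(x, wrg, hrg, is_random=False, rng=None):
    """Crop an [row, col, channel] image to hrg rows and wrg columns."""
    h, w = x.shape[0], x.shape[1]
    if hrg > h or wrg > w:
        raise ValueError("crop size must not exceed the image size")
    if is_random:
        rng = np.random.default_rng() if rng is None else rng
        # Any offset that keeps the window inside the image is allowed.
        top = int(rng.integers(0, h - hrg + 1))
        left = int(rng.integers(0, w - wrg + 1))
    else:
        # Central crop: equal margins on both sides (rounded down).
        top = (h - hrg) // 2
        left = (w - wrg) // 2
    return x[top:top + hrg, left:left + wrg]


image = np.arange(6 * 8, dtype=np.float32).reshape(6, 8, 1)
central = crop_sketch(image, wrg=4, hrg=2)
random_crop = crop_sketch(image, wrg=4, hrg=2, is_random=True,
                          rng=np.random.default_rng(0))
```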

tensorlayer.prepro.crop_multi(x, wrg, hrg, is_random=False, row_index=0, col_index=1)[source]

Randomly or centrally crop multiple images.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.crop.

Returns

A list of processed images.

Return type

numpy.array

Flipping

tensorlayer.prepro.flip_axis(x, axis=1, is_random=False)[source]

Flip an image along an axis, such as left-right or up-down, randomly or non-randomly.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • axis (int) --

    Which axis to flip.
    • 0, flip up and down

    • 1, flip left and right

    • 2, flip channel

  • is_random (boolean) -- If True, randomly flip. Default is False.

Returns

A processed image.

Return type

numpy.array

tensorlayer.prepro.flip_axis_multi(x, axis, is_random=False)[source]

Flip the axes of multiple images together, such as left-right or up-down, randomly or non-randomly.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.flip_axis.

Returns

A list of processed images.

Return type

numpy.array

Shifting

tensorlayer.prepro.shift(x, wrg=0.1, hrg=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shift an image randomly or non-randomly.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • wrg (float) -- Percentage of shift in axis x, usually -0.25 ~ 0.25.

  • hrg (float) -- Percentage of shift in axis y, usually -0.25 ~ 0.25.

  • is_random (boolean) -- If True, randomly shift. Default is False.

  • row_index, col_index and channel_index (int) -- Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

  • fill_mode (str) -- Method to fill missing pixels. Default is nearest; other options are constant, reflect or wrap. See scipy ndimage affine_transform.

  • cval (float) -- Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0.

  • order (int) -- The order of interpolation. The order has to be in the range 0-5. See tl.prepro.affine_transform and scipy ndimage affine_transform

Returns

A processed image.

Return type

numpy.array

tensorlayer.prepro.shift_multi(x, wrg=0.1, hrg=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shift images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y must be matched.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.shift.

Returns

A list of processed images.

Return type

numpy.array

Shearing

tensorlayer.prepro.shear(x, intensity=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shear an image randomly or non-randomly.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • intensity (float) -- Percentage of shear, usually -0.5 ~ 0.5 (is_random==True), 0 ~ 0.5 (is_random==False), you can have a quick try by shear(X, 1).

  • is_random (boolean) -- If True, randomly shear. Default is False.

  • row_index, col_index and channel_index (int) -- Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

  • fill_mode (str) -- Method to fill missing pixels. Default is nearest; other options are constant, reflect or wrap. See scipy ndimage affine_transform.

  • cval (float) -- Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0.

  • order (int) -- The order of interpolation. The order has to be in the range 0-5. See tl.prepro.affine_transform and scipy ndimage affine_transform

Returns

A processed image.

Return type

numpy.array

tensorlayer.prepro.shear_multi(x, intensity=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shear images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y must be matched.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.shear.

Returns

A list of processed images.

Return type

numpy.array

Shearing V2

tensorlayer.prepro.shear2(x, shear=(0.1, 0.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shear an image randomly or non-randomly.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • shear (tuple of two floats) -- Percentage of shear for height and width direction (0, 1).

  • is_random (boolean) -- If True, randomly shear. Default is False.

  • row_index, col_index and channel_index (int) -- Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

  • fill_mode (str) -- Method to fill missing pixels. Default is nearest; other options are constant, reflect or wrap. See scipy ndimage affine_transform.

  • cval (float) -- Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0.

  • order (int) -- The order of interpolation. The order has to be in the range 0-5. See tl.prepro.affine_transform and scipy ndimage affine_transform

Returns

A processed image.

Return type

numpy.array

tensorlayer.prepro.shear_multi2(x, shear=(0.1, 0.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shear images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y must be matched.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.shear2.

Returns

A list of processed images.

Return type

numpy.array

Swirl

tensorlayer.prepro.swirl(x, center=None, strength=1, radius=100, rotation=0, output_shape=None, order=1, mode='constant', cval=0, clip=True, preserve_range=False, is_random=False)[source]

Swirl an image randomly or non-randomly, see scikit-image swirl API and example.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • center (tuple of 2 int or None) -- Center coordinate of transformation (optional).

  • strength (float) -- The amount of swirling applied.

  • radius (float) -- The extent of the swirl in pixels. The effect dies out rapidly beyond radius.

  • rotation (float) -- Additional rotation applied to the image, usually [0, 360], relates to center.

  • output_shape (tuple of 2 int or None) -- Shape of the output image generated (height, width). By default the shape of the input image is preserved.

  • order (int, optional) -- The order of the spline interpolation, default is 1. The order has to be in the range 0-5. See skimage.transform.warp for detail.

  • mode (str) -- One of constant (default), edge, symmetric, reflect and wrap. Points outside the boundaries of the input are filled according to the given mode. Modes match the behaviour of numpy.pad.

  • cval (float) -- Used in conjunction with mode constant, the value outside the image boundaries.

  • clip (boolean) -- Whether to clip the output to the range of values of the input image. This is enabled by default, since higher order interpolation may produce values outside the given input range.

  • preserve_range (boolean) -- Whether to keep the original range of values. Otherwise, the input image is converted according to the conventions of img_as_float.

  • is_random (boolean) --

    If True, random swirl. Default is False.
    • random center = [(0 ~ x.shape[0]), (0 ~ x.shape[1])]

    • random strength = [0, strength]

    • random radius = [1e-10, radius]

    • random rotation = [-rotation, rotation]

Returns

A processed image.

Return type

numpy.array

Examples

>>> # x: [row, col, 1] greyscale
>>> x = tl.prepro.swirl(x, strength=4, radius=100)

tensorlayer.prepro.swirl_multi(x, center=None, strength=1, radius=100, rotation=0, output_shape=None, order=1, mode='constant', cval=0, clip=True, preserve_range=False, is_random=False)[source]

Swirl multiple images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y must be matched.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.swirl.

Returns

A list of processed images.

Return type

numpy.array

Elastic transform

tensorlayer.prepro.elastic_transform(x, alpha, sigma, mode='constant', cval=0, is_random=False)[source]

Elastic transformation for image as described in [Simard2003].

Parameters
  • x (numpy.array) -- A greyscale image.

  • alpha (float) -- Alpha value for elastic transformation.

  • sigma (float or sequence of float) -- The smaller the sigma, the more transformation. Standard deviation for Gaussian kernel. The standard deviations of the Gaussian filter are given for each axis as a sequence, or as a single number, in which case it is equal for all axes.

  • mode (str) -- See scipy.ndimage.filters.gaussian_filter. Default is constant.

  • cval (float) -- Used in conjunction with mode of constant, the value outside the image boundaries.

  • is_random (boolean) -- Default is False.

Returns

A processed image.

Return type

numpy.array

Examples

>>> x = tl.prepro.elastic_transform(x, alpha=x.shape[1]*3, sigma=x.shape[1]*0.07)

tensorlayer.prepro.elastic_transform_multi(x, alpha, sigma, mode='constant', cval=0, is_random=False)[source]

Elastic transformation for images as described in [Simard2003].

Parameters
  • x (list of numpy.array) -- List of greyscale images.

  • others (args) -- See tl.prepro.elastic_transform.

Returns

A list of processed images.

Return type

numpy.array

Zoom

tensorlayer.prepro.zoom(x, zoom_range=(0.9, 1.1), flags=None, border_mode='constant')[source]

Zoom/scale a single image; height and width are changed together.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • zoom_range (float or tuple of 2 floats) --

    The zooming/scaling ratio, greater than 1 means larger.
    • float, a fixed ratio.

    • tuple of 2 floats, randomly sample a value as the ratio between 2 values.

  • border_mode (str) --

    • constant, pad the image with a constant value (i.e. black or 0)

    • replicate, the row or column at the very edge of the original is replicated to the extra border.

Returns

A processed image.

Return type

numpy.array

tensorlayer.prepro.zoom_multi(x, zoom_range=(0.9, 1.1), flags=None, border_mode='constant')[source]

Zoom in and out of images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y must be matched.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.zoom.

Returns

A list of processed images.

Return type

numpy.array

Brightness

tensorlayer.prepro.brightness(x, gamma=1, gain=1, is_random=False)[source]

Change the brightness of a single image, randomly or non-randomly.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • gamma (float) --

    Non negative real number. Default value is 1.
    • Smaller than 1 means brighter.

    • If is_random is True, gamma in a range of (1-gamma, 1+gamma).

  • gain (float) -- The constant multiplier. Default value is 1.

  • is_random (boolean) -- If True, randomly change brightness. Default is False.

Returns

A processed image.

Return type

numpy.array
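
The roles of gamma and gain can be illustrated with the usual gamma-correction formula, output = gain * input ** gamma, applied to an image scaled to [0, 1]. The sketch below assumes that formula and that scaling. `brightness_sketch` is a hypothetical helper, not the library's implementation.

```python
import numpy as np


def brightness_sketch(x, gamma=1.0, gain=1.0, is_random=False, rng=None):
    """Gamma-correct a float image whose values lie in [0, 1]."""
    if is_random:
        rng = np.random.default_rng() if rng is None else rng
        # Sample gamma from (1 - gamma, 1 + gamma), as described above.
        gamma = rng.uniform(1 - gamma, 1 + gamma)
    return gain * np.power(x, gamma)


image = np.full((2, 2, 1), 0.25)
brighter = brightness_sketch(image, gamma=0.5)  # gamma < 1 raises mid-tones
darker = brightness_sketch(image, gamma=2.0)    # gamma > 1 lowers mid-tones
```

With an input value of 0.25, a gamma of 0.5 gives 0.5 and a gamma of 2 gives 0.0625, which matches the rule that values smaller than 1 brighten the image.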

tensorlayer.prepro.brightness_multi(x, gamma=1, gain=1, is_random=False)[source]

Change the brightness of multiple images, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y must be matched.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.brightness.

Returns

A list of processed images.

Return type

numpy.array

Brightness, saturation, contrast

tensorlayer.prepro.illumination(x, gamma=1.0, contrast=1.0, saturation=1.0, is_random=False)[source]

Perform illumination augmentation for a single image, randomly or non-randomly.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • gamma (float) --

    Change brightness (the same with tl.prepro.brightness)
    • if is_random=False, one float number, smaller than one means brighter, greater than one means darker.

    • if is_random=True, tuple of two float numbers, (min, max).

  • contrast (float) --

    Change contrast.
    • if is_random=False, one float number, smaller than one means blur.

    • if is_random=True, tuple of two float numbers, (min, max).

  • saturation (float) --

    Change saturation.
    • if is_random=False, one float number, smaller than one means less saturated.

    • if is_random=True, tuple of two float numbers, (min, max).

  • is_random (boolean) -- If True, randomly change illumination. Default is False.

Returns

A processed image.

Return type

numpy.array

Examples

Random

>>> x = tl.prepro.illumination(x, gamma=(0.5, 5.0), contrast=(0.3, 1.0), saturation=(0.7, 1.0), is_random=True)

Non-random

>>> x = tl.prepro.illumination(x, 0.5, 0.6, 0.8, is_random=False)

RGB to HSV

tensorlayer.prepro.rgb_to_hsv(rgb)[source]

Take an RGB image with values in [0~255] and return an HSV image with values in [0~1].

Parameters

rgb (numpy.array) -- An image with values between 0 and 255.

Returns

A processed image.

Return type

numpy.array

HSV to RGB

tensorlayer.prepro.hsv_to_rgb(hsv)[source]

Take an HSV image with values in [0~1] and return an RGB image with values in [0~255].

Parameters

hsv (numpy.array) -- An image with values between 0.0 and 1.0

Returns

A processed image.

Return type

numpy.array

Adjust hue

tensorlayer.prepro.adjust_hue(im, hout=0.66, is_offset=True, is_clip=True, is_random=False)[source]

Adjust hue of an RGB image.

This is a convenience method that converts an RGB image to float representation, converts it to HSV, adds an offset to the hue channel, converts back to RGB and then back to the original data type. For TF, see tf.image.adjust_hue and tf.image.random_hue.

Parameters
  • im (numpy.array) -- An image with values between 0 and 255.

  • hout (float) --

    The scale value for adjusting hue.
    • If is_offset is False, set all hue values to this value. 0 is red; 0.33 is green; 0.66 is blue.

    • If is_offset is True, add this value as the offset to the hue channel.

  • is_offset (boolean) -- Whether hout is added on HSV as offset or not. Default is True.

  • is_clip (boolean) -- If HSV value smaller than 0, set to 0. Default is True.

  • is_random (boolean) -- If True, randomly change hue. Default is False.

Returns

A processed image.

Return type

numpy.array

Examples

Random, add a random value between -0.2 and 0.2 as the offset to every hue value.

>>> im_hue = tl.prepro.adjust_hue(image, hout=0.2, is_offset=True, is_random=True)

Non-random, set all hues to green.

>>> im_green = tl.prepro.adjust_hue(image, hout=0.33, is_offset=False, is_random=False)
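
The RGB to HSV to RGB round trip described above can be sketched for a single pixel with the standard-library `colorsys` module. `adjust_hue_pixel` is a hypothetical helper introduced for illustration. Note one deliberate difference from the documented is_clip behaviour: this sketch wraps the hue around the colour wheel with a modulo instead of clipping it.

```python
import colorsys


def adjust_hue_pixel(rgb, hout=0.66, is_offset=True):
    """Adjust the hue of one (r, g, b) pixel with values in 0..255."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # h, s, v are all in [0, 1]
    if is_offset:
        h = (h + hout) % 1.0  # assumption: wrap around instead of clipping
    else:
        h = hout              # replace the hue outright
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


# Pure red has hue 0; a hue of 1/3 corresponds to green.
green_by_set = adjust_hue_pixel((255, 0, 0), hout=1 / 3, is_offset=False)
green_by_offset = adjust_hue_pixel((255, 0, 0), hout=1 / 3, is_offset=True)
print(green_by_set, green_by_offset)
```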

Resize

tensorlayer.prepro.imresize(x, size=None, interp='bicubic', mode=None)[source]

Resize an image to a given output size with a given interpolation method.

Warning: this function rescales the values to [0, 255].

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • size (list of 2 int or None) -- For height and width.

  • interp (str) -- Interpolation method for re-sizing (nearest, lanczos, bilinear, bicubic (default) or cubic).

  • mode (str) -- The PIL image mode (P, L, etc.) to convert image before resizing.

Returns

A processed image.

Return type

numpy.array

Pixel value scaling

tensorlayer.prepro.pixel_value_scale(im, val=0.9, clip=None, is_random=False)[source]

Scale every pixel value of the image.

Parameters
  • im (numpy.array) -- An image.

  • val (float) --

    The scale value for changing pixel value.
    • If is_random=False, multiply this value with all pixels.

    • If is_random=True, multiply a value between [1-val, 1+val] with all pixels.

  • clip (tuple of 2 numbers) -- The minimum and maximum value.

  • is_random (boolean) -- If True, see val.

Returns

A processed image.

Return type

numpy.array

Examples

Random

>>> im = pixel_value_scale(im, 0.1, [0, 255], is_random=True)

Non-random

>>> im = pixel_value_scale(im, 0.9, [0, 255], is_random=False)
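
The behaviour described for val, clip and is_random can be approximated in a few lines of NumPy. `pixel_value_scale_sketch` is a hypothetical stand-in written for illustration and may differ from the library in details such as the random number source.

```python
import numpy as np


def pixel_value_scale_sketch(im, val=0.9, clip=None, is_random=False, rng=None):
    """Multiply all pixels by one factor, then optionally clip the result."""
    if is_random:
        rng = np.random.default_rng() if rng is None else rng
        scale = 1.0 + rng.uniform(-val, val)  # factor in [1 - val, 1 + val]
    else:
        scale = val
    out = im * scale
    if clip is not None:
        out = np.clip(out, clip[0], clip[1])
    return out


im = np.array([[[100.0], [200.0], [250.0]]])  # shape [row, col, channel]
dimmed = pixel_value_scale_sketch(im, 0.9, [0, 255])
boosted = pixel_value_scale_sketch(im, 1.2, [0, 255])  # 250 * 1.2 is clipped
```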

Normalization

tensorlayer.prepro.samplewise_norm(x, rescale=None, samplewise_center=False, samplewise_std_normalization=False, channel_index=2, epsilon=1e-07)[source]

Normalize an image by rescaling, samplewise centering and samplewise std normalization, in that order.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • rescale (float) -- Rescaling factor. If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (before applying any other transformation)

  • samplewise_center (boolean) -- If True, set each sample mean to 0.

  • samplewise_std_normalization (boolean) -- If True, divide each input by its std.

  • epsilon (float) -- A small positive value added to the standard deviation to avoid division by zero.

Returns

A processed image.

Return type

numpy.array

Examples

>>> x = samplewise_norm(x, samplewise_center=True, samplewise_std_normalization=True)
>>> print(x.shape, np.mean(x), np.std(x))
(160, 176, 1), 0.0, 1.0

Notes

When samplewise_center and samplewise_std_normalization are True:

  • For a greyscale image, every pixel is centered and scaled by the mean and std of the whole image.

  • For an RGB image, every pixel is centered and scaled by the mean and std of that pixel, i.e. the mean and std of each pixel become 0 and 1.
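
The two cases in the note can be made concrete with a NumPy sketch. It assumes the statistics are taken over the whole image when there is a single channel and over the channel axis otherwise. `samplewise_norm_sketch` is a hypothetical helper, not the library's code.

```python
import numpy as np


def samplewise_norm_sketch(x, rescale=None, center=False, std_norm=False,
                           channel_index=2, epsilon=1e-7):
    """Rescale, then center, then divide by the std of one sample."""
    x = np.asarray(x, dtype=np.float64)
    if rescale:
        x = x * rescale
    # Greyscale: statistics over the whole image. Otherwise: per pixel,
    # computed across the channel axis.
    axis = None if x.shape[channel_index] == 1 else channel_index
    if center:
        x = x - np.mean(x, axis=axis, keepdims=True)
    if std_norm:
        x = x / (np.std(x, axis=axis, keepdims=True) + epsilon)
    return x


grey = np.arange(20, dtype=np.float64).reshape(4, 5, 1)
grey_norm = samplewise_norm_sketch(grey, center=True, std_norm=True)

rgb = np.arange(24, dtype=np.float64).reshape(2, 4, 3)
rgb_norm = samplewise_norm_sketch(rgb, center=True, std_norm=True)
```

After normalization the greyscale image has mean close to 0 and std close to 1 overall, while the RGB image has those statistics at every pixel across its three channels.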

tensorlayer.prepro.featurewise_norm(x, mean=None, std=None, epsilon=1e-07)[source]

Normalize every pixel by the same given mean and std, which are usually computed from all examples.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • mean (float) -- Value for subtraction.

  • std (float) -- Value for division.

  • epsilon (float) -- A small positive value added to the standard deviation to avoid division by zero.

Returns

A processed image.

Return type

numpy.array
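
A typical workflow is to compute the mean and std once over the whole dataset and then apply them to every image. The sketch below shows this with NumPy. `featurewise_norm_sketch` and the toy `dataset` are assumptions made for illustration only.

```python
import numpy as np


def featurewise_norm_sketch(x, mean=None, std=None, epsilon=1e-7):
    """Subtract a dataset-wide mean and divide by a dataset-wide std."""
    if mean is not None:
        x = x - mean
    if std is not None:
        x = x / (std + epsilon)
    return x


# Toy dataset of two 3x3 greyscale images: [n_images, row, col, channel].
dataset = np.arange(2 * 3 * 3, dtype=np.float64).reshape(2, 3, 3, 1)
mean, std = dataset.mean(), dataset.std()  # statistics over all examples

normalized = np.stack(
    [featurewise_norm_sketch(im, mean=mean, std=std) for im in dataset]
)
```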

Channel shift

tensorlayer.prepro.channel_shift(x, intensity, is_random=False, channel_index=2)[source]

Shift the channels of an image, randomly or non-randomly, see numpy.rollaxis.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • intensity (float) -- Intensity of shifting.

  • is_random (boolean) -- If True, randomly shift. Default is False.

  • channel_index (int) -- Index of channel. Default is 2.

Returns

A processed image.

Return type

numpy.array
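
One plausible reading of "shifting the channels" by an intensity is to add a constant offset to every channel value and clip the result to the image's original value range. The sketch below follows that reading as an assumption. `channel_shift_sketch` is a hypothetical helper, and the library's exact behaviour should be checked against its source.

```python
import numpy as np


def channel_shift_sketch(x, intensity, is_random=False, rng=None):
    """Add one offset to all channel values, clipped to the original range."""
    if is_random:
        rng = np.random.default_rng() if rng is None else rng
        factor = rng.uniform(-intensity, intensity)
    else:
        factor = intensity
    low, high = x.min(), x.max()  # keep the output inside the input range
    return np.clip(x + factor, low, high)


image = np.array([[[0.0, 100.0, 200.0]]])  # one RGB pixel, [row, col, channel]
shifted = channel_shift_sketch(image, intensity=50.0)
```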

tensorlayer.prepro.channel_shift_multi(x, intensity, is_random=False, channel_index=2)[source]

Shift the channels of images with the same arguments, randomly or non-randomly, see numpy.rollaxis. Usually used for image segmentation, where x=[X, Y] and X and Y must be matched.

Parameters
  • x (list of numpy.array) -- List of images with dimension of [n_images, row, col, channel] (default).

  • others (args) -- See tl.prepro.channel_shift.

Returns

A list of processed images.

Return type

numpy.array

Noise

tensorlayer.prepro.drop(x, keep=0.5)[source]

Randomly set some pixels to zero by a given keeping probability.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] or [row, col].

  • keep (float) -- The keeping probability in (0, 1); the lower it is, the more values are set to zero.

返回

A processed image.

返回类型

numpy.array
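
As a rough illustration of the keeping probability, the sketch below multiplies an image by a random 0/1 mask. The helper `drop_sketch` and its `rng` argument are hypothetical, and the sketch masks every value independently; how the real function treats the channels of an RGB image is not shown here.

```python
import numpy as np

def drop_sketch(x, keep=0.5, rng=None):
    """Zero out values at random; each value survives with probability `keep`."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.binomial(n=1, p=keep, size=np.shape(x))  # 1 = keep, 0 = drop
    return np.asarray(x) * mask

image = np.ones((100, 100))  # made-up greyscale image
noisy = drop_sketch(image, keep=0.8, rng=np.random.default_rng(0))
kept_fraction = noisy.mean()  # close to 0.8 for an all-ones image
```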

Move the matrix origin to the image center

tensorlayer.prepro.transform_matrix_offset_center(matrix, y, x)[source]

Convert the matrix from Cartesian coordinates (the origin in the middle of the image) to image coordinates (the origin at the top-left of the image).

Parameters
  • matrix (numpy.array) -- Transform matrix.

  • x and y (2 int) -- Size of image.

Returns

The transform matrix.

Return type

numpy.array

Examples

  • See tl.prepro.rotation, tl.prepro.shear, tl.prepro.zoom.
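
The idea behind the offset can be sketched as "translate the image center to the origin, apply the transform, translate back". The code below is an assumption-laden illustration: the helper `offset_center_sketch` is hypothetical, and it takes the center as (x / 2, y / 2), whereas the library may use a slightly different offset.

```python
import numpy as np

def offset_center_sketch(matrix, y, x):
    """Wrap a 3x3 transform so that it acts around the image center."""
    o_x, o_y = x / 2.0, y / 2.0  # assumed center of the image
    to_center = np.array([[1, 0, o_x], [0, 1, o_y], [0, 0, 1]])
    from_center = np.array([[1, 0, -o_x], [0, 1, -o_y], [0, 0, 1]])
    return to_center @ matrix @ from_center

theta = np.pi / 2  # 90-degree rotation
rotation = np.array([[np.cos(theta), -np.sin(theta), 0],
                     [np.sin(theta),  np.cos(theta), 0],
                     [0, 0, 1]])
centered = offset_center_sketch(rotation, y=100, x=100)
center_point = centered @ np.array([50.0, 50.0, 1.0])  # the center stays fixed
```

With the wrapped matrix, the image center is a fixed point of the rotation instead of the top-left corner.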

Affine transform by matrix

tensorlayer.prepro.apply_transform(x, transform_matrix, channel_index=2, fill_mode='nearest', cval=0.0, order=1)

Return the image transformed by a given affine matrix in SciPy format (x is height).

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • transform_matrix (numpy.array) -- Transform matrix (offset center), can be generated by transform_matrix_offset_center.

  • channel_index (int) -- Index of channel, default 2.

  • fill_mode (str) -- Method to fill missing pixels, default nearest; other options are constant, reflect and wrap, see scipy ndimage affine_transform.

  • cval (float) -- Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0.

  • order (int) -- The order of interpolation. The order has to be in the range 0-5.

Returns

A processed image.

Return type

numpy.array

Examples

>>> M_shear = tl.prepro.affine_shear_matrix(intensity=0.2, is_random=False)
>>> M_zoom = tl.prepro.affine_zoom_matrix(zoom_range=0.8)
>>> M_combined = M_shear.dot(M_zoom)
>>> h, w = image.shape[0], image.shape[1]
>>> transform_matrix = tl.prepro.transform_matrix_offset_center(M_combined, h, w)
>>> result = tl.prepro.apply_transform(image, transform_matrix)

Projective transform by points

tensorlayer.prepro.projective_transform_by_points(x, src, dst, map_args=None, output_shape=None, order=1, mode='constant', cval=0.0, clip=True, preserve_range=False)[source]

Projective transform by given coordinates, usually 4 coordinates.

See scikit-image.

Parameters
  • x (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • src (list or numpy.array) -- The original coordinates, usually 4 coordinates of (width, height).

  • dst (list or numpy.array) -- The coordinates after transformation; the number of coordinates is the same as in src.

  • map_args (dictionary or None) -- Keyword arguments passed to inverse map.

  • output_shape (tuple of 2 int) -- Shape of the output image generated. By default the shape of the input image is preserved. Note that, even for multi-band images, only rows and columns need to be specified.

  • order (int) --

    The order of interpolation. The order has to be in the range 0-5:
    • 0 Nearest-neighbor

    • 1 Bi-linear (default)

    • 2 Bi-quadratic

    • 3 Bi-cubic

    • 4 Bi-quartic

    • 5 Bi-quintic

  • mode (str) -- One of constant (default), edge, symmetric, reflect or wrap. Points outside the boundaries of the input are filled according to the given mode. Modes match the behaviour of numpy.pad.

  • cval (float) -- Used in conjunction with mode constant, the value outside the image boundaries.

  • clip (boolean) -- Whether to clip the output to the range of values of the input image. This is enabled by default, since higher order interpolation may produce values outside the given input range.

  • preserve_range (boolean) -- Whether to keep the original range of values. Otherwise, the input image is converted according to the conventions of img_as_float.

Returns

A processed image.

Return type

numpy.array

Examples

Assume X is an image from CIFAR-10, i.e. shape == (32, 32, 3)

>>> src = [[0,0],[0,32],[32,0],[32,32]]     # [w, h]
>>> dst = [[10,10],[0,32],[32,0],[32,32]]
>>> x = tl.prepro.projective_transform_by_points(X, src, dst)

Numpy and PIL

tensorlayer.prepro.array_to_img(x, dim_ordering=(0, 1, 2), scale=True)[source]

Converts a numpy array to a PIL image object (uint8 format).

Parameters
  • x (numpy.array) -- An image with dimension of 3 and channels of 1 or 3.

  • dim_ordering (tuple of 3 int) -- Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

  • scale (boolean) -- If True, converts the image to [0, 255] from any range of values like [-1, 2]. Default is True.

Returns

An image.

Return type

PIL.image

References

PIL Image.fromarray

Find contours

tensorlayer.prepro.find_contours(x, level=0.8, fully_connected='low', positive_orientation='low')[source]

Find iso-valued contours in a 2D array for a given level value and return a list of (n, 2)-ndarrays, see skimage.measure.find_contours.

Parameters
  • x (2D ndarray of double) -- Input data in which to find contours.

  • level (float) -- Value along which to find contours in the array.

  • fully_connected (str) -- Either low or high. Indicates whether array elements below the given level value are to be considered fully-connected (and hence elements above the value will only be face connected), or vice-versa. (See skimage.measure.find_contours for details.)

  • positive_orientation (str) -- Either low or high. Indicates whether the output contours will produce positively-oriented polygons around islands of low- or high-valued elements. If low then contours will wind counter-clockwise around elements below the iso-value. Alternately, this means that low-valued elements are always on the left of the contour.

Returns

Each contour is an ndarray of shape (n, 2), consisting of n (row, column) coordinates along the contour.

Return type

list of (n,2)-ndarrays

List of points to image

tensorlayer.prepro.pt2map(list_points=None, size=(100, 100), val=1)[source]

Takes a list of points and returns a 2D image.

Parameters
  • list_points (list of 2 int) -- [[x, y], [x, y]..] for point coordinates.

  • size (tuple of 2 int) -- (w, h) for output size.

  • val (float or int) -- For the contour value.

Returns

An image.

Return type

numpy.array

Binary dilation

tensorlayer.prepro.binary_dilation(x, radius=3)[source]

Return fast binary morphological dilation of an image, see skimage.morphology.binary_dilation.

Parameters
  • x (2D array) -- A binary image.

  • radius (int) -- For the radius of mask.

Returns

A processed binary image.

Return type

numpy.array

Greyscale dilation

tensorlayer.prepro.dilation(x, radius=3)[source]

Return greyscale morphological dilation of an image, see skimage.morphology.dilation.

Parameters
  • x (2D array) -- A greyscale image.

  • radius (int) -- For the radius of mask.

Returns

A processed greyscale image.

Return type

numpy.array

Binary erosion

tensorlayer.prepro.binary_erosion(x, radius=3)[source]

Return binary morphological erosion of an image, see skimage.morphology.binary_erosion.

Parameters
  • x (2D array) -- A binary image.

  • radius (int) -- For the radius of mask.

Returns

A processed binary image.

Return type

numpy.array

Greyscale erosion

tensorlayer.prepro.erosion(x, radius=3)[source]

Return greyscale morphological erosion of an image, see skimage.morphology.erosion.

Parameters
  • x (2D array) -- A greyscale image.

  • radius (int) -- For the radius of mask.

Returns

A processed greyscale image.

Return type

numpy.array

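Object detection

Tutorial: image augmentation

Hello, here is an image augmentation example based on the VOC dataset; please read this Zhihu article.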

import tensorlayer as tl

## Download the VOC 2012 dataset
imgs_file_list, _, _, _, classes, _, _,\
    _, objs_info_list, _ = tl.files.load_voc_dataset(dataset="2012")

## Parse the image annotations into lists
ann_list = []
for info in objs_info_list:
    ann = tl.prepro.parse_darknet_ann_str_to_list(info)
    c, b = tl.prepro.parse_darknet_ann_list_to_cls_box(ann)
    ann_list.append([c, b])

# Read one image and save it with its boxes drawn
idx = 2  # pick any image you like
image = tl.vis.read_image(imgs_file_list[idx])
tl.vis.draw_boxes_and_labels_to_image(image, ann_list[idx][0],
     ann_list[idx][1], [], classes, True, save_name='_im_original.png')

# Horizontal flip
im_flip, coords = tl.prepro.obj_box_horizontal_flip(image,
        ann_list[idx][1], is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_flip, ann_list[idx][0],
        coords, [], classes, True, save_name='_im_flip.png')

# Resize
im_resize, coords = tl.prepro.obj_box_imresize(image,
        coords=ann_list[idx][1], size=[300, 200], is_rescale=True)
tl.vis.draw_boxes_and_labels_to_image(im_resize, ann_list[idx][0],
        coords, [], classes, True, save_name='_im_resize.png')

# Crop
im_crop, clas, coords = tl.prepro.obj_box_crop(image, ann_list[idx][0],
         ann_list[idx][1], wrg=200, hrg=200,
         is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_crop, clas, coords, [],
         classes, True, save_name='_im_crop.png')

# Shift
im_shift, clas, coords = tl.prepro.obj_box_shift(image, ann_list[idx][0],
        ann_list[idx][1], wrg=0.1, hrg=0.1,
        is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_shift, clas, coords, [],
        classes, True, save_name='_im_shift.png')

# Zoom (height and width)
im_zoom, clas, coords = tl.prepro.obj_box_zoom(image, ann_list[idx][0],
        ann_list[idx][1], zoom_range=(1.3, 0.7),
        is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_zoom, clas, coords, [],
        classes, True, save_name='_im_zoom.png')

In practice, you may want to process a batch of data with multiple threads, as follows.

import tensorlayer as tl
import random

# Reuses imgs_file_list, ann_list and classes from the example above
batch_size = 64
im_size = [416, 416]
n_data = len(imgs_file_list)
jitter = 0.2
def _data_pre_aug_fn(data):
    im, ann = data
    clas, coords = ann
    ## Randomly change brightness, contrast and saturation
    im = tl.prepro.illumination(im, gamma=(0.5, 1.5),
             contrast=(0.5, 1.5), saturation=(0.5, 1.5), is_random=True)
    ## Random horizontal flip
    im, coords = tl.prepro.obj_box_horizontal_flip(im, coords,
             is_rescale=True, is_center=True, is_random=True)
    ## Randomly resize, then crop to the target size; this also gives a random zoom effect
    tmp0 = random.randint(1, int(im_size[0]*jitter))
    tmp1 = random.randint(1, int(im_size[1]*jitter))
    im, coords = tl.prepro.obj_box_imresize(im, coords,
            [im_size[0]+tmp0, im_size[1]+tmp1], is_rescale=True,
             interp='bicubic')
    im, clas, coords = tl.prepro.obj_box_crop(im, clas, coords,
             wrg=im_size[1], hrg=im_size[0], is_rescale=True,
             is_center=True, is_random=True)
    ## Rescale values from [0, 255] to [-1, 1] (optional)
    im = im / 127.5 - 1
    return im, [clas, coords]

# Randomly read a batch of images and their annotations
idexs = tl.utils.get_random_int(min=0, max=n_data-1, number=batch_size)
b_im_path = [imgs_file_list[i] for i in idexs]
b_images = tl.prepro.threading_data(b_im_path, fn=tl.vis.read_image)
b_ann = [ann_list[i] for i in idexs]

# Process with multiple threads
data = tl.prepro.threading_data([_ for _ in zip(b_images, b_ann)],
              _data_pre_aug_fn)
b_images2 = [d[0] for d in data]
b_ann = [d[1] for d in data]

# Save each pair of images for inspection
for i in range(len(b_images)):
    tl.vis.draw_boxes_and_labels_to_image(b_images[i],
             ann_list[idexs[i]][0], ann_list[idexs[i]][1], [],
             classes, True, save_name='_bbox_vis_%d_original.png' % i)
    tl.vis.draw_boxes_and_labels_to_image((b_images2[i]+1)*127.5,
             b_ann[i][0], b_ann[i][1], [], classes, True,
             save_name='_bbox_vis_%d.png' % i)

Coordinates: pixel units to ratio units

tensorlayer.prepro.obj_box_coord_rescale(coord=None, shape=None)[source]

Scale down one coordinate from pixel units to the ratio of the image size, i.e. to the range [0, 1]. It is the reverse process of obj_box_coord_scale_to_pixelunit.

Parameters
  • coord (list of 4 int or None) -- One coordinate of one image, e.g. [x, y, w, h].

  • shape (list of 2 int or None) -- For [height, width].

Returns

New bounding box.

Return type

list of 4 numbers

Examples

>>> coord = tl.prepro.obj_box_coord_rescale(coord=[30, 40, 50, 50], shape=[100, 100])
  [0.3, 0.4, 0.5, 0.5]
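
The arithmetic can be sketched without the library: x and w are divided by the image width, y and h by the image height. The helper `coord_rescale_sketch` below is a hypothetical stand-in written for illustration, not the library function.

```python
def coord_rescale_sketch(coord, shape):
    """[x, y, w, h] in pixels -> ratios of the image size; shape is [height, width]."""
    imh, imw = shape[0], shape[1]
    x, y, w, h = coord
    return [x / imw, y / imh, w / imw, h / imh]

square = coord_rescale_sketch([30, 40, 50, 50], shape=[100, 100])
wide = coord_rescale_sketch([30, 40, 50, 50], shape=[100, 200])  # width 200, height 100
```

On the non-square image only the horizontal values halve, which is why `shape` must be given as [height, width] and not the other way round.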

Coordinates: pixel units to ratio units (multiple coordinates)

tensorlayer.prepro.obj_box_coords_rescale(coords=None, shape=None)[source]

Scale down a list of coordinates from pixel units to the ratio of the image size, i.e. to the range [0, 1].

Parameters
  • coords (list of list of 4 ints or None) -- A list of coordinates, e.g. [[x, y, w, h], [x, y, w, h], ...].

  • shape (list of 2 int or None) -- For [height, width].

Returns

A list of new bounding boxes.

Return type

list of list of 4 numbers

Examples

>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50], [10, 10, 20, 20]], shape=[100, 100])
>>> print(coords)
  [[0.3, 0.4, 0.5, 0.5], [0.1, 0.1, 0.2, 0.2]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[50, 100])
>>> print(coords)
  [[0.3, 0.8, 0.5, 1.0]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[100, 200])
>>> print(coords)
  [[0.15, 0.4, 0.25, 0.5]]

Coordinates: ratio units to pixel units

tensorlayer.prepro.obj_box_coord_scale_to_pixelunit(coord, shape=None)[source]

Convert one coordinate [x, y, w (or x2), h (or y2)] in ratio format to image coordinate format. It is the reverse process of obj_box_coord_rescale.

Parameters
  • coord (list of 4 float) -- One coordinate of one image [x, y, w (or x2), h (or y2)] in ratio format, i.e. with values in the range [0, 1].

  • shape (tuple of 2 or None) -- For [height, width].

Returns

New bounding box.

Return type

list of 4 numbers

Examples

>>> x, y, x2, y2 = tl.prepro.obj_box_coord_scale_to_pixelunit([0.2, 0.3, 0.5, 0.7], shape=(100, 200, 3))
  [40, 30, 100, 70]
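
The reverse conversion multiplies by the image size instead of dividing. The sketch below uses a hypothetical helper, `coord_to_pixel_sketch`; rounding to the nearest integer is a choice made for this sketch, and the library may truncate instead.

```python
def coord_to_pixel_sketch(coord, shape):
    """Ratios in [0, 1] -> pixel units; shape starts with (height, width)."""
    imh, imw = shape[0], shape[1]
    x, y, w, h = coord
    # Rounding (rather than truncating) is an assumption of this sketch.
    return [int(round(x * imw)), int(round(y * imh)),
            int(round(w * imw)), int(round(h * imh))]

pixels = coord_to_pixel_sketch([0.2, 0.3, 0.5, 0.7], shape=(100, 200, 3))
```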

Coordinates: [x_center, y_center, w, h] to up-left/bottom-right

tensorlayer.prepro.obj_box_coord_centroid_to_upleft_butright(coord, to_int=False)[source]

Convert one coordinate [x_center, y_center, w, h] to [x1, y1, x2, y2] in up-left and bottom-right format.

Parameters
  • coord (list of 4 int/float) -- One coordinate.

  • to_int (boolean) -- Whether to convert the output to integers.

Returns

New bounding box.

Return type

list of 4 numbers

Examples

>>> coord = obj_box_coord_centroid_to_upleft_butright([30, 40, 20, 20])
  [20, 30, 40, 50]

Coordinates: up-left/bottom-right to [x_center, y_center, w, h]

tensorlayer.prepro.obj_box_coord_upleft_butright_to_centroid(coord)[source]

Convert one coordinate [x1, y1, x2, y2] to [x_center, y_center, w, h]. It is the reverse process of obj_box_coord_centroid_to_upleft_butright.

Parameters

coord (list of 4 int/float) -- One coordinate.

Returns

New bounding box.

Return type

list of 4 numbers

Coordinates: [x_center, y_center, w, h] to up-left/width-height

tensorlayer.prepro.obj_box_coord_centroid_to_upleft(coord)[source]

Convert one coordinate [x_center, y_center, w, h] to [x, y, w, h]. It is the reverse process of obj_box_coord_upleft_to_centroid.

Parameters

coord (list of 4 int/float) -- One coordinate.

Returns

New bounding box.

Return type

list of 4 numbers

Coordinates: up-left/width-height to [x_center, y_center, w, h]

tensorlayer.prepro.obj_box_coord_upleft_to_centroid(coord)[source]

Convert one coordinate [x, y, w, h] to [x_center, y_center, w, h]. It is the reverse process of obj_box_coord_centroid_to_upleft.

Parameters

coord (list of 4 int/float) -- One coordinate.

Returns

New bounding box.

Return type

list of 4 numbers
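
The four box conversions above differ only in whether (x, y) is the centre or the up-left corner and whether the last two numbers are a size or a second corner. The following sketch spells out the arithmetic with hypothetical helper names; it illustrates the formats and is not the library code.

```python
def centroid_to_corners(coord):
    """[x_center, y_center, w, h] -> [x1, y1, x2, y2]."""
    xc, yc, w, h = coord
    return [xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2]

def corners_to_centroid(coord):
    """[x1, y1, x2, y2] -> [x_center, y_center, w, h]."""
    x1, y1, x2, y2 = coord
    return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]

def centroid_to_upleft(coord):
    """[x_center, y_center, w, h] -> [x, y, w, h] with (x, y) the up-left corner."""
    xc, yc, w, h = coord
    return [xc - w / 2, yc - h / 2, w, h]

def upleft_to_centroid(coord):
    """[x, y, w, h] with (x, y) the up-left corner -> [x_center, y_center, w, h]."""
    x, y, w, h = coord
    return [x + w / 2, y + h / 2, w, h]

box = [30, 40, 20, 20]                 # centroid format
corners = centroid_to_corners(box)     # up-left and bottom-right corners
upleft = centroid_to_upleft(box)       # up-left corner plus size
```

Each pair of helpers is a round trip: converting `corners` or `upleft` back gives the original `box`.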

Darknet format: string to list

tensorlayer.prepro.parse_darknet_ann_str_to_list(annotations)[source]

Takes a string in the format class, x, y, w, h and returns a list of lists.

Parameters

annotations (str) -- The annotations in darknet format "class, x, y, w, h ...." separated by "\n".

Returns

List of annotations, one [class, x, y, w, h] entry per bounding box.

Return type

list of list of 5 numbers

Darknet format: split list into classes and boxes

tensorlayer.prepro.parse_darknet_ann_list_to_cls_box(annotations)[source]

Parse darknet annotation format into two lists for class and bounding box.

Takes a list of [[class, x, y, w, h], ...] and returns two lists, [class ...] and [[x, y, w, h], ...].

Parameters

annotations (list of list) -- A list of classes and bounding boxes of images, e.g. [[class, x, y, w, h], ...]

Returns

  • list of int -- List of class labels.

  • list of list of 4 numbers -- List of bounding boxes.
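
The two parsing steps can be sketched together. This is a hedged illustration with hypothetical helper names; it assumes one "class x y w h" record per line with fields separated by whitespace and/or commas, which may be more permissive than the library's own parser.

```python
def parse_ann_str_sketch(annotations):
    """Parse annotation text into [[class, x, y, w, h], ...]."""
    parsed = []
    for line in annotations.split("\n"):
        fields = line.replace(",", " ").split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        parsed.append([int(fields[0])] + [float(v) for v in fields[1:]])
    return parsed

def split_cls_box_sketch(annotations):
    """Split [[class, x, y, w, h], ...] into ([class, ...], [[x, y, w, h], ...])."""
    classes = [ann[0] for ann in annotations]
    boxes = [ann[1:] for ann in annotations]
    return classes, boxes

text = "0 0.5 0.5 0.2 0.4\n3 0.25 0.3 0.1 0.1\n"  # made-up annotation text
classes, boxes = split_cls_box_sketch(parse_ann_str_sketch(text))
```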

Image: flip

tensorlayer.prepro.obj_box_horizontal_flip(im, coords=None, is_rescale=False, is_center=False, is_random=False)[source]

Left-right flip the image and coordinates for object detection.

Parameters
  • im (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • coords (list of list of 4 int/float or None) -- Coordinates [[x, y, w, h], [x, y, w, h], ...].

  • is_rescale (boolean) -- Set to True, if the input coordinates are rescaled to [0, 1]. Default is False.

  • is_center (boolean) -- Set to True, if the x and y of coordinates are the centroid (i.e. darknet format). Default is False.

  • is_random (boolean) -- If True, randomly flip. Default is False.

Returns

  • numpy.array -- A processed image

  • list of list of 4 numbers -- A list of new bounding boxes.

Examples

>>> im = np.zeros([80, 100])    # as an image with shape width=100, height=80
>>> im, coords = obj_box_horizontal_flip(im, coords=[[0.2, 0.4, 0.3, 0.3], [0.1, 0.5, 0.2, 0.3]], is_rescale=True, is_center=True, is_random=False)
>>> print(coords)
  [[0.8, 0.4, 0.3, 0.3], [0.9, 0.5, 0.2, 0.3]]
>>> im, coords = obj_box_horizontal_flip(im, coords=[[0.2, 0.4, 0.3, 0.3]], is_rescale=True, is_center=False, is_random=False)
>>> print(coords)
  [[0.5, 0.4, 0.3, 0.3]]
>>> im, coords = obj_box_horizontal_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=True, is_random=False)
>>> print(coords)
  [[80, 40, 30, 30]]
>>> im, coords = obj_box_horizontal_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=False, is_random=False)
>>> print(coords)
  [[50, 40, 30, 30]]
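
The four cases in the examples follow one rule: only x changes, and it is mirrored around the full width (1 for ratio coordinates, the image width for pixel coordinates). The sketch below covers just the coordinate part with a hypothetical helper, `flip_boxes_sketch`; flipping the image array itself is not shown.

```python
def flip_boxes_sketch(coords, image_width, is_rescale=False, is_center=False):
    """Mirror [x, y, w, h] boxes left-right; y, w and h are unchanged."""
    full_width = 1.0 if is_rescale else image_width
    flipped = []
    for x, y, w, h in coords:
        if is_center:
            new_x = full_width - x        # x is the box centre
        else:
            new_x = full_width - x - w    # x is the left edge
        flipped.append([new_x, y, w, h])
    return flipped

ratio_center = flip_boxes_sketch([[0.2, 0.4, 0.3, 0.3]], 100, is_rescale=True, is_center=True)
ratio_upleft = flip_boxes_sketch([[0.2, 0.4, 0.3, 0.3]], 100, is_rescale=True, is_center=False)
pixel_center = flip_boxes_sketch([[20, 40, 30, 30]], 100, is_center=True)
pixel_upleft = flip_boxes_sketch([[20, 40, 30, 30]], 100, is_center=False)
```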

Image: resize

tensorlayer.prepro.obj_box_imresize(im, coords=None, size=None, interp='bicubic', mode=None, is_rescale=False)[source]

Resize an image, and compute the new bounding box coordinates.

Parameters
  • im (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • coords (list of list of 4 int/float or None) -- Coordinates [[x, y, w, h], [x, y, w, h], ...]

  • size, interp and mode (args) -- See tl.prepro.imresize.

  • is_rescale (boolean) -- Set to True, if the input coordinates are rescaled to [0, 1]; the original coordinates are then returned unchanged. Default is False.

Returns

  • numpy.array -- A processed image

  • list of list of 4 numbers -- A list of new bounding boxes.

Examples

>>> im = np.zeros([80, 100, 3])    # as an image with shape width=100, height=80
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30], [10, 20, 20, 20]], size=[160, 200], is_rescale=False)
>>> print(coords)
  [[40, 80, 60, 60], [20, 40, 40, 40]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[40, 100], is_rescale=False)
>>> print(coords)
  [[20, 20, 30, 15]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[60, 150], is_rescale=False)
>>> print(coords)
  [[30, 30, 45, 22]]
>>> im2, coords = obj_box_imresize(im, coords=[[0.2, 0.4, 0.3, 0.3]], size=[160, 200], is_rescale=True)
>>> print(coords, im2.shape)
  [[0.2, 0.4, 0.3, 0.3]] (160, 200, 3)
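
The box arithmetic behind these examples can be sketched as follows: horizontal values scale with the width ratio, vertical values with the height ratio, and ratio coordinates pass through unchanged. The helper `resize_boxes_sketch` is hypothetical, and truncating to integers is inferred from the example outputs above (22 rather than 22.5).

```python
def resize_boxes_sketch(coords, old_shape, new_size, is_rescale=False):
    """Scale [x, y, w, h] boxes when an image of old_shape (h, w) is resized to new_size [h, w]."""
    if is_rescale:
        # Ratio coordinates do not depend on the image size.
        return [list(c) for c in coords]
    scale_h = new_size[0] / old_shape[0]
    scale_w = new_size[1] / old_shape[1]
    return [[int(x * scale_w), int(y * scale_h), int(w * scale_w), int(h * scale_h)]
            for x, y, w, h in coords]

doubled = resize_boxes_sketch([[20, 40, 30, 30], [10, 20, 20, 20]], (80, 100), [160, 200])
squashed = resize_boxes_sketch([[20, 40, 30, 30]], (80, 100), [40, 100])
stretched = resize_boxes_sketch([[20, 40, 30, 30]], (80, 100), [60, 150])
```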

Image: crop

tensorlayer.prepro.obj_box_crop(im, classes=None, coords=None, wrg=100, hrg=100, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)[source]

Randomly or centrally crop an image, and compute the new bounding box coordinates. Objects outside the cropped image will be removed.

Parameters
  • im (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • classes (list of int or None) -- Class IDs.

  • coords (list of list of 4 int/float or None) -- Coordinates [[x, y, w, h], [x, y, w, h], ...]

  • wrg, hrg and is_random (args) -- See tl.prepro.crop.

  • is_rescale (boolean) -- Set to True, if the input coordinates are rescaled to [0, 1]. Default is False.

  • is_center (boolean, default False) -- Set to True, if the x and y of coordinates are the centroid (i.e. darknet format). Default is False.

  • thresh_wh (float) -- Threshold; remove the box if the ratio of its width (or height) to the image size is less than the threshold.

  • thresh_wh2 (float) -- Threshold; remove the box if its ratio of width to height, or vice versa, is higher than the threshold.

Returns

  • numpy.array -- A processed image

  • list of int -- A list of classes

  • list of list of 4 numbers -- A list of new bounding boxes.
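
One way to read the two thresholds is as a filter applied to each box after the transform. The sketch below is an interpretation of the parameter descriptions, not the library code; the helper `keep_box_sketch` and the sample boxes are made up.

```python
def keep_box_sketch(w, h, im_w, im_h, thresh_wh=0.02, thresh_wh2=12.0):
    """Return True if a box of size (w, h) survives both thresholds."""
    if w <= 0 or h <= 0:
        return False  # nothing left of the box inside the image
    if w / im_w < thresh_wh or h / im_h < thresh_wh:
        return False  # too small relative to the image
    if w / h > thresh_wh2 or h / w > thresh_wh2:
        return False  # too elongated
    return True

boxes = [[10, 10, 50, 40], [0, 0, 2, 40], [5, 5, 130, 10]]  # made-up [x, y, w, h] boxes
kept = [b for b in boxes if keep_box_sketch(b[2], b[3], im_w=200, im_h=200)]
```

Here the second box is dropped for being too narrow (2 / 200 < 0.02) and the third for its aspect ratio (130 / 10 > 12).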

Image: shift

tensorlayer.prepro.obj_box_shift(im, classes=None, coords=None, wrg=0.1, hrg=0.1, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)[source]

Shift an image randomly or non-randomly, and compute the new bounding box coordinates. Objects outside the resulting image will be removed.

Parameters
  • im (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • classes (list of int or None) -- Class IDs.

  • coords (list of list of 4 int/float or None) -- Coordinates [[x, y, w, h], [x, y, w, h], ...]

  • wrg, hrg, row_index, col_index, channel_index, is_random, fill_mode, cval and order (args) -- See tl.prepro.shift.

  • is_rescale (boolean) -- Set to True, if the input coordinates are rescaled to [0, 1]. Default is False.

  • is_center (boolean) -- Set to True, if the x and y of coordinates are the centroid (i.e. darknet format). Default is False.

  • thresh_wh (float) -- Threshold; remove the box if the ratio of its width (or height) to the image size is less than the threshold.

  • thresh_wh2 (float) -- Threshold; remove the box if its ratio of width to height, or vice versa, is higher than the threshold.

Returns

  • numpy.array -- A processed image

  • list of int -- A list of classes

  • list of list of 4 numbers -- A list of new bounding boxes.

Image: zoom

tensorlayer.prepro.obj_box_zoom(im, classes=None, coords=None, zoom_range=(0.9, 1.1), row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)[source]

Zoom in and out of a single image, randomly or non-randomly, and compute the new bounding box coordinates. Objects outside the resulting image will be removed.

Parameters
  • im (numpy.array) -- An image with dimension of [row, col, channel] (default).

  • classes (list of int or None) -- Class IDs.

  • coords (list of list of 4 int/float or None) -- Coordinates [[x, y, w, h], [x, y, w, h], ...].

  • zoom_range, row_index, col_index, channel_index, is_random, fill_mode, cval and order (args) -- See tl.prepro.zoom.

  • is_rescale (boolean) -- Set to True, if the input coordinates are rescaled to [0, 1]. Default is False.

  • is_center (boolean) -- Set to True, if the x and y of coordinates are the centroid (i.e. darknet format). Default is False.

  • thresh_wh (float) -- Threshold; remove the box if the ratio of its width (or height) to the image size is less than the threshold.

  • thresh_wh2 (float) -- Threshold; remove the box if its ratio of width to height, or vice versa, is higher than the threshold.

Returns

  • numpy.array -- A processed image

  • list of int -- A list of classes

  • list of list of 4 numbers -- A list of new bounding boxes.

Keypoints

Image: crop

tensorlayer.prepro.keypoint_random_crop(image, annos, mask=None, size=(368, 368))[source]

Randomly crop an image and its corresponding keypoints without changing the scale; typically applied to the output of keypoint_random_resize_shortestedge.

Parameters
  • image (3 channel image) -- The given image for augmentation.

  • annos (list of list of floats) -- The keypoints annotation of people.

  • mask (single channel image or None) -- The mask if available.

  • size (tuple of int) -- The size of returned image.

Returns

The preprocessed image, annotation and mask.

Image: rotate

tensorlayer.prepro.keypoint_random_rotate(image, annos, mask=None, rg=15.0)[source]

Rotate an image and corresponding keypoints.

Parameters
  • image (3 channel image) -- The given image for augmentation.

  • annos (list of list of floats) -- The keypoints annotation of people.

  • mask (single channel image or None) -- The mask if available.

  • rg (int or float) -- Degree to rotate, usually 0 ~ 180.

Returns

The preprocessed image, annos and mask.

Image: flip

tensorlayer.prepro.keypoint_random_flip(image, annos, mask=None, prob=0.5, flip_list=(0, 1, 5, 6, 7, 2, 3, 4, 11, 12, 13, 8, 9, 10, 15, 14, 17, 16, 18))[source]

Flip an image and corresponding keypoints.

Parameters
  • image (3 channel image) -- The given image for augmentation.

  • annos (list of list of floats) -- The keypoints annotation of people.

  • mask (single channel image or None) -- The mask if available.

  • prob (float, 0 to 1) -- The probability to flip the image, if 1, always flip the image.

  • flip_list (tuple of int) -- Denotes how the keypoint indices are remapped after flipping, which is required for pose estimation tasks: left and right body parts should be maintained rather than switched. (Default COCO format). Set to an empty tuple if you don't need to maintain left and right information.

Returns

The preprocessed image, annos and mask.

Image: resize

tensorlayer.prepro.keypoint_random_resize(image, annos, mask=None, zoom_range=(0.8, 1.2))[source]

Randomly resize an image and corresponding keypoints. The height and width of the image are changed independently, so the scale will change.

Parameters
  • image (3 channel image) -- The given image for augmentation.

  • annos (list of list of floats) -- The keypoints annotation of people.

  • mask (single channel image or None) -- The mask if available.

  • zoom_range (tuple of two floats) -- The minimum and maximum factor to zoom in or out, e.g. (0.5, 1) means zooming out by 1 to 2 times.

Returns

The preprocessed image, annos and mask.

Image: resize by shortest edge

tensorlayer.prepro.keypoint_random_resize_shortestedge(image, annos, mask=None, min_size=(368, 368), zoom_range=(0.8, 1.2), pad_val=(0, 0, numpy.random.uniform))[source]

Randomly resize an image and corresponding keypoints based on the shorter edge. If the resized image is smaller than min_size, padding is used to make the shape match min_size. The height and width of the image are changed together, so the scale does not change.

Parameters
  • image (3 channel image) -- The given image for augmentation.

  • annos (list of list of floats) -- The keypoints annotation of people.

  • mask (single channel image or None) -- The mask if available.

  • min_size (tuple of two int) -- The minimum size of height and width.

  • zoom_range (tuple of two floats) -- The minimum and maximum factor to zoom in or out, e.g. (0.5, 1) means zooming out by 1 to 2 times.

  • pad_val (int/float, or tuple of int or random function) -- The three padding values for RGB channels respectively.

Returns

The preprocessed image, annos and mask.

Sequence

For more related functions, see tensorlayer.nlp

Padding

tensorlayer.prepro.pad_sequences(sequences, maxlen=None, dtype='int32', padding='post', truncating='pre', value=0.0)[source]

Pads each sequence to the same length: the length of the longest sequence. If maxlen is provided, any sequence longer than maxlen is truncated to maxlen. Truncation removes values from either the beginning (default) or the end of the sequence. Supports pre-padding and post-padding (default).

Parameters
  • sequences (list of list of int) -- All sequences where each row is a sequence.

  • maxlen (int) -- Maximum length.

  • dtype (numpy.dtype or str) -- Data type to cast the resulting sequence.

  • padding (str) -- Either 'pre' or 'post', pad either before or after each sequence.

  • truncating (str) -- Either 'pre' or 'post', remove values from sequences longer than maxlen either at the beginning or at the end of the sequence.

  • value (float) -- Value used to pad the sequences to the desired length.

Returns

x -- With dimensions (number_of_sequences, maxlen)

Return type

numpy.array

Examples

>>> sequences = [[1,1,1,1,1],[2,2,2],[3,3]]
>>> sequences = pad_sequences(sequences, maxlen=None, dtype='int32',
...                  padding='post', truncating='pre', value=0.)
[[1 1 1 1 1]
 [2 2 2 0 0]
 [3 3 0 0 0]]
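
The interaction of maxlen, padding and truncating can be sketched in plain Python. The helper `pad_sequences_sketch` is a hypothetical stand-in that returns nested lists instead of a numpy array and ignores dtype.

```python
def pad_sequences_sketch(sequences, maxlen=None, padding='post', truncating='pre', value=0):
    """Truncate and pad every sequence to the same length."""
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # 'pre' drops values from the beginning, 'post' from the end.
            seq = seq[len(seq) - maxlen:] if truncating == 'pre' else seq[:maxlen]
        fill = [value] * (maxlen - len(seq))
        padded.append(seq + fill if padding == 'post' else fill + seq)
    return padded

padded = pad_sequences_sketch([[1, 1, 1, 1, 1], [2, 2, 2], [3, 3]])
truncated_pre = pad_sequences_sketch([[1, 2, 3, 4]], maxlen=2)
truncated_post = pad_sequences_sketch([[1, 2, 3, 4]], maxlen=2, truncating='post')
```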

Remove Padding

tensorlayer.prepro.remove_pad_sequences(sequences, pad_id=0)[source]

Remove padding.

Parameters
  • sequences (list of list of int) -- All sequences where each row is a sequence.

  • pad_id (int) -- The pad ID.

Returns

The processed sequences.

Return type

list of list of int

Examples

>>> sequences = [[2,3,4,0,0], [5,1,2,3,4,0,0,0], [4,5,0,2,4,0,0,0]]
>>> print(remove_pad_sequences(sequences, pad_id=0))
[[2, 3, 4], [5, 1, 2, 3, 4], [4, 5, 0, 2, 4]]
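
As the third sequence in the example suggests, only trailing pad IDs are removed; a pad ID in the middle of a sequence is kept. A minimal sketch of that behaviour, using a hypothetical helper name:

```python
def remove_pad_sketch(sequences, pad_id=0):
    """Strip pad_id values from the end of each sequence only."""
    out = []
    for seq in sequences:
        end = len(seq)
        while end > 0 and seq[end - 1] == pad_id:
            end -= 1
        out.append(list(seq[:end]))
    return out

cleaned = remove_pad_sketch([[2, 3, 4, 0, 0], [5, 1, 2, 3, 4, 0, 0, 0], [4, 5, 0, 2, 4, 0, 0, 0]])
```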

Process

tensorlayer.prepro.process_sequences(sequences, end_id=0, pad_val=0, is_shorten=True, remain_end_id=False)[source]

Set all tokens (IDs) after the END token to the padding value, and then optionally shorten the sequences to the maximum sequence length in this batch.

Parameters
  • sequences (list of list of int) -- All sequences where each row is a sequence.

  • end_id (int) -- The special token for END.

  • pad_val (int) -- Replace the end_id and the IDs after end_id with this value.

  • is_shorten (boolean) -- Shorten the sequences. Default is True.

  • remain_end_id (boolean) -- Keep an end_id at the end. Default is False.

Returns

The processed sequences.

Return type

list of list of int

Examples

>>> sentences_ids = [[4, 3, 5, 3, 2, 2, 2, 2],  # end_id is 2
...                  [5, 3, 9, 4, 9, 2, 2, 3]]  # end_id is 2
>>> sentences_ids = process_sequences(sentences_ids, end_id=vocab.end_id, pad_val=0, is_shorten=True)
[[4, 3, 5, 3, 0], [5, 3, 9, 4, 9]]
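
A plain-Python sketch of the same steps follows. The helper name is hypothetical, and how a sequence without any end_id is handled is an assumption of the sketch (it is treated as all content); the library may behave differently in that case.

```python
def process_sequences_sketch(sequences, end_id=0, pad_val=0, is_shorten=True, remain_end_id=False):
    """Pad everything from the first end_id onward, then cut to the longest content."""
    processed, max_length = [], 0
    for seq in sequences:
        seq = list(seq)
        if end_id in seq:
            first_end = seq.index(end_id)
            keep = first_end + 1 if remain_end_id else first_end
            seq = seq[:keep] + [pad_val] * (len(seq) - keep)
            max_length = max(max_length, keep)
        else:
            # Assumption: a sequence without end_id is kept whole.
            max_length = max(max_length, len(seq))
        processed.append(seq)
    if is_shorten:
        processed = [seq[:max_length] for seq in processed]
    return processed

batch = [[4, 3, 5, 3, 2, 2, 2, 2], [5, 3, 9, 4, 9, 2, 2, 3]]
shortened = process_sequences_sketch(batch, end_id=2, pad_val=0)
with_end = process_sequences_sketch(batch, end_id=2, pad_val=0, remain_end_id=True)
```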

Add Start ID

tensorlayer.prepro.sequences_add_start_id(sequences, start_id=0, remove_last=False)[source]

Add a special start token (ID) at the beginning of each sequence.

Parameters
  • sequences (list of list of int) -- All sequences where each row is a sequence.

  • start_id (int) -- The start ID.

  • remove_last (boolean) -- Remove the last value of each sequence. Usually used for removing the end ID.

Returns

The processed sequences.

Return type

list of list of int

Examples

>>> sentences_ids = [[4,3,5,3,2,2,2,2], [5,3,9,4,9,2,2,3]]
>>> print(sequences_add_start_id(sentences_ids, start_id=2))
[[2, 4, 3, 5, 3, 2, 2, 2, 2], [2, 5, 3, 9, 4, 9, 2, 2, 3]]
>>> print(sequences_add_start_id(sentences_ids, start_id=2, remove_last=True))
[[2, 4, 3, 5, 3, 2, 2, 2], [2, 5, 3, 9, 4, 9, 2, 2]]

For Seq2seq

>>> input = [a, b, c]
>>> target = [x, y, z]
>>> decode_seq = [start_id, a, b]  # from sequences_add_start_id(input, start_id, True)

Add End ID

tensorlayer.prepro.sequences_add_end_id(sequences, end_id=888)[source]

Add a special end token (ID) at the end of each sequence.

Parameters
  • sequences (list of list of int) -- All sequences where each row is a sequence.

  • end_id (int) -- The end ID.

Returns

The processed sequences.

Return type

list of list of int

Examples

>>> sequences = [[1,2,3],[4,5,6,7]]
>>> print(sequences_add_end_id(sequences, end_id=999))
[[1, 2, 3, 999], [4, 5, 6, 7, 999]]
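
The start and end IDs are often used together when preparing decoder data for Seq2seq training. The sketch below shows one common arrangement, with the decoder input being the target shifted right by one step. The helper names, token IDs and data are hypothetical, and this arrangement is an assumption of the sketch rather than something the library prescribes.

```python
def add_start_id_sketch(sequences, start_id=0, remove_last=False):
    """Prepend start_id; optionally drop the last value of each sequence."""
    return [[start_id] + (list(s[:-1]) if remove_last else list(s)) for s in sequences]

def add_end_id_sketch(sequences, end_id=888):
    """Append end_id to each sequence."""
    return [list(s) + [end_id] for s in sequences]

START_ID, END_ID = 1, 2               # hypothetical special token IDs
answers = [[10, 11, 12], [20, 21]]    # made-up token IDs

# What the decoder should predict: the answer followed by END.
target_seqs = add_end_id_sketch(answers, END_ID)
# What the decoder is fed: START followed by the target without its last token.
decode_seqs = add_start_id_sketch(target_seqs, START_ID, remove_last=True)
```

Each decoder input then has the same length as its target, and position i of the input lines up with position i of the target.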

Add End ID after pad

tensorlayer.prepro.sequences_add_end_id_after_pad(sequences, end_id=888, pad_id=0)[source]

Add a special end token (ID) to each padded sequence by replacing its first pad ID; sequences without padding are left unchanged.

Parameters
  • sequences (list of list of int) -- All sequences where each row is a sequence.

  • end_id (int) -- The end ID.

  • pad_id (int) -- The pad ID.

Returns

The processed sequences.

Return type

list of list of int

Examples

>>> sequences = [[1,2,0,0], [1,2,3,0], [1,2,3,4]]
>>> print(sequences_add_end_id_after_pad(sequences, end_id=99, pad_id=0))
[[1, 2, 99, 0], [1, 2, 3, 99], [1, 2, 3, 4]]
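
The example output implies that the first pad ID in each sequence is overwritten, so the sequence length never changes. A minimal sketch of that reading, with a hypothetical helper name:

```python
def add_end_id_after_pad_sketch(sequences, end_id=888, pad_id=0):
    """Replace the first pad_id in each sequence with end_id, if there is one."""
    out = []
    for seq in sequences:
        seq = list(seq)  # copy so the input is not modified
        if pad_id in seq:
            seq[seq.index(pad_id)] = end_id
        out.append(seq)
    return out

with_end = add_end_id_after_pad_sketch([[1, 2, 0, 0], [1, 2, 3, 0], [1, 2, 3, 4]], end_id=99, pad_id=0)
```

A full-length sequence such as [1, 2, 3, 4] has no room for the end ID and is returned as it is.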

Get Mask

tensorlayer.prepro.sequences_get_mask(sequences, pad_val=0)[source]

Return the mask for sequences.

Parameters
  • sequences (list of list of int) -- All sequences where each row is a sequence.

  • pad_val (int) -- The pad value.

Returns

The mask.

Return type

list of list of int

Examples

>>> sentences_ids = [[4, 0, 5, 3, 0, 0],
...                  [5, 3, 9, 4, 9, 0]]
>>> mask = sequences_get_mask(sentences_ids, pad_val=0)
[[1 1 1 1 0 0]
 [1 1 1 1 1 0]]
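
In the example, the 0 in the middle of the first sequence is still masked as 1, so only trailing pad values count as padding. The sketch below reproduces that behaviour with nested lists; the helper name is hypothetical.

```python
def sequences_get_mask_sketch(sequences, pad_val=0):
    """1 for every position up to the last non-pad value, 0 for trailing padding."""
    mask = []
    for seq in sequences:
        last = len(seq)
        while last > 0 and seq[last - 1] == pad_val:
            last -= 1
        mask.append([1] * last + [0] * (len(seq) - last))
    return mask

mask = sequences_get_mask_sketch([[4, 0, 5, 3, 0, 0], [5, 3, 9, 4, 9, 0]], pad_val=0)
lengths = [sum(row) for row in mask]  # effective sequence lengths
```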