Utils

Various utilities unrelated to trees or profiles.

sap.utils.ndarray_hash(x, l=8, c=1000)

Compute a hash from a numpy array.

Parameters
  • x (ndarray) – The array to hash.

  • l (int, optional) – The length of the hash. Must be an even number.

  • c (int, optional) – A factor that controls how many samples are taken to compute the hash. It must stay the same throughout the matching process. Refer to the notes.

Returns

hash – The hash of array x.

Return type

str

Notes

Python's built-in hash is slow and is randomized in each kernel, so the hash of the same data will not match across different kernels.

The idea is to sparsely sample the data to speed up the hash computation. With a fixed number of samples, the hash computation takes a fixed amount of time, no matter the size of the data.

This hash function outputs a hash of \(x\) in hexadecimal. The length of the hash is \(l\). The hashes are consistent when tuning the length \(l\): for the same data \(x\), shorter hashes are contained in the longer ones. The number of samples taken in \(x\) is \(\frac{l \times c}{2}\).
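The sampling idea described in these notes can be illustrated with a minimal sketch. This is not the sap implementation: the function name sparse_hash, the seed parameter, the use of a local RandomState and the choice of hashlib.blake2b are all assumptions made for illustration. Each pair of hexadecimal characters is computed from c sampled values, so the total number of samples is \(\frac{l \times c}{2}\).

```python
import hashlib

import numpy as np


def sparse_hash(x, l=8, c=1000, seed=0):
    """Hypothetical sketch of a sparse-sampling hash (not sap's code)."""
    if l % 2:
        raise ValueError("l must be an even number")
    flat = np.ascontiguousarray(x).ravel()
    # A local generator with a fixed seed: the global NumPy random state is
    # left untouched and the sampled positions are the same in every kernel.
    rng = np.random.RandomState(seed)
    parts = []
    for _ in range(l // 2):
        # c samples produce one byte, i.e. two hexadecimal characters.
        idx = rng.randint(0, flat.size, size=c)
        digest = hashlib.blake2b(flat[idx].tobytes(), digest_size=1)
        parts.append(digest.hexdigest())
    return "".join(parts)


x = np.arange(10000, dtype=np.float64).reshape(100, 100)
short_hash = sparse_hash(x, l=4)
long_hash = sparse_hash(x, l=8)
```

Because the indices are drawn in the same order whatever the value of l, the shorter hash in this sketch is a prefix of the longer one. Changing c or seed changes the sampled positions, which is why such parameters have to stay fixed while comparing hashes.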

sap.utils.local_patch(arr, patch_size=7)

Create local patches around each value of the array.

Parameters
  • arr (ndarray) – The input data.

  • patch_size (int) – The size \(w\) of the patches. For a 2D ndarray the returned patch size will be \(w \times w\).

Returns

patches – The local patches. The shape of the returned array is arr.shape + (patch_size,) * arr.ndim.

Return type

ndarray

Notes

This implementation is memory efficient. The returned patches are a view of the original array and are not writeable.

This function works regardless of the dimension of arr: the patches are hypercubes whose dimension matches that of arr.
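A read-only view with the documented output shape can be sketched with NumPy alone. The code below is an illustration, not the sap implementation: the name local_patch_sketch, the pad_mode parameter, the reflect padding at the borders and the restriction to odd patch sizes are assumptions. It relies on numpy.lib.stride_tricks.sliding_window_view, available in NumPy 1.20 and later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_patch_sketch(arr, patch_size=7, pad_mode="reflect"):
    """Hypothetical sketch: one hypercube patch per value of arr."""
    if patch_size % 2 == 0:
        raise ValueError("this sketch only handles odd patch sizes")
    arr = np.asarray(arr)
    half = patch_size // 2
    # Pad every axis so that border values also get a full patch.
    # The padding mode is an assumption made for this sketch.
    padded = np.pad(arr, half, mode=pad_mode)
    # The windows are a read-only view of the padded array: no patch is copied.
    return sliding_window_view(padded, (patch_size,) * arr.ndim)


arr = np.arange(20.0).reshape(4, 5)
patches = local_patch_sketch(arr, patch_size=3)
```

Here patches has shape arr.shape + (3, 3), and the centre of each patch is the value it was built around. Only the padded copy of the array is allocated; the patches themselves share its memory.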

See also

local_patch_f

use a function over the local patches.

sap.utils.local_patch_f(arr, patch_size=7, f=np.mean)

Describe local patches around each value of the array.

Parameters
  • arr (ndarray) – The input data.

  • patch_size (int) – The size \(w\) of the patches.

  • f (function) – The function to run over the local patches. For now, the function must accept an axis parameter, such as np.mean, np.std, etc. See the NumPy documentation for more functions.

Returns

patches – The description of the local patches. The shape of the returned array is arr.shape.

Return type

ndarray

Notes

Refer to local_patch() for full documentation.
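The requirement that f accept an axis parameter can be illustrated with a short sketch: the function is applied over the trailing patch axes, so the result has the shape of arr. This is not the sap implementation; the name local_patch_f_sketch, the pad_mode parameter and the reflect padding are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_patch_f_sketch(arr, patch_size=7, f=np.mean, pad_mode="reflect"):
    """Hypothetical sketch: reduce each local patch to a single value."""
    arr = np.asarray(arr)
    half = patch_size // 2
    # Border handling is an assumption of this sketch (odd patch sizes only).
    padded = np.pad(arr, half, mode=pad_mode)
    patches = sliding_window_view(padded, (patch_size,) * arr.ndim)
    # The last arr.ndim axes index the values inside each patch.
    patch_axes = tuple(range(arr.ndim, 2 * arr.ndim))
    return f(patches, axis=patch_axes)


arr = np.arange(20.0).reshape(4, 5)
local_mean = local_patch_f_sketch(arr, patch_size=3, f=np.mean)
local_std = local_patch_f_sketch(arr, patch_size=3, f=np.std)
```

Any reduction that accepts a tuple for axis, such as np.mean, np.std or np.max, fits this pattern.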

See also

local_patch

create the local patches.