Utils
Various utilities unrelated to trees or profiles.
- sap.utils.ndarray_hash(x, l=8, c=1000)
Compute a hash from a numpy array.
- Parameters
x (ndarray) – The array to hash.
l (int, optional) – The length of the hash. Must be an even number.
c (int, optional) – A variable that affects the sampling of the hash. It must stay the same throughout the matching process. Refer to the notes.
- Returns
hash – The hash of array x.
- Return type
str
Notes
Python's built-in hash is slow, and its output is randomized in each kernel, so the hash of the same data will not match across kernels.
The idea is to sparsely sample the data to speed up the hash computation. By fixing the number of samples the hash computation will take a fixed amount of time, no matter the size of the data.
This hash function outputs a hash of \(x\) in hexadecimal. The length of the hash is \(l\). The hashes are consistent when tuning the length \(l\): shorter hashes are contained in the longer ones for the same data \(x\). The number of samples taken in \(x\) is \(\frac{l \times c}{2}\).
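The notes above can be illustrated with a minimal sketch. This is not the sap implementation: the name sparse_hash, the seed parameter, and the per-byte seeding scheme are assumptions made for illustration. Each of the \(l / 2\) output bytes sums \(c\) randomly sampled bytes of the data, so \(\frac{l \times c}{2}\) samples are taken in total, whatever the size of the array.

```python
import numpy as np


def sparse_hash(x, l=8, c=1000, seed=42):
    """Illustrative sparse-sampling hash (hypothetical, not sap's code)."""
    if l % 2:
        raise ValueError("l must be an even number")
    # View the array memory as raw bytes; this is copy-free when x is
    # already C-contiguous.
    raw = np.ascontiguousarray(x).reshape(-1).view(np.uint8)
    digest = []
    for j in range(l // 2):
        # One generator per output byte, seeded by (seed, j), so byte j
        # does not depend on the requested length l.
        rng = np.random.default_rng([seed, j])
        idx = rng.integers(0, raw.size, size=c)  # c sampled byte positions
        digest.append(int(raw[idx].sum(dtype=np.uint64)) % 256)
    # Two hexadecimal characters per byte give a string of length l.
    return "".join(f"{b:02x}" for b in digest)
```

In this sketch a shorter hash is a prefix of a longer one computed on the same data, because each output byte is derived independently of \(l\). The fixed seed makes the result reproducible across kernels, and changing c changes the sampled positions, so c must be kept constant when hashes are compared.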
- sap.utils.local_patch(arr, patch_size=7)
Create local patches around each value of the array.
- Parameters
arr (ndarray) – The input data.
patch_size (int) – The size \(w\) of the patches. For a 2D ndarray the returned patch size will be \(w \times w\).
- Returns
patches – The local patches. The shape of the returned array is arr.shape + (patch_size,) * arr.ndim.
- Return type
ndarray
Notes
This implementation is memory efficient. The returned patches are a view of the original array and are not writeable.
This function works regardless of the number of dimensions of arr: the patches are hypercubes with as many dimensions as arr.
See also
local_patch_f
Use a function over the local patches.
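The following sketch shows one way to obtain patches of the documented shape as a read-only view. It is not the sap implementation: the name local_patch_sketch, the restriction to odd patch sizes, and the edge-value padding at the borders are assumptions, and it relies on numpy.lib.stride_tricks.sliding_window_view, available from NumPy 1.20.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_patch_sketch(arr, patch_size=7):
    """Illustrative patch extraction (hypothetical, not sap's code)."""
    if patch_size % 2 == 0:
        raise ValueError("this sketch assumes an odd patch_size")
    # Assumption: borders are handled by repeating edge values.
    padded = np.pad(arr, patch_size // 2, mode="edge")
    # One window per input value; the result is a read-only strided view
    # of `padded`, so the patches themselves are never copied.
    return sliding_window_view(padded, (patch_size,) * arr.ndim)
```

For a 2D array of shape (3, 4) and patch_size=3, the result has shape (3, 4, 3, 3), and the center of each patch is the corresponding value of the input.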
- sap.utils.local_patch_f(arr, patch_size=7, f=np.mean)
Describe local patches around each value of the array.
- Parameters
arr (ndarray) – The input data.
patch_size (int) – The size \(w\) of the patches.
f (function) – The function to run over the local patches. For now it is necessary to use a function with an axis parameter, such as np.mean, np.std, etc. See the NumPy documentation for more functions.
- Returns
patches – The description of the local patches. The shape of the returned array is arr.shape.
- Return type
ndarray
Notes
Refer to local_patch() for full documentation.
See also
local_patch
Create the local patches.
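The requirement for an axis parameter can be illustrated with a short sketch: the function is applied over the trailing patch axes, which reduces each patch to a single value and brings the result back to arr.shape. This is not the sap implementation; the name local_patch_f_sketch, the odd patch size, and the edge-value padding are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_patch_f_sketch(arr, patch_size=7, f=np.mean):
    """Illustrative patch reduction (hypothetical, not sap's code)."""
    # Assumptions: odd patch_size and edge-value padding at the borders.
    padded = np.pad(arr, patch_size // 2, mode="edge")
    patches = sliding_window_view(padded, (patch_size,) * arr.ndim)
    # The patch axes are the trailing arr.ndim axes; reducing over them
    # leaves one value per input position.
    patch_axes = tuple(range(arr.ndim, 2 * arr.ndim))
    return f(patches, axis=patch_axes)
```

Passing f=np.std in place of np.mean gives the local standard deviation in the same way, since both accept a tuple of axes.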