Pattern mining
This package provides functionality for mining sequential patterns in a symbolic representation of a time series. The pattern mining algorithms can be imported as follows:
>>> from patsemb import pattern_mining
All mining algorithms inherit from the PatternMiner class, which can be used to mine sequential patterns via the mine() method.
- class patsemb.pattern_mining.PatternMiner
Mine patterns in a discrete representation of the time series.
- abstract mine(discrete_sequences: ndarray, y=None) → List[array]
Mine patterns in the given discrete representation of the time series.
- Parameters:
discrete_sequences (np.array of shape (n_symbolic_sequences, length_symbolic_sequences)) – The discrete representation of a time series. This representation consists of n_symbolic_sequences subsequences, each with length_symbolic_sequences symbols. The sequences are provided as the rows of the given input matrix.
y (Ignored) – Not used, present here for API consistency by convention.
- Returns:
patterns – The list of mined patterns.
- Return type:
List[np.array]
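As a rough illustration of this interface, the sketch below defines a stand-in abstract base class with the same mine() signature and a toy subclass that returns every contiguous n-gram occurring in at least min_count rows. It does not import patsemb and is not the package's implementation: PatternMinerSketch, NGramMiner, n, and min_count are hypothetical names, and plain lists and tuples stand in for the np.array input and output.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import List, Sequence, Tuple


class PatternMinerSketch(ABC):
    """Stand-in for the documented base class (hypothetical, not patsemb's code)."""

    @abstractmethod
    def mine(
        self, discrete_sequences: Sequence[Sequence[int]], y=None
    ) -> List[Tuple[int, ...]]:
        """Return the patterns mined from the rows of ``discrete_sequences``."""


class NGramMiner(PatternMinerSketch):
    """Toy miner: contiguous n-grams that occur in at least ``min_count`` rows."""

    def __init__(self, n: int = 2, min_count: int = 2):
        self.n = n
        self.min_count = min_count

    def mine(self, discrete_sequences, y=None):
        counts = Counter()
        for row in discrete_sequences:
            # Collect the n-grams of this row in a set so that each one
            # is counted at most once per row.
            grams = {
                tuple(row[i:i + self.n])
                for i in range(len(row) - self.n + 1)
            }
            counts.update(grams)
        # Keep the n-grams that occur in enough rows, in a stable order.
        return sorted(g for g, c in counts.items() if c >= self.min_count)


# Each row is one symbolic sequence, as in the documented input matrix.
sequences = [
    [0, 1, 2, 1],
    [1, 2, 0, 1],
    [2, 2, 1, 0],
]
patterns = NGramMiner(n=2, min_count=2).mine(sequences)
```

Because mine() is abstract, the base class itself cannot be instantiated; only subclasses that implement mine() can be used.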
Frequent sequential patterns
Frequent sequential patterns are sequential patterns that occur often in the set of symbolic words. This ensures that the mined patterns correspond to typical shapes of the time series. Extensive research in frequent pattern mining has resulted in numerous pattern-mining algorithms. We rely on the SPMF library [FLGG16] to mine frequent sequential patterns.
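To make "frequent" concrete, the sketch below computes the relative support of a sequential pattern, taken here as the fraction of symbolic words that contain the pattern's symbols in order, with gaps allowed. Allowing gaps and the chosen threshold are assumptions made for illustration; this is neither an SPMF algorithm nor the way this package invokes SPMF. The names is_subsequence, relative_support, and min_support are hypothetical.

```python
from typing import Sequence


def is_subsequence(pattern: Sequence[int], word: Sequence[int]) -> bool:
    """True if the symbols of ``pattern`` appear in ``word`` in order (gaps allowed)."""
    remaining = iter(word)
    # ``symbol in remaining`` consumes the iterator up to the first match,
    # so each later symbol is only searched for after the previous one.
    return all(symbol in remaining for symbol in pattern)


def relative_support(pattern: Sequence[int], words: Sequence[Sequence[int]]) -> float:
    """Fraction of words that contain ``pattern`` as a subsequence."""
    return sum(is_subsequence(pattern, word) for word in words) / len(words)


words = [
    [0, 1, 2, 1],
    [1, 2, 0, 1],
    [2, 2, 1, 0],
    [0, 2, 1, 1],
]
min_support = 0.75  # assumed threshold, chosen only for this example
candidates = [(0, 1), (2, 1), (1, 0)]

# Keep the candidate patterns whose relative support reaches the threshold.
frequent = [p for p in candidates if relative_support(p, words) >= min_support]
```

A dedicated miner avoids enumerating candidates by hand, but the support computation above is the quantity that such a threshold is applied to.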
[FLGG16] Fournier-Viger, P., Lin, C.W., Gomariz, A., Gueniche, T., Soltani, A., Deng, Z., Lam, H.T. (2016). The SPMF Open-Source Data Mining Library Version 2. Proc. 19th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2016), Part III. https://doi.org/10.1007/978-3-319-46131-1_8