petrobras / bibmon

Python package that provides predictive models for fault detection, soft sensing, and process condition monitoring.

Home Page: https://bibmon.readthedocs.io/

License: Apache License 2.0

Languages: Python 63.16%, Fortran 36.84%
Topics: echo-state-network, fault-detection, fault-diagnosis, machine-learning, neural-networks, pca, process-monitoring, process-systems-engineering, scikit-learn, soft-sensor, time-series

Contributors: afraniomelo, camaramm, deriss, tsmlemos, yelken

bibmon's Issues

Add or test new regression models

How we are today

BibMon currently supports the following regression models:

  • Echo State Networks (ESN);
  • all regression models that use the scikit-learn interface.

The second item significantly broadens BibMon's capabilities by enabling the use of a wide array of methods already implemented in Python that conform to the scikit-learn interface.
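
As a rough illustration of what "conforming to the scikit-learn interface" means, the sketch below defines a toy regressor with the usual fit/predict methods. The class name MeanRegressor is hypothetical, and whether BibMon's sklearnRegressor accepts a duck-typed class like this one has not been checked here; in practice a custom estimator would normally inherit from sklearn.base.BaseEstimator and sklearn.base.RegressorMixin.

```python
class MeanRegressor:
    """Hypothetical baseline that always predicts the training-target mean.

    It follows the scikit-learn conventions (fit returns self, predict
    returns one value per row) without importing scikit-learn itself.
    """

    def fit(self, X, y):
        y = list(y)
        if not y:
            raise ValueError("y must contain at least one target value")
        # By convention, attributes learned during fit end with "_".
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        # One prediction per input row, regardless of the feature values.
        return [self.mean_ for _ in X]


X_train = [[1.0, 2.0], [2.0, 1.0], [3.0, 0.5]]
y_train = [10.0, 20.0, 30.0]

model = MeanRegressor().fit(X_train, y_train)
predictions = model.predict([[0.0, 0.0], [5.0, 5.0]])
print(predictions)
```

Any regressor exposing the same two methods can be swapped in at the same place, which is why the scikit-learn convention is a convenient integration point.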

Proposed enhancement

New regression models can be integrated by inheriting from the GenericModel class. For instructions, please refer to the contributing guide.

If the model's API adheres to the scikit-learn template, it can be directly utilized through the sklearnRegressor class, as detailed in this tutorial. Contributors are encouraged to create Jupyter notebooks that demonstrate the use of these regressors. The notebooks can be incorporated into BibMon's documentation as new tutorials. Some suggested models include:

Add new preprocessing techniques

How we are today

Data preprocessing is handled by the PreProcess class, which contains various preprocessing methods such as normalization, removal of NaN observations, and more. These self-contained methods are responsible for preparing raw data to be fed into training models.
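
To make the two steps named above concrete, here is a minimal standalone sketch of NaN removal and normalization. The function names are illustrative only and do not reproduce the actual methods or signatures of the PreProcess class.

```python
import math
from statistics import mean, pstdev


def remove_nan_rows(rows):
    """Drop every observation (row) that contains at least one NaN."""
    return [row for row in rows if not any(math.isnan(v) for v in row)]


def normalize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(v - m) / s for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


raw = [[1.0, 10.0], [float("nan"), 20.0], [3.0, 30.0]]
clean = remove_nan_rows(raw)
scaled = normalize(clean)
print(clean)
print(scaled)
```

In a monitoring workflow the means and deviations would normally be estimated on the training data and then reused unchanged on test data; this sketch omits that bookkeeping.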

Proposed enhancement

BibMon is open to incorporating any preprocessing method that could be beneficial for sequential data. Contributors are encouraged to consult specific bibliographies on the topic to identify valuable techniques, such as https://doi.org/10.1016/j.jer.2024.02.018 or https://doi.org/10.1515/revce-2015-0022.

Implementation

To introduce new preprocessing techniques, developers should implement them as methods of the PreProcess class, adhering to the guidelines provided in the contributing guide.

Add support for classification-based models

How we are today

Currently, BibMon uses a deviation-based methodology, which focuses on monitoring deviations from expected values or patterns. In this framework, algorithms are designed to compare actual sensor measurements with model representations to identify process anomalies. A metric called SPE (Squared Prediction Error) is used to compute the deviation between actual and expected values. SPE values are computed from regression or reconstruction models. For more details, please check https://doi.org/10.1016/j.dche.2024.100182.
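
The deviation idea can be sketched in a few lines. The version below assumes the common definition of SPE as the sum of squared residuals across variables for each observation; the exact computation used inside BibMon is described in the paper linked above and is not reproduced here.

```python
def squared_prediction_error(actual, predicted):
    """Return one SPE value per observation: the sum of squared residuals."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same number of rows")
    return [
        sum((a - p) ** 2 for a, p in zip(row_actual, row_predicted))
        for row_actual, row_predicted in zip(actual, predicted)
    ]


measured = [[1.0, 2.0], [3.0, 5.0]]   # sensor readings, one row per sample
expected = [[1.0, 2.0], [2.0, 3.0]]   # model output for the same samples

spe = squared_prediction_error(measured, expected)
print(spe)
```

The first sample matches the model exactly, so its SPE is zero; the second deviates in both variables and gets a larger value, which is what a control chart would then compare against a limit.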

Proposed new feature

Some applications would benefit more from classification models rather than regression or reconstruction models. In this scenario, the model would analyze sample data and attribute a probability indicating the likelihood that the data corresponds to a faulty event. This probability would be analogous to the SPE currently used in the package.

Implementation

Implementing this new feature may significantly impact the package structure, as the main class, GenericModel, is entirely based on the use of SPE to create control charts. One approach could be to create a new class called GenericModelProb, which would be analogous to the existing GenericModel. This new class could generate control charts based on probabilities instead of SPE values.

BUG: incomplete `if` condition when initializing the `install_requires` variable

If the following if condition is not satisfied:

BibMon/setup.py

Lines 17 to 19 in 7f2202b

if os.path.isfile(requirements):
    with open(requirements) as f:
        install_requires = f.read().splitlines()

the install_requires variable is never initialized, which raises an exception at the line:

install_requires=install_requires

One possible fix is to add an else branch that initializes install_requires to a default value.
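
A minimal sketch of that fix, assuming an empty list is an acceptable default. The helper name read_requirements is hypothetical; in setup.py the same effect can be had by assigning the default before the if, which is equivalent to adding an else branch.

```python
import os


def read_requirements(path):
    """Return the lines of the requirements file, or [] if it does not exist."""
    # Assigning the default first keeps the name defined on every code path.
    install_requires = []
    if os.path.isfile(path):
        with open(path) as f:
            install_requires = f.read().splitlines()
    return install_requires


# The name is now always bound, so passing it to setup() cannot raise NameError.
install_requires = read_requirements("requirements.txt")
print(install_requires)
```

Note that a silent empty default hides a missing requirements file; if the dependencies are mandatory, raising a clear error instead may be preferable.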

Add support for more datasets

How we are today

Currently, BibMon includes two datasets packaged with it: the Tennessee Eastman Process dataset and an anonymized dataset containing real process data. These datasets can be easily imported from within the package.

Proposed enhancement

We propose expanding BibMon's applicability by testing it with additional process datasets. Below are some suggestions:

For more details on some of these datasets, please refer to the article: https://doi.org/10.1016/j.compchemeng.2022.107964.

Do you know of other datasets where BibMon could be applied? Please share them with us here!!

Implementation

A potential enhancement is to facilitate the integration of these new datasets for BibMon users. This can be achieved by either packaging the datasets within the distribution (if their size permits) or by providing mechanisms for users to download them while utilizing the library.
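
The download option could look roughly like the sketch below, which fetches a file once and reuses the local copy afterwards. The function name fetch_dataset and the cache layout are assumptions made for illustration; none of this is part of BibMon's current API.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_dataset(url, filename, cache_dir):
    """Download `url` into `cache_dir/filename` unless it is already cached."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename
    if not target.exists():
        # Write to a temporary name first so that an interrupted transfer
        # never leaves a truncated file that looks like a valid cache entry.
        partial = target.with_name(target.name + ".part")
        with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
        partial.replace(target)
    return target
```

A caller would pass the dataset URL and a per-user cache directory, then load the returned path with the usual tools. Checksums and licensing notices are left out of this sketch but would matter for a real implementation.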

By incorporating these additional datasets, we aim to broaden BibMon's testing scope and improve its robustness across various industrial applications.

Add new interpretability techniques

How we are today

To improve interaction and adoption by users and operators at process industries, it is important that models are interpretable. At present, the interpretability functionalities provided by BibMon are limited to the sklearnRegressor class and rely solely on feature importances.

Proposed enhancement

We propose the implementation of advanced interpretability techniques such as LIME (local interpretable model-agnostic explanations) (Ribeiro et al., 2016) and SHAP (Shapley additive explanations) (Lundberg and Lee, 2017).

Implementation

Ideally, these functionalities should be implemented in files such as _generic_model.py or _bibmon_tools.py. This approach will ensure that the new interpretability techniques are accessible for all models within the library.
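
For intuition about what a model-agnostic technique such as SHAP computes, the sketch below evaluates exact Shapley values by brute-force enumeration of feature coalitions. This is the underlying game-theoretic quantity, not the SHAP library or its approximations, and not LIME; the function name and the choice of replacing absent features by a baseline value are assumptions of this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions


def model(features):
    # Toy linear model standing in for any fitted predictor.
    return 2.0 * features[0] + 3.0 * features[1]


contributions = shapley_values(model, x=[1.0, 2.0], baseline=[0.0, 0.0])
print(contributions)
```

Because the function only needs a predict callable, the same routine applies to any model, which is the property that makes placing such tools in shared modules attractive.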

Add advanced reconstruction models

How we are today

Reconstruction models aim to represent normal process behavior by utilizing a mathematical structure that incorporates all relevant variables simultaneously. The objective is typically to create a compressed representation of the data that still captures the essential information, often involving dimensionality reduction methods. During the test phase, the data is compared to the identified structure to detect any deviations. For more details, please check MELO et al. (2024) [1].

Currently, the following reconstruction models are implemented in BibMon:

  • Principal Component Analysis (PCA);
  • Autoencoders (AE);
  • Similarity-based method (SBM).

Proposed enhancement

There is extensive literature on reconstruction methodologies for multivariate statistical process monitoring. Some examples include:

  • Advanced versions of PCA:
    • Multi-Way PCA (mPCA) [2], [3] for batch processes;
    • Dynamic-Inner PCA (DiPCA) [4], [5] or Recursive PCA (RPCA) [6] for dynamic processes;
    • Multiscale PCA (MSPCA) [7] for analyzing multiple time scales;
    • Kernel PCA (kPCA) [8] for nonlinear processes;
  • Partial Least Squares (PLS) [9], [10];
  • Canonical Correlation Analysis (CCA) [11], [10];
  • Fisher Discriminant Analysis (FDA) [12];
  • Independent Component Analysis (ICA) [13], [14];
  • Slow Feature Analysis (SFA) [15], [16].

For a comprehensive review and analysis of these techniques, please refer to MELO et al. (2024) [17].

Implementation

New reconstruction models can be implemented by inheriting from the GenericModel class. For more instructions, please refer to the contributing guide.

References

[1] Melo, A., Lemos, T. S., Soares, R. M., Spina, D., Clavijo, N., Campos, L. F. D. O., ... & Pinto, J. C. (2024). BibMon: An open source Python package for process monitoring, soft sensing, and fault diagnosis. Digital Chemical Engineering, 100182.

[2] Nomikos, P., & MacGregor, J. F. (1994). Monitoring batch processes using multiway principal component analysis. AIChE Journal, 40(8), 1361-1375.

[3] Rendall, R., Chiang, L. H., & Reis, M. S. (2019). Data-driven methods for batch data analysis–A critical overview and mapping on the complexity scale. Computers & Chemical Engineering, 124, 1-13.

[4] Dong, Y., & Qin, S. J. (2018). A novel dynamic PCA algorithm for dynamic data modeling and process monitoring. Journal of Process Control, 67, 1-11.

[5] Dong, Y., & Qin, S. J. (2018). Dynamic latent variable analytics for process operations and control. Computers & Chemical Engineering, 114, 69-80.

[6] Li, W., Yue, H. H., Valle-Cervantes, S., & Qin, S. J. (2000). Recursive PCA for adaptive process monitoring. Journal of Process Control, 10(5), 471-486.

[7] Bakshi, B. R. (1998). Multiscale PCA with application to multivariate statistical process monitoring. AIChE Journal, 44(7), 1596-1610.

[8] Lee, J. M., Yoo, C., Choi, S. W., Vanrolleghem, P. A., & Lee, I. B. (2004). Nonlinear process monitoring using kernel principal component analysis. Chemical Engineering Science, 59(1), 223-234.

[9] Rosipal, R., & Krämer, N. (2005, February). Overview and recent advances in partial least squares. In International Statistical and Optimization Perspectives Workshop "Subspace, Latent Structure and Feature Selection" (pp. 34-51). Berlin, Heidelberg: Springer Berlin Heidelberg.

[10] Zhang, K., Peng, K., Chu, R., & Dong, J. (2018). Implementing multivariate statistics-based process monitoring: A comparison of basic data modeling approaches. Neurocomputing, 290, 172-184.

[11] Yang, X., Liu, W., Liu, W., & Tao, D. (2019). A survey on canonical correlation analysis. IEEE Transactions on Knowledge and Data Engineering, 33(6), 2349-2368.

[12] Chiang, L. H., Russell, E. L., & Braatz, R. D. (2000). Fault diagnosis in chemical processes using Fisher discriminant analysis, discriminant partial least squares, and principal component analysis. Chemometrics and Intelligent Laboratory Systems, 50(2), 243-252.

[13] Lee, J. M., Yoo, C., & Lee, I. B. (2004). Statistical process monitoring with independent component analysis. Journal of Process Control, 14(5), 467-485.

[14] Palla, G. L. P., & Pani, A. K. (2023). Independent component analysis application for fault detection in process industries: Literature review and an application case study for fault detection in multiphase flow systems. Measurement, 209, 112504.

[15] Shang, C., Yang, F., Gao, X., Huang, X., Suykens, J. A., & Huang, D. (2015). Concurrent monitoring of operating condition deviations and process dynamics anomalies with slow feature analysis. AIChE Journal, 61(11), 3666-3682.

[16] Song, P., & Zhao, C. (2022). Slow down to go better: A survey on slow feature analysis. IEEE Transactions on Neural Networks and Learning Systems, 35(3), 3416-3436.

[17] Melo, A., Câmara, M. M., & Pinto, J. C. (2024). Data-Driven Process Monitoring and Fault Diagnosis: A Comprehensive Survey. Processes, 12(2), 251.

Add new alarm logics

How we are today

BibMon monitors a specific metric called Squared Prediction Error (SPE). Various alarm logics can be used to detect specific variations in this metric. Traditionally, an alarm is triggered whenever a new SPE value exceeds a predefined limit, identifying it as an outlier SPE point. To reduce false alarms, the alarm can instead be based on a count of outliers within a specified window size. These two functionalities are currently implemented in the detecOutlier function in the _alarms.py file.
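
The two logics described above can be sketched as follows. The function names and parameters are illustrative and do not reproduce the signature of the existing function in _alarms.py.

```python
def outlier_alarm(spe, limit):
    """Return 1 for every SPE value above the limit, else 0."""
    return [1 if value > limit else 0 for value in spe]


def windowed_count_alarm(spe, limit, window, min_count):
    """Alarm when at least `min_count` of the last `window` points are outliers."""
    flags = outlier_alarm(spe, limit)
    alarms = []
    for t in range(len(flags)):
        recent = flags[max(0, t - window + 1): t + 1]
        alarms.append(1 if sum(recent) >= min_count else 0)
    return alarms


spe = [0.5, 1.2, 0.4, 1.5, 1.7, 0.3]
point_alarms = outlier_alarm(spe, limit=1.0)
count_alarms = windowed_count_alarm(spe, limit=1.0, window=3, min_count=2)
print(point_alarms)
print(count_alarms)
```

The isolated outlier at the second sample triggers the point-wise alarm but not the windowed one, which only fires once two outliers fall inside the same three-sample window.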

Proposed enhancement

We propose adding new alarm logics to enhance BibMon's monitoring capabilities. These new logics can be implemented as functions in the _alarms.py file. For more details, please refer to the contributing guide.

Examples of new alarm logics

  1. Other types of deviations: Alarms that monitor specific types of deviations, such as drift or bias.
  2. Nelson Rules: Alarms inspired by Nelson rules, which are typically applied in univariate statistical process control. Some of these rules might be useful for scenarios where BibMon is applied.
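
As one possible starting point, the sketch below implements Nelson rule 2 (a run of nine or more consecutive points on the same side of the center line), which is also a simple way to catch a sustained bias. The function name is hypothetical, and which series it should be applied to (for example, model residuals rather than the non-negative SPE) is left open.

```python
def nelson_rule_2(values, center, run_length=9):
    """Flag each point that ends a run of `run_length` or more consecutive
    points lying on the same side of `center`."""
    alarms = []
    run = 0
    previous_side = 0
    for value in values:
        # +1 above the center line, -1 below it, 0 exactly on it.
        side = (value > center) - (value < center)
        if side != 0 and side == previous_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        previous_side = side
        alarms.append(1 if run >= run_length else 0)
    return alarms


# A shorter run length than the standard 9 keeps the example small.
series = [0.2, 0.1, 0.3, 0.4, -0.1, 0.2]
alarms = nelson_rule_2(series, center=0.0, run_length=3)
print(alarms)
```

Other Nelson rules follow the same pattern of scanning for a run or trend, so they could share the same return convention of one flag per sample.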

Add support for univariate monitoring

How we are today

BibMon is based on multivariate statistical process monitoring and machine learning, aiming to establish robust monitoring procedures by identifying relevant relationships among several variables.

Proposed new feature

An important branch of process monitoring deals with univariate methods, which focus on analyzing single variables independently. Examples include statistical quality control, where individual quality metrics or operational parameters are monitored to detect deviations from established control limits (MONTGOMERY, 2012), and monitoring trends or seasonality in univariate time series, which involves analyzing patterns over time to predict future behavior and identify anomalies (ALEXANDROV, 2012).

Implementation

A new BibMon module can be implemented to handle univariate time series. This module could support the creation of various types of control charts, such as Shewhart, CUSUM, and EWMA charts, and utilize univariate time series analysis tools like Facebook's Prophet.
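
To give an idea of what such a module might contain, here is a minimal EWMA control chart using the standard textbook formulas: the statistic z_t = lam * x_t + (1 - lam) * z_(t-1), with time-varying limits around the target. The function name, parameter names, and return format are assumptions of this sketch; the in-control mean (target) and standard deviation (sigma) are taken as known.

```python
from math import sqrt


def ewma_chart(values, target, sigma, lam=0.2, width=3.0):
    """Return one (ewma, lower, upper, alarm) tuple per observation."""
    z = target  # the EWMA statistic starts at the in-control mean
    results = []
    for t, x in enumerate(values, start=1):
        z = lam * x + (1.0 - lam) * z
        # Limits widen with t and approach width * sigma * sqrt(lam / (2 - lam)).
        half_width = width * sigma * sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        lower, upper = target - half_width, target + half_width
        results.append((z, lower, upper, not (lower <= z <= upper)))
    return results


chart = ewma_chart([10.0, 10.0, 14.0, 14.0, 14.0], target=10.0, sigma=1.0, lam=0.5)
alarms = [flag for *_, flag in chart]
print(alarms)
```

Smaller values of lam give more weight to history and make the chart more sensitive to small, persistent shifts, while lam = 1 reduces it to a Shewhart chart on the raw observations.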

Failure in the job for submission to the conda-forge repository

Hello @afraniomelo.

I submitted BibMon to the conda-forge repository (PR) so that the package can also be installed and managed with conda.

However, the build processes failed with the error NameError: name 'install_requires' is not defined, which I understand to be related to this statement in setup.py:

install_requires=install_requires

as you can check in the log of one of the jobs: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=942464&view=logs&j=6f142865-96c3-535c-b7ea-873d86b887bd&t=22b0682d-ab9e-55d7-9c79-49f3c3ba4823&l=724

Any idea what might be causing this error?
