
Smart Meter Project

Several tasks related to the smart meter project, including:

  • access data from a remote database
  • visualize load-statistics data as figures
  • train an anomaly detection model for each resident
  • show the reconstruction error between the model output and the input data

Directory structure

(directory-tree figure)

Dependency

| Package     | Version |
|-------------|---------|
| python      | 3.9.7   |
| pytorch     | 1.10.2  |
| cudatoolkit | 11.3.1  |
| numpy       | 1.21.2  |
| pandas      | 1.4.1   |
| matplotlib  | 3.5.1   |
| tensorboard | 2.8.0   |
| requests    | 2.27.1  |
| xlrd        | 2.0.1   |

If you are using conda to manage packages on Linux, run

source check.sh

to check the currently installed dependency packages and their versions.
Or run

conda list | findstr /rc:'^python ' /rc:'pytorch ' /rc:'cudatoolkit' /rc:'numpy ' /rc:'pandas' /rc:'matplotlib ' /rc:'tensorboard ' /rc:'requests ' /rc:'xlrd'

in the Windows cmd.
If a package does not show up in the returned list, it has not been installed yet, or it is in the base environment rather than the current one.

Tasks

There are several functions in main.py; each function represents a corresponding task.
To run a certain task, uncomment the related line in the `if __name__ == '__main__':` block, comment out the other functions, then run main.py by

python main.py
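The selection pattern above can be sketched as follows. The task names come from this README, but the function bodies here are stand-ins, not the project's real implementations:

```python
# Sketch of the task-selection pattern in main.py: exactly one task
# function is left uncommented before running `python main.py`.
# Function names come from this README; the bodies are stand-ins.

def access_api_data():
    return "access data from the remote database"

def plot_load_analysis():
    return "visualize load statistics"

def train_model():
    return "train the reconstruction model"

if __name__ == '__main__':
    # Uncomment exactly one task and comment out the others:
    print(access_api_data())
    # print(plot_load_analysis())
    # print(train_model())
```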
  • access data
    Using debug or interactive mode is better here, since the update may need to be rerun several times when the network is unstable.

    1. use ssh to connect to the specified host
    2. run openproxyserver.bat to open a tunnel as a proxy server
    3. make sure 110resident.xls and connection.json are in the access folder
    4. run the access_api_data function in main.py
      The expected result looks like: (access-result figure)
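Since the network can drop mid-run, a small retry wrapper around each fetch is a natural fit here. This is a minimal sketch, not the project's actual access code; `fetch` stands in for whatever call hits the remote API (e.g. a `requests.get` through the proxy tunnel):

```python
import time

def fetch_with_retry(fetch, retries=3, delay=1.0):
    """Call `fetch()` and retry on failure, for unstable networks.

    `fetch` is any zero-argument callable, e.g. a lambda wrapping
    requests.get(url, proxies=...) through the opened tunnel.
    """
    last_err = None
    for _attempt in range(retries):
        try:
            return fetch()
        except Exception as err:  # network errors, timeouts, ...
            last_err = err
            time.sleep(delay)
    raise last_err

# Usage with a flaky callable that fails twice, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("tunnel dropped")
    return "ok"

print(fetch_with_retry(flaky, retries=5, delay=0.01))  # prints "ok"
```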
  • visualize load-statistics data

    1. make sure data_xxx.csv and info.csv are in the dataset/TMbase folder
    2. adjust the parameters of the plot_load_analysis function in the main script
    3. run main.py
      Example result: the 2021 figure produced with the --show ymwh flag (2021_ymwh figure)
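A stripped-down version of such a per-month load aggregation and plot could look like this. The `time`/`load` column names and the synthetic data are assumptions; the real inputs are the dataset/TMbase CSV files:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical minimal load table; the real files are dataset/TMbase/data_xxx.csv.
df = pd.DataFrame({
    "time": pd.date_range("2021-01-01", periods=90, freq="D"),
    "load": range(90),
})

# Aggregate daily load into monthly means (y/m/w/h views would aggregate
# by year, month, week, or hour in the same way).
monthly = df.groupby(df["time"].dt.to_period("M"))["load"].mean()

fig, ax = plt.subplots()
monthly.plot(ax=ax, marker="o")
ax.set_xlabel("month")
ax.set_ylabel("mean load")
fig.savefig("monthly_load.png")
```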
  • train reconstruction model
    We modified the nbeats model to output the backcast, fitting the reconstruction-based anomaly detection task. We also use a CNN structure with a down-sampling factor, instead of backcast_length and forecast_length, to control the relation between the input and its representation. The original fully-connected nbeats model therefore may not work well and should be considered deprecated.

    1. make sure data_xxx.csv files are in the dataset/TMbase folder
    2. adjust the parameters of the train_model function in the main script
      make_argv returns a 2-layer list: the outer layer has one entry per resident from cond1, and the inner layer holds that resident's arguments for training the model.
      • choose the GPU device index
      • logging context_visualization may consume a large amount of space
    3. run main.py
    • Information from the training process is logged in exp/{expname}/log/{name}.csv
    • the model weights are saved in exp/{expname}/model/{name}.mdl
    • To visualize the logged information, run tensorboard by
      tensorboard --logdir=exp/{expname}/run [--port]
      
      {expname} needs to be replaced with the experiment name; specify a port number if the default port conflicts.
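The 2-layer list shape of make_argv can be sketched as below. The resident IDs and argument names here are assumptions for illustration, not the real cond1 contents:

```python
# Sketch of a 2-layer argument list: the outer list has one entry per
# resident; each inner list is that resident's training arguments.
# IDs and flag names are hypothetical.

def make_argv(residents):
    return [["--name", rid, "--device", "0"] for rid in residents]

def parse_args(argv):
    # stand-in for the real train_model argument handling in main.py
    return dict(zip(argv[::2], argv[1::2]))

argvs = make_argv(["R001", "R002"])
for argv in argvs:
    cfg = parse_args(argv)
    print(cfg["--name"], cfg["--device"])
```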
  • detection

    1. make sure data_xxx.csv files are in the dataset/TMbase folder and the model weights are under exp/{expname}/model
    2. adjust the parameters for each task:
      • to compute the anomaly ratio for a given threshold list, run the detect_compute_ratio function
      • to apply other residents' data to each model, run the detect_apply_on_other_data function
      • to show the reconstruction result and error, run the detect_user_period function
      • the output_place parameter of the detect_compute_ratio and detect_apply_on_other_data functions controls where figures go:
        None outputs directly to a window or the editor's plot panel; a string {str} logs them to tensorboard under the path runs/detect/{str}.
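At its core, the anomaly-ratio computation counts how many per-sample reconstruction errors exceed each threshold. A minimal numpy sketch (the error metric and threshold values are assumptions, not the project's exact choices):

```python
import numpy as np

def anomaly_ratio(x, recon, thresholds):
    """Fraction of samples whose reconstruction error exceeds each threshold."""
    err = np.abs(x - recon).mean(axis=1)   # per-sample mean absolute error
    return {t: float((err > t).mean()) for t in thresholds}

# Three samples: two reconstructed well, one badly (the anomaly candidate).
x = np.array([[1.0, 2.0], [1.0, 2.0], [5.0, 5.0]])
recon = np.array([[1.1, 2.1], [1.0, 2.0], [1.0, 1.0]])
ratios = anomaly_ratio(x, recon, thresholds=[0.05, 1.0])
print(ratios)
```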
  • other minor tasks

    • plot_user_data
    • plot_model_basis
    • plot_model_result

References

[1] nbeats paper
[2] nbeats source code
[3] IHEPC dataset

Issues

[Improvement] hard code to json

Since we have not yet run into data changes or deployment in other communities,
api.py does separate out important data such as IPs and resident information,
but some hard code related to the address-conversion rules remains.

Storing and loading these conversion rules from a JSON file as well
would be better for future changes in usage scenarios and for privacy protection.
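Moving the conversion rules out of the code could look like the sketch below. The rule format and the `district_map` key are hypothetical; the real rules live in api.py:

```python
import json

# Hypothetical contents of a rules.json file; the actual
# address-conversion rules are currently hard-coded in api.py.
RULES_JSON = '{"district_map": {"North District": "N", "South District": "S"}}'

def load_rules(text):
    """Parse conversion rules from JSON instead of hard code."""
    return json.loads(text)

def convert_district(name, rules):
    """Map a district name via the rules, falling back to the input."""
    return rules["district_map"].get(name, name)

rules = load_rules(RULES_JSON)
print(convert_district("North District", rules))  # prints "N"
```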

[Future works] some ideas

Some advisor feedback and unfinished ideas:

  • share basis (suggested by 邱老師)
    Part of the model's basis would be shared by all users;
    for example, the first stack is shared by everyone, while the second stack is trained only on each resident's own data.
  • combinations of different down-sampling rates / bases with different periods
    Decompose short-period data trends from long-period data trends:
    for example, some bases use 3-4 hours as a unit, while others use one day or even one week.
    The feature extractor part can be designed to match:
    a CNN with a 3-hour receptive field for the 3-hour bases, and dilated convolutions for the one-week bases.
  • variable-length training data / more future steps
    This should have little effect on the extractor part;
    the idea is that if the prediction module's input comes in multiple lengths, the model might learn more.
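For the down-sampling idea above, the receptive field each basis period requires can be checked with simple arithmetic. A sketch assuming 1-hour sampling and stride-1 convolutions (the layer configurations are illustrative, not a proposed architecture):

```python
def receptive_field(layers):
    """Receptive field of stacked stride-1 1-D convolutions.

    `layers` is a list of (kernel_size, dilation) pairs; each layer
    grows the receptive field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

# 3-hour basis: a single kernel-3, dilation-1 conv spans 3 hourly samples.
print(receptive_field([(3, 1)]))  # prints 3

# 1-week basis (168 hours): a dilated stack of kernel-3 convs,
# dilations 1, 3, 9, 27, 44 -> 1 + 2*(1+3+9+27+44) = 169 hours.
print(receptive_field([(3, 1), (3, 3), (3, 9), (3, 27), (3, 44)]))  # prints 169
```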
