Git Product home page Git Product logo

sensorqualityevaluator's Introduction

SensorQualityEvaluator

Problem Statement

Process the log file and automate the quality control evaluation based on pre-defined criteria in a controlled environment


Design Changes made in consideration of Extendablity, Maintainablity and Distributed Modeling

  • The interface method name changed from EvaluateLogFile -> EvaluateLogFileContents
  • The following changes made to the format of the log file
reference temp:70.0 hum:45.0 mon:6
temp temp-1 2007-04-05T22:00 72.4
temp temp-1 2007-04-05T22:01 76.0
temp temp-1 2007-04-05T22:02 79.1
temp temp-1 2007-04-05T22:03 75.6
hum hum-1 2007-04-05T22:05 45.3
hum hum-1 2007-04-05T22:06 45.1
temp temp-1 2007-04-05T22:04 71.2
temp temp-2 2007-04-05T22:02 70.1
temp temp-2 2007-04-05T22:03 71.3
temp temp-1 2007-04-05T22:05 71.4
temp temp-1 2007-04-05T22:06 69.2
temp temp-1 2007-04-05T22:07 65.2
mon mon-2 2007-04-05T22:07 8
temp temp-1 2007-04-05T22:08 62.8
temp temp-1 2007-04-05T22:09 61.4
mon mon-2 2007-04-05T22:08 6
temp temp-1 2007-04-05T22:10 64.0
temp temp-1 2007-04-05T22:11 67.5
mon mon-1 2007-04-05T22:05 7
mon mon-1 2007-04-05T22:06 9
mon mon-2 2007-04-05T22:04 2
temp temp-1 2007-04-05T22:12 69.4
temp temp-2 2007-04-05T22:01 69.5
temp temp-2 2007-04-05T22:05 69.8
hum hum-1 2007-04-05T22:04 45.2
hum hum-2 2007-04-05T22:04 44.4
hum hum-2 2007-04-05T22:07 43.8
hum hum-2 2007-04-05T22:08 42.1
mon mon-1 2007-04-05T22:04 5
mon mon-2 2007-04-05T22:05 4
mon mon-2 2007-04-05T22:06 10
temp temp-2 2007-04-05T22:04 71.5
hum hum-2 2007-04-05T22:05 43.9
hum hum-2 2007-04-05T22:06 44.9
  1. Each line has complete information about the log it's describing and information required for processing a single log statement is not scattered in different parts of the file
  2. Since each line has complete information, the order does not matter. Log streaming from multiple distributed systems can easily flow into 1 place.
  3. Reference line changed to specify which reference value belongs to which sensor type [sensorType: refValue]. For eg:
    reference temp:70.0 hum:45.0 mon:6 smoke:10.2 air_quality:5.1
    
    • Adding a brand new sensorType: refvalue is pretty intutive and straight forward.
    • Less error in sensorType reference values being represented just by the virtue of order. In the older way, change of order of reference values acn be disastrous.
  4. Don't have to worry about any packet losses or missing log lines to cause completely wrong evaluations.
    • In the sample input provided in the problem statement, if one of the lines [<type> > <name>] goes missing, the entire evaluation for the sensors would go wrong

IMP:

Since there is no importance given to order, the output printed would have the correct JSON format and evalution results but WOULD NOT follow any particular order


3rd Party Components Used (Why reinvent the wheel? :) )

  • MathNet.Numerics nuget package (https://numerics.mathdotnet.com/)
  • Extremely fast processing, and better memory performance string extension SplitLines()
    • The inbuilt string.Split() method uses pattern matching using regex which makes is slow for the specific scenario we have here where we just want to split into separate lines (pre-known separator '\r' '\n'). string.Split() also leads to a lot of allocations when the result is used in a foreach loop. Using the new ReadOnlySpan, we can avoid allocations and make the operation much faster by providing a enumerator.
      Ref: https://www.meziantou.net/split-a-string-into-lines-without-allocation.htm
    • The enumeration on the result of SplitLines() is a ReadOnlySpan<char> which can be obtained by lazy evaluation as per need. So no pre-allocation of memory.

Assumptions:

  • The first line of the logContents is always the reference line
  • There is only 1 reference line per sample input
  • During calculations 'within' is inclusive of the threshold value

Areas of improvement

  • The threshold values for fault tolerance, standard deviation etc. can be moved to a config file and read from there instaed of hardcoded values
  • The performance for evaluating the "Humidistats" and "CO Detectors" can be further improved. Right now all the readings of a particular humidistat or CO detector are being read into memory and then evaluated. But that can be avoided by doing inline evalution for those readings as they are being read and stop processing the readings as soon as a deviation from threshold is encountered.
  • More Validations and Error Checking.

sensorqualityevaluator's People

Contributors

rdasan avatar rejik2 avatar

Stargazers

 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.