Comments (15)

altendky commented on August 20, 2024

I'm not sure how any of that makes explicit inconsistency good, such as some non-strings having a length specified while others don't, or attribute order being usually consistent but sometimes not. Good file generation in a situation like this would mean producing the reference files with some consistency rather than deliberate inconsistency. Inconsistent files belong in a test suite explicitly designed to try to break implementations that make inappropriate assumptions.

I hope the JSON models end up well formatted and consistently ordered. The standards not prescribing meaning to order doesn't make it good to deliberately avoid a consistent order.

bobfox commented on August 20, 2024

In general, even though it has never been explicitly stated, length only needs to be specified for strings. I would support updating all models to follow that guideline if it is useful.

altendky commented on August 20, 2024

Improved consistency in several respects would be very useful: ordering, presence of attributes, and whatever else I've run into. Basically every time I try to touch the models with any automation I hit hassles and/or errors. Presumably the solution is an automated tool to process them, or at least to check them and require manual correction; I'd expect some generic tool to pretty up XML based on an XSD might already exist. Specifically regarding lengths, I'd think it much better to always include them than to condition their presence on the type being a string. Along with that would go a verification that type, length, and offset agree (consider #52).
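
A minimal sketch of that kind of check, assuming an SMDX layout of `<point>` elements carrying `id`, `offset`, `type`, and `len` attributes grouped under `<block>`; the attribute names and the per-type register widths below are assumptions for illustration, not definitions taken from this thread:

```python
# Hypothetical consistency lint for SMDX model files: verifies that any
# declared len agrees with the point's type and that offsets advance by
# the point lengths. Adjust element/attribute names if the schema differs.
import xml.etree.ElementTree as ET

# Assumed widths in 16-bit registers for the fixed-size types; "string"
# is the only type whose length legitimately varies per point.
FIXED_LEN = {
    "int16": 1, "uint16": 1, "acc16": 1, "enum16": 1, "bitfield16": 1,
    "sunssf": 1, "pad": 1,
    "int32": 2, "uint32": 2, "acc32": 2, "enum32": 2, "bitfield32": 2,
    "float32": 2, "ipaddr": 2,
    "int64": 4, "acc64": 4, "eui48": 4,
    "ipv6addr": 8,
}

def lint(path):
    problems = []
    for block in ET.parse(path).iter("block"):
        expected = 0
        for point in block.iter("point"):
            pid, ptype = point.get("id"), point.get("type")
            declared = point.get("len")
            if ptype == "string":
                if declared is None:
                    problems.append(f"{pid}: string point missing len")
                length = int(declared or 0)
            else:
                length = FIXED_LEN.get(ptype, 0)
                if declared is not None and int(declared) != length:
                    problems.append(
                        f"{pid}: len={declared} disagrees with type {ptype}")
            offset = int(point.get("offset", expected))
            if offset != expected:
                problems.append(f"{pid}: offset {offset}, expected {expected}")
            expected = offset + length
    return problems
```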

bobfox commented on August 20, 2024

In general, my desire would be to move away from the XML encoded definitions and toward the JSON versions, since the most important models going forward (7xx and others to come) are only going to be available in that form. That said, I am not sure how you are using the information you get from the model definitions, but it seems like most use cases need to take point type into account in the processing. The length field is redundant except in the case of strings, so the processing would naturally support the difference. I really don't have a strong preference at this point due to the migration to JSON.

As far as ordering of the attributes, the XML standard indicates that the order of element attributes is not significant, and you may get a different ordering depending on the tool or XML library you use. If I recall, the ordering you see here came from hand editing to promote readability, but it cannot be relied on once a file passes through one or more XML libraries.

altendky commented on August 20, 2024

Sure, XML order may not matter per the standard... until you actually try to compare one file to another. Then things being out of order make the comparison much more of a hassle. Likewise, having no consistency about when lengths are specified makes it difficult (as in different models being written differently, not string-vs-not-string).

We also support a CANbus interface. We have an interface-agnostic definition of our parameters and then define our interfaces in relation to them, so we are generating SunSpec models ourselves that match our interfaces (both standard models and our own). Our intent is to achieve actual equality of our models with the standard SunSpec ones, but while in development, and given the system we are using, I am checking our output XML files to confirm they match the SunSpec ones. Doing those diffs is a _lot_ more work due to the inconsistency (and errors) in these models. Sure, XML is legacy now and I haven't looked over the JSON formatted files, but these points apply equally to both. There must be automated means of checking as much correctness as possible and assuring consistency for both formats. Just having humans look it over won't cut it: more errors will slip through and the inconsistency will waste people's time.
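
One concrete way to get order-independent textual diffs, for anyone hitting the same hassle: Canonical XML (C14N) sorts attributes lexicographically, so two files that differ only in attribute order serialize to identical text. A minimal sketch using only the standard library (Python 3.8+); the filename is just an example:

```python
# Canonicalize both files, then run any ordinary text diff on the output;
# attribute order and insignificant whitespace no longer produce noise.
import xml.etree.ElementTree as ET

def canonical(path):
    return ET.canonicalize(from_file=path, strip_text=True)

# e.g. open("a.c14n", "w").write(canonical("smdx_00103.xml"))
```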

Having a length attribute specified only for strings is likely only doable with a customized XML generation process, and it requires special diffing tools. Even with strings the data is redundant (in at least many cases) with the difference in offsets, so there's redundancy regardless; may as well be complete and consistent. Unless, I guess, it's possible to document this conditional aspect of the length attribute in the XSD, and then combine that with actually verifying against the XSD.
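
For what it's worth, that conditional aspect can be expressed in XSD 1.1 (not 1.0) with an assertion. A sketch checked via the third-party xmlschema package; the element and attribute names are the same assumptions as above, and the schema is trimmed to just the relevant parts:

```python
# "len must be present iff type is string" as an XSD 1.1 assertion;
# XSD 1.0 cannot express this co-occurrence constraint.
import xmlschema

POINT_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="point">
    <xs:complexType>
      <xs:attribute name="id" type="xs:string" use="required"/>
      <xs:attribute name="type" type="xs:string" use="required"/>
      <xs:attribute name="offset" type="xs:integer"/>
      <xs:attribute name="len" type="xs:integer"/>
      <xs:assert test="if (@type = 'string')
                       then exists(@len) else not(exists(@len))"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = xmlschema.XMLSchema11(POINT_XSD)
schema.validate('<point id="Mn" type="string" len="16"/>')  # passes
# schema.validate('<point id="A" type="uint16" len="1"/>')  # raises
```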

When can we have Travis enabled so we have publicly visible checks, including on PRs? There's already a config file here; as far as I know it just needs to be turned on over on the Travis site. #36 (comment) Travis is losing favor, but it's fine for this, and having CI is a critical part of open development.

bobfox commented on August 20, 2024

I completely agree that we must have automated ways of generating and checking information model content. I would propose that something that understands the content do the equivalency checking and validation, rather than treating the models as equivalent strings of characters. With that approach, differences of syntax expressing the same content should not matter. We are working on this to some extent in the next version of pysunspec. Would it work for you if pysunspec could evaluate the equivalence of two model definitions from a content perspective and report the differences if there are any? We are currently working on checking model equivalency across all of the encodings: XML, JSON, CSV, Excel. We should be able to release an initial updated version shortly so that everyone can evaluate the approach and implementation.
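
Not pysunspec's actual API, but a sketch of the content-level comparison idea: reduce each encoding to the same normalized structure and compare that, so attribute order, key order, and whitespace drop out. The XML layout follows the assumptions above, and the JSON layout (a top-level "points" list) is invented for illustration:

```python
# Normalize two encodings of a model into {point id: attributes} and
# report the ids whose definitions differ. A real implementation would
# also normalize value types and renamed attributes (e.g. len vs size).
import json
import xml.etree.ElementTree as ET

def points_from_xml(path):
    return {p.get("id"): dict(p.attrib) for p in ET.parse(path).iter("point")}

def points_from_json(path):
    with open(path) as f:
        model = json.load(f)
    return {p["id"]: p for p in model.get("points", [])}

def content_diff(a, b):
    for pid in sorted(set(a) | set(b)):
        if a.get(pid) != b.get(pid):
            yield pid, a.get(pid), b.get(pid)
```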

altendky commented on August 20, 2024

That would be nice, yes. Still, abandoning age-old textual diffing, when it could be kept usable simply by a consistent order of attributes, doesn't seem like a good choice. You are aware that it's not simply 'strings have length and others do not', right? Model 126, for example, has length specified for all points. Also, I'm not asking you to do the work: I can't promise any given date, but in general I'm willing to do the fixes, develop related tools, etc.

It's good (in some cases) to be lenient in what you can handle; it's also good not to leverage that flexibility intentionally. That is, a SunSpec client should be able to load a model even if the length of a point isn't specified, but something generating the models ought to be consistent across models, runs, etc.

andig commented on August 20, 2024

I would suggest having a single master only and creating the different representations automatically. JSON vs XML really does not matter. It would make sense to generate JSON from this repo and carefully analyze the first commit before committing back here, if JSON becomes the master.
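
A sketch of that generation step, under the same assumptions as earlier (SMDX attribute names; the output key layout is invented): keeping json.dump deterministic with sort_keys and a fixed indent is what makes the generated files stable across runs, and therefore diffable.

```python
# Derive a JSON representation from the XML master so that only one
# encoding is ever edited by hand.
import json
import xml.etree.ElementTree as ET

def xml_to_json(xml_path, json_path):
    # Assumes either a <model> root or a wrapper root with one <model> child.
    root = ET.parse(xml_path).getroot()
    model = root if root.tag == "model" else root.find("model")
    doc = {
        "id": model.get("id"),
        "points": [dict(p.attrib) for p in model.iter("point")],
    }
    with open(json_path, "w") as f:
        json.dump(doc, f, indent=2, sort_keys=True)
```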

altendky commented on August 20, 2024

Even the original should be run through something that enforces consistency (be it 'always len', 'len only if string', the order of attributes, etc.), just as automatic formatters are used for code.

bobfox commented on August 20, 2024

For completeness, here is the current SunSpec strategy for model development and support:

  • In general, human viewing and editing of models is done in a spreadsheet (generic, or Excel compatible if you want a little extra style).
  • Programmatic processing is performed moving forward with the JSON encoded definitions. Existing model definitions are supported in XML, but the intention is to move forward with JSON for model processing.
  • pysunspec and SunSpec Dashboard will provide a way to perform a semantic diff between any two model definitions, in the same or different encodings (XML, JSON, CSV, Excel).
    SunSpec is not currently planning to explicitly support a string diff between models due to the constraints associated with XML and JSON: neither guarantees the ordering of attributes in elements or of keys in objects, respectively. This strategy does make text diffs in GitHub and other places less useful, but it seems like a reasonable trade-off compared to adding measures to constrain the form of the XML and JSON encodings, which would preclude using standard libraries for producing the output.

Just wanted to lay out the complete context, as I think it is all relevant to the discussion.

bobfox commented on August 20, 2024

I agree with you completely about the XML inconsistency and do not want to repeat that. It is my fault, mostly due to hand editing to (ironically) keep the current attribute ordering for readability; I did not do a good job checking for consistency. I would propose that we use the updated pysunspec library to generate the JSON based information models in a standardized way using the standard Python json library. It turns out the JSON encoded information models are not really very readable anyway due to their length.

bobfox commented on August 20, 2024

One last point I forgot to mention: the updated modeling specification states that the 'len' attribute (now called 'size') is only valid for the string type and must not be specified for other types, so that should help with this item.

altendky commented on August 20, 2024

That roughly means you've got a union of two types of points, I think, which is not a particularly well supported concept in [de]serialization.

So if you agree that consistency is good and we don't have it, what would be bad about a PR being written and accepted to improve it? (Presumably written by me, hopefully with a program to apply the consistency.) The standards not giving meaning to order seems like it should itself make this a backwards-compatible modification.

bobfox commented on August 20, 2024

I think there is another way of conceptualizing it: a point definition is used to define different types of points, where some point type definitions have no value for one or more of the point definition attributes. The concept of [de]serializing an object where some instances may have no value for a given property is pretty well supported. Another factor for me is the idea of providing a value in a definition for an attribute that is already known from the type; that information is redundant and an opportunity for introducing errors. In this case the point definition construct is a way to support different type definitions (currently 21) in a single unified construct.

altendky commented on August 20, 2024

But how do you describe the acceptable sets of attributes in existing [de]serialization tools? Sure, you can make b, e, and f each optional, but how do you make it such that either a, b, and e are required and d and f must be missing, _or_ a, d, and f are required and b and e must be missing? Not conceptually, but in existing [de]serialization tools.
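
For the record, JSON Schema's oneOf can express exactly this kind of mutually exclusive attribute set; whether a given [de]serialization framework can is another matter. A sketch using the a, b, d, e, f names from the question and the third-party jsonschema package:

```python
# Either {a, b, e} are present and {d, f} absent, or {a, d, f} are
# present and {b, e} absent; anything else fails validation.
import jsonschema

POINT_SCHEMA = {
    "oneOf": [
        {
            "required": ["a", "b", "e"],
            "not": {"anyOf": [{"required": ["d"]}, {"required": ["f"]}]},
        },
        {
            "required": ["a", "d", "f"],
            "not": {"anyOf": [{"required": ["b"]}, {"required": ["e"]}]},
        },
    ]
}

jsonschema.validate({"a": 1, "b": 2, "e": 3}, POINT_SCHEMA)  # passes
# jsonschema.validate({"a": 1, "b": 2, "f": 3}, POINT_SCHEMA)  # raises
```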

Anyways, even within this hard-to-describe-in-existing-schemas situation (sometimes that is the best choice, sure), the existing models are inconsistent. Do they really have to stay that way? Some models only specify length for strings; others specify it for other types as well. Presumably the length itself doesn't matter that much, and this discussion is a proxy for overall consistency: programmatic generation with useful conventions (sorted attribute output, etc.), programmatic verification (schemas), and automatic execution of that verification (Travis, GitHub Actions, etc.).
