Git Product home page Git Product logo

Comments (65)

dcuccia avatar dcuccia commented on June 1, 2024 1

Hi friends. Did a little thinking and have two (hopefully constructive) pieces of feedback on this ticket:

  1. Per our live conversation on naming, I thought a little bit about the naming and namespace location of BinaryArraySerializerFactory. I think it's appropriate. It does what it says - it has one static factory method that creates BinaryArraySerializer(s). And in terms of discovery it's probably in the right place. See this justification, for example.

The alternative would be to put the CreateSerializer implementation into the BinaryArraySerializer class directly, which would be ok with me (but see caveat below). Many classes have factory creation methods. To lead developers into "the pit of success", they also make the class constructor private, so there's "only one way" to create the class (to minimize confusion/maximize correctness). If we did that, we'd probably want to clean up (remove) the public "setter" components of the BinaryArraySerializer, so that there's truly only one way they are instantiated - through the factory method - and mutation (source of bugs) is not allowed.

Caveat - as I've leaned into more "functional programming" modalities, I've tended to see the benefit of the segregation of constructed classes from the "helper" methods that allow their creation/manipulation. Think LINQ - all those Select, Where operators etc are defined in separate classes from the collection classes that "hold" the data.

Another note: I see that the Obsolete attribute warning on the BinaryArraySerializer.DataArray property member is not present on this new feature branch. I think it's important to remove (and remove assignments in all of the implementations), because it incorrectly exposes/guides improper use of the functionality, and is never used/needed anyway.

Lmk what y'all think - I can make any of the above changes if we want.

  1. I think I figured out a relatively safe/extensible way to flatten/remove all the various "for loops" in the factory (487 lines of code down to 107). See my Gist here:

https://gist.github.com/dcuccia/552c52716211cfecaf48dd84fe254f8b

It doesn't change the surface area of the public API nothing in the detector code would change. Could make the factory method consolidation to BinaryArraySerializer a little more palatable, if we were going that direction. I haven't done much testing, but this should be a "good performance" version without any boxing or copying. Let me know what you think.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024 1

Great points Lisa. At Modulim, we go as far as to have separate libraries, e.g. Vts.Test.Unit and Vts.Test.Integration. We run the Unit tests in CI for every branch commit, and run the Integration tests once a (non-draft) PR is created.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024 1

It does. The other way to do it is to flag tests/test classes as integration or unit or e2e, and use a command line switch to run one or the other. We have some very slow running tests that depend on an outside service that we only run before we build a production release candidate, and we use flags for those.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024 1

Right - that's always been the case. The test, in that case, should just confirm the expectation - that GetBinarySerializers() returns an empty, non-null array.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024 1

Thanks for your feedback! I will continue along adding unit tests then.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024 1

I'm half way there with the unit tests! I see that there have been multiple topics brought up in this issue. I have created a list of the order I plan to proceed (this is up for discussion):

  1. complete unit tests of all detectors GetBinarySerializers (all in one file)
  2. update single value detectors to use empty array instead of null and write corresponding GetBinarySerializers unit tests (I tried this today and had trouble, may need help)
  3. write unit test for BinaryArraySerializerFactory (again will gladly welcome any help)
  4. separate detector unit tests into separate files
    ---- at this point merge in branch (option: could merge after 3)
  5. analyze PopulateFromEnumerable and possibly update application to be column-major
  6. reorganize Vts.Test to Vts.Test.Unit, Vts.Test.Integration
    I think the last two are separate topics are beyond this issue and should be solved on other issue branches.
    Let me know if I omitted anything. Just trying to lay out steps for myself.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024 1

Sounds good! :)

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024 1

@lmalenfant that would be super great!

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024 1

I have finished updated all 72 detectors with the new GetBinarySerializers code and written unit tests for them all (included testing what results if TallySecondMoment is set to true and false using neat nunit [TestCase(true)] and [TestCase(false)] attributes that Lisa used in her unit tests for BinaryArraySerializerFactoryTests). These tests are currently in a single file. I plan to break each test into a separate file named by the detector it is testing. For example, ROfRhoDetectorTests.cs. The GetBinarySerializers method will be the only test for ROfRhoDetector in this file, however in the future we can add other unit tests to this file for the other methods in ROfRhoDetector class.

I plan to create a folder under Vts.Test named "Unit". Then in that folder, a subfolder named "MonteCarlo", then a subfolder in that named "Detectors". Then when I separate the unit tests into the separate files, I will put them in this "Detectors" folder and give the tests the namespace Vts.Test.Unit.MonteCarlo.Detectors.

Let me know if you have any comments or questions about this plan.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

We should agree on a design for this up front before implementing - IMO a good reason to have discussions that get promoted to issues.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

I agree but I already started a POC and needed to commit the code somewhere so I created a branch where I can try things out. This will probably take some time and I really need to try things out in code before coming up with a design. If you prefer I work on a fork, I can do that.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Fork could work, or, middle of the road. Could be to rename this ticket to something like "spike - experiment with decoupling detector serialization".

In my opinion, there's more refactoring to consider beyond serialization for detectors, that could make it easier to implement with a lower touch/surface area. This larger piece could interact with our serialization strategy.

I may have a little bit of time opening up to complete the initial python code and work on this with you on a spike branch where we intend to throw away the code and implement the final design we come up with.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

@dcuccia thank you for thinking this through! Let me think about your suggestions and I will also take a look at your Gist when I have more time.

Another note: I see that the Obsolete attribute warning on the BinaryArraySerializer.DataArray property member is not present on this new feature branch. I think it's important to remove (and remove assignments in all of the implementations), because it incorrectly exposes/guides improper use of the functionality, and is never used/needed anyway.

Just a quick note on this, we didn't forget it, @hayakawa16 has made a note to add it back in once all the work on the detectors is complete because it is not obsolete until then and it will avoid many warnings on the unchanged detectors while the code is in progress. Good eye noticing that though :)

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Cool, thanks on both accounts!

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I added the Obsolete attribute to DataArray in BinaryArraySerializer and all unit tests run fine. I think that means I have updated all detectors?
So outstanding items on this branch is potential modifications to BinaryArraySerializerFactory and possibly modifying the AllDetectorTests to use each detector class to write and read binary so that other binary (besides Mean and SecondMoment) get tested too (I think this was Lisa's suggestion?). I can work on latter.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Thanks Carole. I think there are still 53 references to BinaryArraySerializer.DataArray throughout the solution. Can we remove these obsolete assignments? (they're not used). For tests, should we create an integration test class for each named detector, to test the specific serialization expectations of that particular detector (and possibly share any boilerplate code needed between those tests)?

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Hi, I did a Find on BinaryArraySerializer.DataArray and selected "Entire Solution" and the only one that is shown is in Vts.xml. How are you searching?

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Are you looking at the feature/112-reconfigure-the-detectors-to-use-global-code-for-the-binary-serializers_2 (note "_2" at end) branch? This is the new branch Lisa and I made that picked out only those changes that support your implementation (other trial code was omitted).

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

I was looking at _2. Is this not the right branch?

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Yes! How did you find those 53 references?

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

VS said there were 53 usages. On my way home and will try to repro there.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Thanks! Also have you pulled the latest? I've been busy updating detectors this week.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Yes, you have! <3

After rebuilding, it says it's actually 49 references:

image

Here's all the examples. These lines can be removed:

image

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Oh, that's weird. Might be a Visual Studio cache issue...sorry stand by.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Wait, now I'm seeing them. Okay, let me work on those detectors. I kind of feel like I modified these detectors already. Well oh well. Deja vu all over again!

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

I cloned fresh and there were only 23, with a couple on the tests side:

image

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Thanks for the helpful screen shot. I modified all of those files. The only place string "DataArray" shows up now is in Vts.xml and [Obsolete] DataArray in BinaryArraySerializer and BinaryArraySerializerFactory.
Unit tests pass and I pushed my fixes.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

This conversation reminded me that when I was working with the serialization code, it doesn't appear that Dimensions in BinaryArraySerializer is used either. @dcuccia & @hayakawa16 Could you you take a look and see if that is correct or if it is being used elsewhere?

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I found no references to BinaryArraySerializer.Dimensions. Shall I add Obsolete attribute?

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Good call - that is no longer needed, and was new as of this effort. Please delete it.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

@dcuccia it was not new, it's existing, I noticed it wasn't being used though when I first started the work on the detectors. We should probably put the obsolete attribute on it just in case it was being used externally and then remove it when we remove DataArray.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

My mistake. I think I'd added a GetDimensions method and removed it, which is what had confused me. Sounds good.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I pushed the Obsolete attribute added to Dimensions modification.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I've been thinking about what you said concerning unit tests for this new code: For tests, should we create an integration test class for each named detector, to test the specific serialization expectations of that particular detector (and possibly share any boilerplate code needed between those tests)?

There is existing unit test code called AllDetectorTests. It gets a list of all detectorInputs from the sample input files (from geninfiles). It looks like this:

public void Verify_detector_classes()
{
         // use factory to instantiate detector with CreateDetector and call Initialize
         var tissue = new MultiLayerTissue();
         var rng = new MersenneTwister(0);
         foreach (var detectorInput in _detectorInputs)
         {
                // turn on TallySecondMoment so can flow through that code
                ((dynamic)detectorInput).TallySecondMoment = true;

                // factory generates IDetector using CreateDetector,
                // then calls detector.Initialize method 
                var detector = DetectorFactory.GetDetector(detectorInput, tissue, rng);
                // DetectorIO methods call detector.GetBinarySerializers.WriteData and ReadData
                // check that can write detector
                DetectorIO.WriteDetectorToFile(detector, "");
                // check that can read detector 
                var readDetector = DetectorIO.ReadDetectorFromFile(
                    detector.Name, "");
                Assert.IsInstanceOf<IDetector>(readDetector);
         }
}

I was thinking of extending this test to have each detector GetBinarySerializers tell me what arrays need to be written and then read back in for each detector. This is not what you were suggesting though.

When you say "integration" do you mean testing after running a simulation? If so, is this needed? And I think you are suggesting a separate unit test for each detector rather than this all-in-one test above? I will think on this.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Hi Carole - great thoughts. I have a wall of work this afternoon, but will try to respond tonight or tomorrow morning with some explanation and sample code.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Thanks for your help! Anytime is good.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

My first thought is that there's a lot being tested all at once in these tests. Even if we were just testing a single detector, we'd be exercising:

  1. DetectorFactory.GetDetector
  2. DetectorIO.WriteDetectorToFile
  3. DetectorIO.ReadDetectorFromFile
  4. Tissues and RNGs that aren't relevant for the test

...and all the infrastructure beneath those items. Some would call this type of integration test an "end-to-end" test that tries to exercise the full "stack" of dependencies. And the only thing we confirm at the end is that the detector is not null.

The area of concern we are thinking to add, conversely, is to confirm that the specific properties of the detector (and new ones will have new properties...) are written and read correctly. Only the specific detector "knows" about this, so the specific detector code should probably have "it's own" corresponding test. There are two parts even to testing this smaller component:

a) if we want to test that the BinarySerializer code works along with the associated file IO, then we'd populate known numbers in a known/correct order (but they can be fake - don't need to be accurate from a physics perspective), create a BinarySerializer instance with a BinaryWriter and BinaryReader that are writing/reading to/from file, call the write action, and then read them back with the read action, and confirm the numbers are correct. This would be an "integration" test, because it's testing both the BinarySerializer code, and it's dependency, the file IO functions.

b) if we want to test that the BinarySerializer code works, independent of the file IO, then we'd populate known numbers in a known/correct order, create a BinarySerializer instance with a BinaryWriter and BinaryReader that are writing/reading to/from a "virtual" store like a MemoryStream, call the write action, and then read them back with the read action, and confirm the numbers are correct. This would be a "unit" test, because it's testing that functionality in isolation from any of its dependencies.

Generally, unit tests can be faster and more resilient, because they're testing only a single "layer" of the codebase. And, they often seek to minimize dependencies on IO or async operations where there can be "side effects" and other system-dependent reasons why the code may fail. But on the other hand, integration tests can "make sure" all the pieces work together. Usually, having a combination of them can be helpful - unit tests can give good coverage for each feature, and be used to define the expected behavior of that unit (the best documentation, IMO), and integration tests can be a good reality check. Maybe more computationally expensive (slower to run), and maybe more difficult to debug (because errors can happen anywhere in the stack), but they also build confidence of the entire system.

So, in terms of suggestion, perhapse we try to complement this nice end-to-end test with more isolated unit and integration tests, to verify individual detectors' expectations? I will write a couple examples and post them here.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

At the lower level, here are two examples of how we could unit test each of the accepted BinaryArraySerializer input types:

[TestFixture]
public class BinaryArraySerializerTests
{
    [Test]
    public void Verify_BinaryArraySerializer_roundtrip_Complex1D()
    {
        // create a complex array and a serializer for it
        var dataArray = new Complex[] { (Complex)0, (Complex)1 , (Complex)2 };
        var serializer = BinaryArraySerializerFactory.GetSerializer(dataArray, "testc1d", "_testc1d");

        // write the array to a virtual stream
        using var stream = new MemoryStream();
        using var writer = new BinaryWriter(stream);
        serializer.WriteData(writer);
        
        // zero out the detector values
        Array.Clear(dataArray);
        
        // reset the stream to the beginning
        stream.Seek(0, SeekOrigin.Begin);

        // read the values back in
        var reader = new BinaryReader(stream);
        serializer.ReadData(reader);

        // check that the values are correct
        Assert.AreEqual((Complex)0, dataArray[0]);
        Assert.AreEqual((Complex)1, dataArray[1]);
        Assert.AreEqual((Complex)2, dataArray[2]);
    }

    [Test]
    public void Verify_BinaryArraySerializer_roundtrip_Double2D()
    {
        // create a complex array and a serializer for it
        var dataArray = new double[2,3]
            {
                { 0, 1, 2 },
                { 3, 4, 5 },
            };
        var serializer = BinaryArraySerializerFactory.GetSerializer(dataArray, "testd2d", "_testd2d");

        // write the array to a virtual stream
        using var stream = new MemoryStream();
        using var writer = new BinaryWriter(stream);
        serializer.WriteData(writer);

        // zero out the detector values
        Array.Clear(dataArray);

        // reset the stream to the beginning
        stream.Seek(0, SeekOrigin.Begin);

        // read the values back in
        serializer.ReadData(new BinaryReader(stream));

        // check that the values are correct
        Assert.AreEqual(0.0, dataArray[0, 0]);
        Assert.AreEqual(1.0, dataArray[0, 1]);
        Assert.AreEqual(2.0, dataArray[0, 2]);
        Assert.AreEqual(3.0, dataArray[1, 0]);
        Assert.AreEqual(4.0, dataArray[1, 1]);
        Assert.AreEqual(5.0, dataArray[1, 2]);
    }
}

Aside - I just noticed that "PopulateFromEnumerable" extension methods use the opposite indexing as the detector IO. It's worth checking where we use that method. Typically, the right-most dimension in a multi-dimensional array is written sequentially to disk, as that's "row major" format, and how it's handled in memory in .NET (see the double2d unit test above). PopulateFromEnumerable would be fine for use when copying from one array to another, but if it's used for reading/writing to disk, we should evaluate this discrepancy between that and this new code before merging.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Thank you very much for your analysis of the existing unit test and its drawbacks, and suggestions for added improvements!

I understand your comments about having tests that test just the single layer of the class itself. I noticed that you did not commit this code to test the BinaryArraySerializer class. I will see if I can implement into the branch next week.

Good comment about PopulateFromEnumerable. It looks like it is used by many classes. I guess the key is if those classes write/read to disk. One such class is ArrayCustomBinaryReader.ReadFromBinary and this in turn is called by FileIO.ReadFromBinary (this is not called by anything but its unit test) and DatabaseReader.ReadFromBinary (which uses it in FromFileInResources). I will perform an analysis to make sure this doesn't interfere. I'm somewhat confident all is okay because for all of the detectors, loadMCResults.m loads the binary in "column major" and that has been working.

Thank you for your efforts!

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

DetectorIOTests already has some decent tests for round-tripping the individual detector binary arrays. From a unit test perspective, these are really mostly testing the individual detectors, not whether DetectorIO functions are working. We could improve this by:

  1. segregate a couple unit tests to check that DetectorIO.WriteDetectorToFile and DetectorIO.ReadDetectorFromFile are doing what they're supposed to (i.e. testing the internal logic, writing/reading files correctly, etc). This would be as simple as having one or two dummy detectors with different shapes to exercise the static methods. And then...

  2. narrowing the individual detector tests to unit tests, without any code that depends on DetectorIO, and putting those tests in files named, for example FluenceOfRhoAndZDetectorTests.cs

For example, the original test written like this:

        [Test]
        public void validate_FluenceOfRhoAndZDetector_deserialized_class_is_correct_when_using_WriteReadDetectorToFile()
        {
            string detectorName = "testfluenceofrhoandz";
            IDetectorInput detectorInput = new FluenceOfRhoAndZDetectorInput()
            {
                Rho = new DoubleRange(0, 10, 3),
                Z = new DoubleRange(0, 1, 4),
                TallySecondMoment = false, // tally SecondMoment
                Name = detectorName,
            };
            var detector = (FluenceOfRhoAndZDetector)detectorInput.CreateDetector();
            detector.Mean = new double[,] {{1, 2, 3}, {4, 5, 6}};

            DetectorIO.WriteDetectorToFile(detector, "");
            var dcloned = (FluenceOfRhoAndZDetector)DetectorIO.ReadDetectorFromFile(detectorName, "");

            Assert.AreEqual(detectorName, dcloned.Name);
            Assert.AreEqual(1, dcloned.Mean[0, 0]);
            Assert.AreEqual(2, dcloned.Mean[0, 1]);
            Assert.AreEqual(3, dcloned.Mean[0, 2]);
            Assert.AreEqual(4, dcloned.Mean[1, 0]);
            Assert.AreEqual(5, dcloned.Mean[1, 1]);
            Assert.AreEqual(6, dcloned.Mean[1, 2]);
        }

...could be just slightly re-written like this:

[Test]
public void validate_deserialized_binary_arrays_are_correct()
{
    string detectorName = "testfluenceofrhoandz";
    var detector = new FluenceOfRhoAndZDetector()
    {
        Rho = new DoubleRange(0, 10, 3),
        Z = new DoubleRange(0, 1, 2),
        TallySecondMoment = false, // tally SecondMoment
        Name = detectorName,
    };
    detector.Mean = new double[,] { { 1, 2, 3 }, { 4, 5, 6 } };

    var serializers = detector.GetBinarySerializers();
    using var stream = new MemoryStream();
    using var writer = new BinaryWriter(stream);
    foreach (var serializer in serializers)
    {
        serializer.WriteData(writer);
    }

    Array.Clear(detector.Mean);
    Array.Clear(detector.SecondMoment);

    stream.Seek(0, SeekOrigin.Begin);

    using var reader = new BinaryReader(stream);
    foreach (var serializer in serializers)
    {
        serializer.ReadData(reader);
    }

    Assert.AreEqual(1, detector.Mean[0, 0]);
    Assert.AreEqual(2, detector.Mean[0, 1]);
    Assert.AreEqual(3, detector.Mean[0, 2]);
    Assert.AreEqual(4, detector.Mean[1, 0]);
    Assert.AreEqual(5, detector.Mean[1, 1]);
    Assert.AreEqual(6, detector.Mean[1, 2]);
}

Note that I also limited the test to using the detector class constructor itself, as opposed to relying on a different component (isolation, again).

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I see your improvement! Wouldn't we need this type of unit test for each detector? (the unit test name change to omit FluenceOfRhoAndZDetector confused me).

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

In order to scale productivity for writing and maintaining those tests, we can create a DetectorBinarySerializationHelper class:

internal static class DetectorBinarySerializationHelper
{
    internal static void WriteClearAndReReadArrays(IDetector detector, params Array[] arrays)
    {
        var serializers = detector.GetBinarySerializers();
        using var stream = new MemoryStream();
        using var writer = new BinaryWriter(stream);
        foreach (var serializer in serializers)
        {
            serializer.WriteData(writer);
        }

        foreach (var array in arrays)
        {
            Array.Clear(array);
        }

        stream.Seek(0, SeekOrigin.Begin);

        using var reader = new BinaryReader(stream);
        foreach (var serializer in serializers)
        {
            serializer.ReadData(reader);
        }
    }
}

So that the specialized detector test becomes:

        [Test]
        public void validate_deserialized_binary_arrays_are_correct()
        {
            string detectorName = "testfluenceofrhoandz";
            var detector = new FluenceOfRhoAndZDetector()
            {
                Rho = new DoubleRange(0, 10, 3),
                Z = new DoubleRange(0, 1, 2),
                TallySecondMoment = true, // tally SecondMoment
                Name = detectorName,
            };
            detector.Mean = new double[,] { { 1, 2, 3 }, { 4, 5, 6 } };

            DetectorBinarySerializationHelper.WriteClearAndReReadArrays(detector, detector.Mean, detector.SecondMoment);

            Assert.AreEqual(1, detector.Mean[0, 0]);
            Assert.AreEqual(2, detector.Mean[0, 1]);
            Assert.AreEqual(3, detector.Mean[0, 2]);
            Assert.AreEqual(4, detector.Mean[1, 0]);
            Assert.AreEqual(5, detector.Mean[1, 1]);
            Assert.AreEqual(6, detector.Mean[1, 2]);
        }

In this way, we can share code as much as possible, while having detector-specific code handled by the author of that detector with the specific knowledge of it's expected shape and function.

Finally, as we've talked about, there are definitely more things we can do to use generics to "wrap" the functionality of standard configurations, like classes that only have Mean and StandardDeviation. E.g. we could have an IDetectorWithMeanAndStandardDeviation and then have standardized serialization and class testing based on that shape. But that gets back to the earlier question/concern/debate about complexity, and whether it's worth it. Someone writing and testing a class with that new structure would be fine, but someone doing something more custom wouldn't benefit from the grouping we'd done, and might be confused as to why they have to "jump off the cliff" to write and test their solution.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I understand your concept! The majority of detectors only have Mean and SecondMoment to de/serialize so that would work. For the other detectors, would we just add the other arrays to the "params Array[] arrays"? And then of course, add Assert code.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Yep, that's it exactly - each detector specific test will know which arrays to add to that params array. Sorry I wasn't replying in the middle, I was trying to get a couple posts out in sequence before heading to my niece's play.

Thanks for offering to do an analysis of the populate code. I think we should really decide going forward if we want to flip all the logic here. Probably much better if things are consistent, and keeping legacy compatibility is a big deal.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I rewrote the test and test title for FluenceOfRhoAndZDetector. It looks like this:

        public void Validate_FluenceOfRhoAndZDetector_deserialized_class_is_correct_when_using_GetBinarySerializers()
        {

            const string detectorName = "testfluenceofrhoandz";
            var detector = new FluenceOfRhoAndZDetector
            {
                Rho = new DoubleRange(0, 10, 3),
                Z = new DoubleRange(0, 1, 2),
                TallySecondMoment = true, // tally SecondMoment
                Name = detectorName,
                Mean = new double[,] { { 1, 2, 3 }, { 4, 5, 6 } },
                SecondMoment = new double[,] { { 7, 8, 9}, { 10, 11, 12 }}
            };

            DetectorBinarySerializationHelper.WriteClearAndReReadArrays(detector, detector.Mean, detector.SecondMoment);

            Assert.AreEqual(1, detector.Mean[0, 0]);
            Assert.AreEqual(2, detector.Mean[0, 1]);
            Assert.AreEqual(3, detector.Mean[0, 2]);
            Assert.AreEqual(4, detector.Mean[1, 0]);
            Assert.AreEqual(5, detector.Mean[1, 1]);
            Assert.AreEqual(6, detector.Mean[1, 2]);
            Assert.AreEqual(7, detector.SecondMoment[0, 0]);
            Assert.AreEqual(8, detector.SecondMoment[0, 1]);
            Assert.AreEqual(9, detector.SecondMoment[0, 2]);
            Assert.AreEqual(10, detector.SecondMoment[1, 0]);
            Assert.AreEqual(11, detector.SecondMoment[1, 1]);
            Assert.AreEqual(12, detector.SecondMoment[1, 2]);
        }

I created the class DetectorBinarySerializationHelper per the post above. I wasn't sure where to put it. Currently it is in the Detectors folder, but I could see arguments for it being in the Helpers folder or the IO folder too. Let me know your ideas.

If the above code, looks correct, I will continue on with the other unit tests.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Hi Carole,

I will have time later to check out the code, but I find this type of code is useful to have as a test-only library helper (it's not relevant for the production code, just the tests. It could be in a Helpers folder in the test library, or if multiple test libraries need access to it, create a Vts.Test.Common or similar library.

In terms of location, this would be testing the serialization of that specific class, so I assume we want a 1:1 relationship between the .cs test class and the detector (1 for each). This item unit tests the serialization, another could unit test the accuracy, etc.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

@hayakawa16 when I create helper methods for unit testing, I usually create a separate folder called helpers that doesn't have a matching folder in the project being tested. You could put that in the MonteCarlo folder or at the folder level of the Detectors depending on where it will be used.

I also wanted to mention something here about the tests in the VTS. When we were adding test coverage we noticed that the majority of the tests were in fact integration tests and not true unit tests. Where possible, I added unit tests to increase the code coverage but in some case it wasn't possible due to way the original code was written, static classes and methods etc. It would be good once we have the coverage where we want it to be, to identify the areas where we are missing unit tests and add them.

This might be a good time to come up with a structure for the test application so we can easily differentiate unit tests and integration tests. In my projects I name the unit test classes the same as the class being unit tested and then I put the integration tests in a separate location under an IntegrationTests folder because they usually include many classes. Any suggestions for this would be welcome.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

@dcuccia I like that idea! I was going to say separate projects but I didn't know if that was overkill. The fact that you used it at Modulim is good because it sounds like it worked well.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I like these ideas. I have more nuts and bolts questions:

  1. for the single value detectors (e.g. ATotalDetector), GetBinarySerializers() => null; is in code. So I plan to leave those tests as they are?
  2. I put DetectorBinarySerializerHelper for now in Vts.Test/MonteCarlo/IO/. We may want to move around later. I just needed it somewhere to update the existing unit tests.
  3. There are currently 18 detectors in DetectorIOTests. I have 54 more tests to add. Can I go ahead and do this and after that we can move things around based on "unit" vs "integration" tests? Or would you rather I do that before I fill out these tests?
  4. These updated 18 unit tests found an error in the way that the new code in GetBinarySerializer was written in some detectors. So that is good! Turns out you need to define the SecondMoment arrays without checking if TallySecondMoment is true, i.e. for ReflectedMTOfRhoAndSubregionHistDetector:
Mean ??= new double[Rho.Count - 1, MTBins.Count - 1];
SecondMoment ??= new double[Rho.Count - 1, MTBins.Count - 1];
var allSerializers = ...

rather than

Mean ??= new double[Rho.Count - 1, MTBins.Count - 1];
if (TallySecondMoment)
{
  SecondMoment ??= new double[Rho.Count - 1, MTBins.Count - 1];
}
var allSerializers =...

Otherwise the SecondMoment array was not in the serializers list in WriteClearAndReReadArrays in DetectorBinarySerializerHelper.

I have pushed code in case you'd like to give it a look.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Defer to Lisa on 2 & 3.

IMO for 1, they should be returning an empty array, not null, and I'd make sure to "move" those tests to the same format as all others (testing serializers is just one part of unit testing the detector class - you'll also want to test correctness, etc).

Re: 4, I don't understand - why does SecondMoment need to be allocated, if it's never used? That said, a "fully-initialized" class is always preferable from a code reliability perspective, I just thought we'd kept those null so that we could use all the available memory for the main detector, if that was a limitation. FWIW, if we intend to keep them possibly null, we should mark with the nullable operator (i.e. double[,]? SecondMoment).

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Regarding 4, you're absolutely correct. When I was debugging code and stepping through, SecondMoment wasn't getting set somehow. I reverted code and all is good now.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Regarding 1, the single valued detectors write their Mean and SecondMoment to the detector.txt ascii file, there is no binary serialization going on. Do you still feel same way?

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

I put DetectorBinarySerializerHelper for now in Vts.Test/MonteCarlo/IO/. We may want to move around later. I just needed it somewhere to update the existing unit tests.

I think that's a good place

There are currently 18 detectors in DetectorIOTests. I have 54 more tests to add. Can I go ahead and do this and after that we can move things around based on "unit" vs "integration" tests? Or would you rather I do that before I fill out these tests?

Yes add them now and we can figure it out later, try to keep integration and unit tests separate even if it is just with a comment in the same file

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

I agree with @dcuccia for 1. returning empty array vs null but if this is too much to change now (this is prevalent in our code and needs resolving globally) we can leave it as is and fix later.

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

I am finding I am coding the same code in multiple tests, so I was thinking of optimizing somehow. At first I tried:

        [Test]
        public void Validate_1DRealDetectors_deserialized_class_is_correct_when_using_GetBinarySerializers()
        {

            var detectors = new List<IDetector>();
            var detectorName = "testrofangle";
            detectors.Add(new ROfAngleDetector
            {
                Angle = new DoubleRange(0, 10, 3),
                TallySecondMoment = true, // tally SecondMoment
                Name = detectorName,
                Mean = new double[] { 1, 2, 3 },
                SecondMoment = new double[] { 4, 5, 6 }
            });
            detectorName = "testrofrho";
            detectors.Add(new ROfRhoDetector
            {
                Rho = new DoubleRange(0, 10, 3),
                TallySecondMoment = true, // tally SecondMoment
                Name = detectorName,
                Mean = new double[] { 1, 2, 3 },
                SecondMoment = new double[] { 4, 5, 6 }
            });
            foreach (var detector in detectors)
            {
                DetectorBinarySerializationHelper.WriteClearAndReReadArrays(detector, detector.Mean, detector.SecondMoment);

                Assert.AreEqual(1, detector.Mean[0]);
                Assert.AreEqual(2, detector.Mean[1]);
                Assert.AreEqual(3, detector.Mean[2]);
                Assert.AreEqual(4, detector.SecondMoment[0]);
                Assert.AreEqual(5, detector.SecondMoment[1]);
                Assert.AreEqual(6, detector.SecondMoment[2]);
            }
        }

but IDetector does not include Mean or SecondMoment so not possible.
So I am thinking about putting the Assert code in a unit test method.
Any thoughts on these optimizations? On the one hand I see that it is great to have an individual, full-contained test for each detector. On the other hand, there is a lot of duplicate code in this unit test file.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

I think this is the unit test for the detector, right? So, my suggestion is a 1:1 relationship between the .cs test class and the detector (1 "XYZDetectorTests.cs" for each class). And in that case, each detector "knows" about it's own things, so you don't need to represent it with an interface, but instead the implementation type. Yes, it's repetitive, but it segregates different detector tests, and will make those files that are "fine" essentially read-only, while new detectors/tests can be written without disturbing the existing ones (more scalable for new work, new devs, etc). We can make more "helper" static libraries, like var rho = DetectorTestHelper.Generate1DRho() etc.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

@hayakawa16 Consolidating test code is a great idea, I think I mentioned this last week in our meeting that if we can create generic code for testing using the data we have, we should go that route rather than duplicating the same code in individual tests.

If you would like to work together on this, let me know. We can probably create another helper method that can be called from each test to avoid the duplication.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

(and in the case of the 1D, 2D, etc...those should be tests of DetectorIO libraries, with only one example for each "shape"

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

To respond to @dcuccia, this is the combined test for ROfAngleDetector and ROfRhoDetector. I think you are suggesting not to do this and that the current DetectorIOTests file be separated into one for each detector? If so would that detector test also test methods like Normalize or just GetBinarySerializers?

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

@hayakawa16 all tests relevant to that detector's performance and behavior (and nothing else).

from vts.

hayakawa16 avatar hayakawa16 commented on June 1, 2024

Got it. Let me discuss with @lmalenfant how we organize these types of single layer tests versus the integration tests. In the meantime I will continue filling out the tests in DetectorIOTests so that I can verify new code, and I can separate later.

from vts.

dcuccia avatar dcuccia commented on June 1, 2024

Sounds good!

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

This looks good @hayakawa16 although I think 2. could be further down the list because this is part of a larger cleanup effort. Plus the detectors that contain this code do not have the serialization code and are not being modified with this change.

Since this is already a huge change we need to keep these changes to a minimum because they will be easier to review in the PR.

I would choose the option to merge after 3.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

@hayakawa16 would you like me to write the unit tests for BinaryArraySerializerFactory?

@dcuccia I was able to look at your Gist finally and I like it, if we write the unit tests for the code as is, we can switch it for the new code and validate that it still works the same way.

from vts.

lmalenfant avatar lmalenfant commented on June 1, 2024

@hayakawa16 That sounds good to me.

from vts.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.