Comments (7)
I've been looking this over and have a bunch of questions.
The changes I would like to make seem deeper than the problem would seem to warrant, so I decided to pose them here first.
First, stringstream has no way to pass a char* to use as its internal buffer. This is a problem because at lang/py/support/PythonStream.cpp:68-69 a pointer to the buffer of a recently created PyString is provided. However, it's unclear to me what the point of that is. After looking up the usages of SharedPythonOStream (there are a number in py/bindings/math/SparseMatrix.i and py/bindings/algorithms/algorithms_impl.i), it's only used to provide an ostream whose content will later be output as a PyString when the object is closed. To me, the obvious way seems to be creating a stringstream and passing it back to the caller as an ostream, only creating a PyString at the very end with PyString_FromString(ss.str().c_str()).
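To make the proposal concrete, here is a minimal sketch of that approach. BufferedOStream, getStream, and exampleUsage are hypothetical names invented for illustration, not the real SharedPythonOStream interface, and a std::string stands in for the PyString so the sketch compiles without the Python headers.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for SharedPythonOStream: callers write to an
// ostream, and the accumulated bytes are collected once, at close().
class BufferedOStream {
public:
  // Hand the stream to the caller as a plain std::ostream.
  std::ostream& getStream() { return ss_; }

  // Return everything written so far and reset the stream.
  // In the real binding, this is where the Python string would be built
  // from the returned data and size (assumption, not the actual code).
  std::string close() {
    std::string out = ss_.str();
    ss_.str("");   // drop the buffered content
    ss_.clear();   // reset any error flags
    return out;
  }

private:
  std::ostringstream ss_;
};

// Small usage check: the caller never sees the buffer, only the ostream.
inline void exampleUsage() {
  BufferedOStream b;
  b.getStream() << "size " << 3;
  assert(b.close() == "size 3");
}
```

With this shape no size has to be known up front, since the stringstream grows as needed.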
However, using just PyString_FromString causes learning to fail under $NTA/bin/run_tests.sh (EDIT: and I don't know why). Instead, I use PyString_FromStringAndSize(ss.str().c_str(), ss.str().length()). With that, learning works, but there's an off-by-two problem in svm_test.py under the testPersistentSize function. If I go into nta/algorithms/svm_t.hpp:1506 and change the 6 to an 8, all the tests pass, but I have no idea where an extra two characters would be coming from.
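One possible explanation for the PyString_FromString failure, not confirmed anywhere in this thread, is that the serialized data contains embedded NUL bytes. PyString_FromString determines the length by scanning for the first '\0', while PyString_FromStringAndSize copies exactly the number of bytes it is told. The sketch below shows the difference using only the standard library. The helper names are made up for illustration, and whether the streams here really contain NUL bytes is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Length as a strlen-based API would see it: stops at the first '\0'.
inline std::size_t lengthViaCStr(const std::string& s) {
  return std::strlen(s.c_str());
}

// Length as passed explicitly to a size-taking API: counts every byte.
inline std::size_t lengthViaSize(const std::string& s) {
  return s.size();
}

// Build a payload with a NUL byte in the middle, as binary output might.
inline std::string makeBinaryPayload() {
  std::ostringstream ss;
  ss << "ab";
  ss.put('\0');
  ss << "cd";
  return ss.str();
}

// The two lengths disagree as soon as a NUL byte is present.
inline void exampleTruncation() {
  const std::string payload = makeBinaryPayload();
  assert(lengthViaSize(payload) == 5);
  assert(lengthViaCStr(payload) == 2);
}
```

If that is what is happening, storing ss.str() in a local std::string once and passing its data() and size() also avoids copying the buffer twice.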
After looking through the various persistent_size functions (say for svm_model or svm_parameter), the purpose seems to be to give the length of the string that save() will produce for the same object. So in persistent_size, why not pass a stringstream to save, then return the length from the stringstream? That would save a lot of apparent code duplication, and greatly appeals to my inner lazy.
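A minimal sketch of that idea follows. Params is a hypothetical, simplified struct rather than the real svm_parameter, and the space-separated output format is an assumption chosen only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical, simplified parameter struct; not the real svm_parameter.
struct Params {
  std::vector<int> weight_label;
  std::vector<float> weight;

  // Write each vector as its size followed by its elements.
  void save(std::ostream& out) const {
    out << weight_label.size() << ' ';
    for (int v : weight_label) out << v << ' ';
    out << weight.size() << ' ';
    for (float v : weight) out << v << ' ';
  }

  // Derive the size from save() itself, so the two cannot diverge.
  std::size_t persistent_size() const {
    std::ostringstream ss;
    save(ss);
    return ss.str().size();
  }
};

// The reported size matches what save() actually writes.
inline void exampleSizeMatchesSave() {
  Params p;
  p.weight_label = {1, 2};
  p.weight = {0.5f};
  std::ostringstream ss;
  p.save(ss);
  assert(p.persistent_size() == ss.str().size());
}
```

The trade-off is that the object gets serialized twice when persistent_size is followed by save, which may matter for large models.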
Anyway, if you're curious what exactly I was doing, I've pushed my changes into my own fork of nupic.
from nupic.
Hi @danstanton! Thanks for the time you took to investigate this issue. Perhaps it is not as much of a "newbie" problem as we expected. Maybe @breznak or @subutai can comment on your questions.
After looking at the code some more, I think the problem was in nta/algorithms/svm.cpp:81. svm_parameter::persistent_size was using some custom sprintf code to determine the string-equivalent length of vector<int> weight_label and vector<float> weight, but svm_parameter::save was using the stream operator defined at nta/math/stl_io.hpp:544. The two must have diverged at some point, and the difference was made up at some point in SharedPythonOStream. My guess is the correction happened at the two string conversions at the end of close, but I don't know why those would each adjust the string by one character, though the comment says that this section corrects the string for size.
Again, the changes are in my fork. I'm not sure how a pull request is normally done, but I thought you might want me to first revert to the master here, then make all the changes together in a more comprehensive and understandable commit. If that's not necessary, that'd be cool, too.
Anyway, I still have a question about the necessity of throwing an error if the stream collects more characters than expected. Maybe the previous usage of strstream had a problem with buffer overflows, but an actual overflow shouldn't be possible now. Is this a question that should go to the hackers mailing list?
Again, the changes are in my fork. I'm not sure how a pull request is normally done, but I thought you might want me to first revert to the master here, then make all the changes together in a more comprehensive and understandable commit. If that's not necessary, that'd be cool, too.
My suggestion would be to move the changes you made for this ticket into a new remote branch on your fork, and you'll be able to create a PR from that against nupic/master. Refer to this ticket within the description of your PR as described in our Development Process wiki page (by using the fixes #368 mechanism). Once we have a PR, it will be easier for reviewers to directly address the concerns and questions you're bringing up.
You can also just create a PR from your master branch like this, but it's a lot cleaner to create a feature branch on your fork and submit the PR from that branch.
Again, thank you for your time and work on this.
Thanks @danstanton. That's some nice investigation!
Any links to my fork of master probably won't work as I've reverted it to numenta's master, and I'll be making a branch for this issue shortly.
#545 passed CI and is ready for review.