
Comments (5)

GoogleCodeExporter commented on July 16, 2024

Original comment by [email protected] on 20 May 2012 at 12:17

  • Changed state: Accepted

from mosh-scheme.

GoogleCodeExporter commented on July 16, 2024
Our transcoder implementation currently keeps state internally.
See the bool beginningOfInput_ member in Transcoder.h.
That state is what causes this strange behavior.

We should remove this state from Transcoder.
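
To make the failure mode concrete, here is a minimal sketch of a transcoder that keeps its own "beginning of input" flag. BytePort, StatefulTranscoder and demonstrateReuseProblem are hypothetical stand-ins, not the real Mosh classes, and the sketch uses a UTF-8 BOM with ASCII data to keep byte-order handling out of the picture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a binary input port.
struct BytePort {
    std::vector<uint8_t> bytes;
    std::size_t pos = 0;

    // Returns the next byte, or -1 at end of input.
    int getU8() { return pos < bytes.size() ? bytes[pos++] : -1; }

    // True if the next three unread bytes are the UTF-8 BOM (EF BB BF).
    bool atUtf8Bom() const {
        return pos + 3 <= bytes.size() && bytes[pos] == 0xEF &&
               bytes[pos + 1] == 0xBB && bytes[pos + 2] == 0xBF;
    }
};

// The problematic shape: the flag lives in the transcoder, so every port
// that shares the transcoder also shares the flag.
struct StatefulTranscoder {
    bool beginningOfInput = true;

    // Returns the next byte as a character (ASCII only in this sketch),
    // skipping a BOM only on the very first call ever made.
    int getChar(BytePort& port) {
        if (beginningOfInput) {
            beginningOfInput = false;
            if (port.atUtf8Bom()) port.pos += 3;
        }
        return port.getU8();
    }
};

// Reads the same BOM-prefixed data twice through one transcoder.
inline void demonstrateReuseProblem() {
    StatefulTranscoder tr;
    BytePort first, second;
    first.bytes = {0xEF, 0xBB, 0xBF, 'a'};
    second.bytes = {0xEF, 0xBB, 0xBF, 'a'};
    assert(tr.getChar(first) == 'a');    // BOM skipped on the first port
    assert(tr.getChar(second) == 0xEF);  // BOM leaks through as data
}
```

In this sketch the second port gets the first BOM byte back as a character, because the flag was already cleared by the first port.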

We don't have time to fix this right now,
so I am marking it as Milestone-Release0.3.0.

Patches are always welcome.

Thanks,
Higepon


Original comment by [email protected] on 20 May 2012 at 7:34

  • Added labels: Milestone-Release0.3.0

from mosh-scheme.

GoogleCodeExporter commented on July 16, 2024
Hi,

I have made a patch against 0.2.7. I'm not sure whether the coding style is
acceptable.
The basic idea is to let BinaryInputPort hold the position and have every
subclass maintain it. When Transcoder reads a character, it checks whether the
port's position is 0 and, only if it is, checks for a BOM.
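
As a rough illustration of that idea (separate from the patch itself), the sketch below moves the position into the port and lets the transcoder ask the port where it is. CountingPort, StatelessTranscoder and demonstrateSharedTranscoder are hypothetical names, and a UTF-8 BOM with ASCII data is used so that byte order does not complicate the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified port that tracks how many bytes were consumed.
class CountingPort {
public:
    explicit CountingPort(std::vector<uint8_t> bytes)
        : bytes_(std::move(bytes)), position_(0) {}

    // Returns the next byte, or -1 at end of input.
    int getU8() {
        if (position_ >= bytes_.size()) return -1;
        return bytes_[position_++];
    }

    std::size_t getPosition() const { return position_; }

    // Consumes a UTF-8 BOM (EF BB BF) if one starts at the current position.
    void skipUtf8Bom() {
        if (position_ + 3 <= bytes_.size() && bytes_[position_] == 0xEF &&
            bytes_[position_ + 1] == 0xBB && bytes_[position_ + 2] == 0xBF) {
            position_ += 3;
        }
    }

private:
    std::vector<uint8_t> bytes_;
    std::size_t position_;
};

// Hypothetical transcoder with no "beginning of input" flag of its own.
struct StatelessTranscoder {
    // Returns the next byte as a character (ASCII only in this sketch).
    int getChar(CountingPort& port) const {
        // The port, not the transcoder, knows whether reading has started.
        if (port.getPosition() == 0) port.skipUtf8Bom();
        return port.getU8();
    }
};

// One transcoder shared by two ports over the same BOM-prefixed data.
inline void demonstrateSharedTranscoder() {
    const std::vector<uint8_t> data = {0xEF, 0xBB, 0xBF, 'a', 'b'};
    StatelessTranscoder tr;
    CountingPort first(data);
    CountingPort second(data);
    assert(tr.getChar(first) == 'a');
    assert(tr.getChar(second) == 'a');  // BOM is skipped again on a new port
    assert(tr.getChar(first) == 'b');   // no second BOM check mid-stream
}
```

One consequence of this design is that a port repositioned to 0 is checked for a BOM again, which seems to be the intended behavior but is an assumption here.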

diff -Naru mosh-0.2.7.org/src/BinaryInputPort.h mosh-0.2.7/src/BinaryInputPort.h
--- mosh-0.2.7.org/src/BinaryInputPort.h    2012-05-20 12:55:44.578125000 +0200
+++ mosh-0.2.7/src/BinaryInputPort.h    2012-05-20 13:29:36.593750000 +0200
@@ -41,12 +41,24 @@
 class BinaryInputPort : virtual public BinaryPort // for closing port on destructors, we extend gc_cleanup
 {
 public:
+    BinaryInputPort() : position_(0) {}
     virtual ~BinaryInputPort() {}
     virtual int getU8() = 0;
     virtual int lookaheadU8() = 0;
     virtual int64_t readBytes(uint8_t* buf, int64_t reqSize, bool& isErrorOccured) = 0;
     virtual int64_t readSome(uint8_t** buf, bool& isErrorOccured) = 0;
     virtual int64_t readAll(uint8_t** buf, bool& isErrorOccured) = 0;
+
+    /* Added by Takashi Kato to resolve Transcoder state problem */
+    /**
+       Returns current position of this binary input port.
+     */
+    int64_t getMark() { return position_; }
+protected:
+    void    setMark(int64_t pos) { position_ = pos; }
+    void    addMark(int64_t offset) { position_ += offset; }
+    /* for FileBinaryInputPort, this must be protected */
+    int64_t position_;
 };

 } // namespace scheme
diff -Naru mosh-0.2.7.org/src/BufferedFileBinaryInputOutputPort.cpp mosh-0.2.7/src/BufferedFileBinaryInputOutputPort.cpp
--- mosh-0.2.7.org/src/BufferedFileBinaryInputOutputPort.cpp    2012-05-20 12:55:44.625000000 +0200
+++ mosh-0.2.7/src/BufferedFileBinaryInputOutputPort.cpp    2012-05-20 13:20:38.765625000 +0200
@@ -72,7 +72,6 @@
     fileName_(file),
     buffer_(NULL),
     isDirty_(false),
-    position_(0),
     isClosed_(false),
     isPseudoClosed_(false),
     bufferSize_(0),
diff -Naru mosh-0.2.7.org/src/BufferedFileBinaryInputOutputPort.h mosh-0.2.7/src/BufferedFileBinaryInputOutputPort.h
--- mosh-0.2.7.org/src/BufferedFileBinaryInputOutputPort.h  2012-05-20 12:55:44.625000000 +0200
+++ mosh-0.2.7/src/BufferedFileBinaryInputOutputPort.h  2012-05-20 13:21:22.156250000 +0200
@@ -92,7 +92,6 @@
     ucs4string fileName_;
     uint8_t* buffer_;
     bool isDirty_;
-    int64_t position_;
     bool isClosed_;
     bool isPseudoClosed_;
     int64_t bufferSize_;
diff -Naru mosh-0.2.7.org/src/BufferedFileBinaryInputPort.cpp mosh-0.2.7/src/BufferedFileBinaryInputPort.cpp
--- mosh-0.2.7.org/src/BufferedFileBinaryInputPort.cpp  2012-05-20 12:55:44.640625000 +0200
+++ mosh-0.2.7/src/BufferedFileBinaryInputPort.cpp  2012-05-20 13:21:00.437500000 +0200
@@ -50,18 +50,18 @@

 using namespace scheme;

-BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(File* file) : file_(file), fileName_(UC("<unknown file>")), isClosed_(false), isPseudoClosed_(false), position_(0)
+BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(File* file) : file_(file), fileName_(UC("<unknown file>")), isClosed_(false), isPseudoClosed_(false)
 {
     initializeBuffer();
 }

-BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(ucs4string file) : file_(new File), fileName_(file), isClosed_(false), isPseudoClosed_(false), position_(0)
+BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(ucs4string file) : file_(new File), fileName_(file), isClosed_(false), isPseudoClosed_(false)
 {
     file_->open(fileName_, File::Read);
     initializeBuffer();
 }

-BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(const char* file) : file_(new File), isClosed_(false), isPseudoClosed_(false), position_(0)
+BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(const char* file) : file_(new File), isClosed_(false), isPseudoClosed_(false)
 {
     fileName_ = ucs4string::from_c_str(file);
     file_->open(fileName_, File::Read);
diff -Naru mosh-0.2.7.org/src/BufferedFileBinaryInputPort.h mosh-0.2.7/src/BufferedFileBinaryInputPort.h
--- mosh-0.2.7.org/src/BufferedFileBinaryInputPort.h    2012-05-20 12:55:44.640625000 +0200
+++ mosh-0.2.7/src/BufferedFileBinaryInputPort.h    2012-05-20 13:20:18.859375000 +0200
@@ -83,7 +83,6 @@
     uint8_t* buffer_;
     int64_t bufLen_;
     int64_t bufIdx_;
-    int64_t position_;
 };

 } // namespace scheme
diff -Naru mosh-0.2.7.org/src/CustomBinaryInputOutputPort.cpp mosh-0.2.7/src/CustomBinaryInputOutputPort.cpp
--- mosh-0.2.7.org/src/CustomBinaryInputOutputPort.cpp  2012-05-20 12:55:44.750000000 +0200
+++ mosh-0.2.7/src/CustomBinaryInputOutputPort.cpp  2012-05-20 13:39:22.531250000 +0200
@@ -109,6 +109,7 @@
     if (hasAheadU8()) {
         int c = aheadU8_;
         aheadU8_ = EOF;
+        addMark(1);
         return c;
     }

@@ -117,6 +118,7 @@
     const Object count = Object::makeFixnum(1);
     const Object result = theVM_->callClosure3(readProc_, bv, start, count);
     MOSH_ASSERT(result.isFixnum());
+    addMark(1);
     if (0 == result.toFixnum()) {
         return EOF;
     }
@@ -219,6 +221,7 @@
     // we need to reset the aheadU8_
     aheadU8_ = EOF;
     theVM_->callClosure1(setPositionProc_, Bignum::makeIntegerFromS64(position));
+    setMark(position);
     return true;
 }

diff -Naru mosh-0.2.7.org/src/CustomBinaryInputPort.cpp mosh-0.2.7/src/CustomBinaryInputPort.cpp
--- mosh-0.2.7.org/src/CustomBinaryInputPort.cpp    2012-05-20 12:55:44.750000000 +0200
+++ mosh-0.2.7/src/CustomBinaryInputPort.cpp    2012-05-20 13:39:18.375000000 +0200
@@ -101,6 +101,7 @@
     if (hasAheadU8()) {
         int c = aheadU8_;
         aheadU8_ = EOF;
+        addMark(1);
         return c;
     }

@@ -109,6 +110,7 @@
     const Object count = Object::makeFixnum(1);
     const Object result = theVM_->callClosure3(readProc_, bv, start, count);
     MOSH_ASSERT(result.isFixnum());
+    addMark(1);
     if (0 == result.toFixnum()) {
         return EOF;
     }
@@ -211,5 +213,6 @@
     // we need to reset the aheadU8_
     aheadU8_ = EOF;
     theVM_->callClosure1(setPositionProc_, Bignum::makeIntegerFromS64(position));
+    setMark(position);
     return true;
 }
diff -Naru mosh-0.2.7.org/src/FileBinaryInputOutputPort.cpp mosh-0.2.7/src/FileBinaryInputOutputPort.cpp
--- mosh-0.2.7.org/src/FileBinaryInputOutputPort.cpp    2012-05-20 12:55:45.000000000 +0200
+++ mosh-0.2.7/src/FileBinaryInputOutputPort.cpp    2012-05-20 13:53:57.046875000 +0200
@@ -102,6 +102,7 @@
 bool FileBinaryInputOutputPort::setPosition(int64_t position)
 {
     const int64_t currentOffset = file_->seek(position);
+    position_ = currentOffset;
     if (position == currentOffset) {
         return true;
     } else {
@@ -139,6 +140,7 @@
     if (0 == file_->read(&c, 1)) {
         return EOF;
     } else {
+        position_++;       // to avoid reading multiple BOM
         return c;
     }
 }
@@ -170,6 +172,7 @@
 int64_t FileBinaryInputOutputPort::readBytes(uint8_t* buf, int64_t reqSize, bool& isErrorOccured)
 {
     const int64_t readSize = file_->read(buf, reqSize);
+    position_ += readSize;
     return readSize;
 }

@@ -185,6 +188,7 @@
     uint8_t* dest = allocatePointerFreeU8Array(restSize);
     const int64_t readSize = file_->read(dest, restSize);
     *buf = dest;
+    position_ += readSize;
     return readSize;
 }

diff -Naru mosh-0.2.7.org/src/FileBinaryInputPort.cpp mosh-0.2.7/src/FileBinaryInputPort.cpp
--- mosh-0.2.7.org/src/FileBinaryInputPort.cpp  2012-05-20 12:55:45.000000000 +0200
+++ mosh-0.2.7/src/FileBinaryInputPort.cpp  2012-05-20 13:18:36.031250000 +0200
@@ -55,16 +55,16 @@

 using namespace scheme;

-FileBinaryInputPort::FileBinaryInputPort(File* file) : file_(file), fileName_(UC("<unknown file>")), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF), position_(0)
+FileBinaryInputPort::FileBinaryInputPort(File* file) : file_(file), fileName_(UC("<unknown file>")), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF)
 {
 }

-FileBinaryInputPort::FileBinaryInputPort(const ucs4string& file) : file_(new File), fileName_(file), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF), position_(0)
+FileBinaryInputPort::FileBinaryInputPort(const ucs4string& file) : file_(new File), fileName_(file), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF)
 {
     file_->open(file, File::Read);
 }

-FileBinaryInputPort::FileBinaryInputPort(const char* file) : file_(new File), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF), position_(0)
+FileBinaryInputPort::FileBinaryInputPort(const char* file) : file_(new File), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF)
 {
     fileName_ = ucs4string::from_c_str(file);
     file_->open(fileName_, File::Read);
diff -Naru mosh-0.2.7.org/src/FileBinaryInputPort.h mosh-0.2.7/src/FileBinaryInputPort.h
--- mosh-0.2.7.org/src/FileBinaryInputPort.h    2012-05-20 12:55:45.015625000 +0200
+++ mosh-0.2.7/src/FileBinaryInputPort.h    2012-05-20 12:58:12.703125000 +0200
@@ -76,7 +76,6 @@
     bool isClosed_;
     bool isPseudoClosed_;
     int aheadU8_;
-    int64_t position_;
 };

 } // namespace scheme
diff -Naru mosh-0.2.7.org/src/SocketBinaryInputOutputPort.cpp mosh-0.2.7/src/SocketBinaryInputOutputPort.cpp
--- mosh-0.2.7.org/src/SocketBinaryInputOutputPort.cpp  2012-05-20 12:55:46.031250000 +0200
+++ mosh-0.2.7/src/SocketBinaryInputOutputPort.cpp  2012-05-20 13:08:38.390625000 +0200
@@ -135,6 +135,7 @@
     } else {
         uint8_t c;
         const int ret = socket_->receive(&c, 1, 0);
+        addMark(ret);
         if (0 == ret) {
             return EOF;
         } else if (-1 == ret) {
@@ -165,6 +166,7 @@
         throwIOError2(IOError::READ, socket_->getLastErrorMessage());
         return -1;
     }
+    addMark(readSize);
     return readSize;
 }

@@ -186,6 +188,7 @@
             }
         }
     }
+    addMark(data.size());
     uint8_t* dest = allocatePointerFreeU8Array(data.size());
     for (size_t i = 0; i < data.size(); i++) {
         dest[i] = data[i];
diff -Naru mosh-0.2.7.org/src/stamp-h1 mosh-0.2.7/src/stamp-h1
--- mosh-0.2.7.org/src/stamp-h1 1970-01-01 01:00:00.000000000 +0100
+++ mosh-0.2.7/src/stamp-h1 2012-05-20 13:02:44.343750000 +0200
@@ -0,0 +1 @@
+timestamp for src/config.h
diff -Naru mosh-0.2.7.org/src/Transcoder.cpp mosh-0.2.7/src/Transcoder.cpp
--- mosh-0.2.7.org/src/Transcoder.cpp   2012-05-20 12:55:46.421875000 +0200
+++ mosh-0.2.7/src/Transcoder.cpp   2012-05-20 13:14:27.953125000 +0200
@@ -35,6 +35,7 @@
 #include "Object-inl.h"
 #include "SString.h"
 #include "Symbol.h"
+#include "BinaryInputPort.h"
 #include "BinaryOutputPort.h"
 #include "Transcoder.h"
 #include "UTF8Codec.h"
@@ -42,7 +43,6 @@
 using namespace scheme;

 Transcoder::Transcoder(Codec* codec) :
-    beginningOfInput_(true),
     codec_(codec),
     eolStyle_(EolStyle(LF)), // LF means no convert.
     errorHandlingMode_(ErrorHandlingMode(REPLACE_ERROR)),
@@ -51,7 +51,6 @@
 }

 Transcoder::Transcoder(Codec* codec, EolStyle eolStyle) :
-    beginningOfInput_(true),
     codec_(codec),
     eolStyle_(eolStyle),
     errorHandlingMode_(ErrorHandlingMode(REPLACE_ERROR)),
@@ -60,7 +59,6 @@
 }

 Transcoder::Transcoder(Codec* codec, EolStyle eolStyle, enum ErrorHandlingMode errorHandlingMode) :
-    beginningOfInput_(true),
     codec_(codec),
     eolStyle_(eolStyle),
     errorHandlingMode_(errorHandlingMode),
@@ -210,8 +208,8 @@
 ucs4char Transcoder::getCharInternal(BinaryInputPort* port)
 {
     // In the beginning of input, we have to check the BOM.
-    if (beginningOfInput_) {
-        beginningOfInput_ = false;
+    /* Transcoder must not have the state to check port's beginning. */
+    if (port->getMark() == 0) {
         const bool checkBOM = true;
         return codec_->getChar(port, errorHandlingMode_, checkBOM);
     }
diff -Naru mosh-0.2.7.org/src/Transcoder.h mosh-0.2.7/src/Transcoder.h
--- mosh-0.2.7.org/src/Transcoder.h 2012-05-20 12:55:46.437500000 +0200
+++ mosh-0.2.7/src/Transcoder.h 2012-05-20 13:11:52.656250000 +0200
@@ -70,7 +70,6 @@
     static Object eolStyleToSymbol(const enum EolStyle eolstyle);
     static Object errorHandlingModeToSymbol(enum ErrorHandlingMode errorHandlingMode);

-    bool beginningOfInput_;
     Codec* codec_;
     enum EolStyle eolStyle_;
     enum ErrorHandlingMode errorHandlingMode_;
diff -Naru mosh-0.2.7.org/tests/input-port.scm mosh-0.2.7/tests/input-port.scm
--- mosh-0.2.7.org/tests/input-port.scm 2012-05-20 12:55:46.843750000 +0200
+++ mosh-0.2.7/tests/input-port.scm 2012-05-20 13:42:39.265625000 +0200
@@ -287,4 +287,13 @@
   (test-equal "" (get-string-n i2 0))
   (test-equal "" (get-string-n i3 0)))

+;; For UTF16 BOM issue. Added by Takashi Kato
+(let ((tr (make-transcoder (utf-16-codec))))
+  (test-equal (call-with-port 
+          (open-file-input-port "./tests/utf16.txt" (file-options) 'block tr)
+          (lambda (in) (get-char in)))
+         (call-with-port 
+          (open-file-input-port "./tests/utf16.txt" (file-options) 'block tr)
+          (lambda (in) (get-char in)))))
+
 (test-results)

Thanks,

Original comment by [email protected] on 20 May 2012 at 12:11

from mosh-scheme.

GoogleCodeExporter commented on July 16, 2024
Hi, thank you for your patch.
Your approach sounds good to me.

Two requests and one question.

(1) Can you rename the methods so that they are more intention-revealing?

setMark() -> setPosition()
getMark() -> getPosition()
addMark() -> forwardPosition()

(2) Please make sure it passes all port-related tests.

(3) Do you use GitHub? If so, can you send a pull request?
If not, just attach a new patch here.

Cheers

Original comment by [email protected] on 21 May 2012 at 11:42

from mosh-scheme.

GoogleCodeExporter commented on July 16, 2024
Hi Higepon,

Sorry, I don't have a GitHub account, and I don't intend to create one right now.

For (1)
I chose those names because methods with the names you suggest (such as
setPosition()) already exist and are used for their real purpose, so I didn't
want to break working code. (Also, I don't have much time to spend on this, sorry.)

For (2)
In my main environment (Cygwin), the full test suite did not pass even before
my change, so I only ran it as far as it would go. input-port.scm,
output-port.scm and input-output-port.scm passed all of their tests.

I have also noticed that Transcoder should not hold the buffer either. It
causes the same kind of problem when a transcoder is reused. It might be good
to fix that together with this issue. I think TextualPort could hold the buffer
instead of Transcoder.
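
A minimal sketch of that arrangement follows. It assumes the buffer in question is a small character pushback buffer, which the thread does not confirm. StatelessDecoder, TextPort and demonstratePerPortBuffer are hypothetical names, not Mosh's API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Transcoder that holds no buffer at all.
struct StatelessDecoder {
    // Returns the next byte of src as a character, or -1 at end of input.
    int decode(const std::string& src, std::size_t& pos) const {
        return pos < src.size() ? static_cast<unsigned char>(src[pos++]) : -1;
    }
};

// Hypothetical textual port that owns its pushback buffer.
class TextPort {
public:
    TextPort(std::string src, const StatelessDecoder& decoder)
        : src_(std::move(src)), pos_(0), decoder_(decoder) {}

    // Returns a pushed-back character if there is one, else decodes a new one.
    int getChar() {
        if (!pushback_.empty()) {
            const int c = pushback_.back();
            pushback_.pop_back();
            return c;
        }
        return decoder_.decode(src_, pos_);
    }

    void unGetChar(int c) { pushback_.push_back(c); }

private:
    std::string src_;
    std::size_t pos_;
    const StatelessDecoder& decoder_;
    std::vector<int> pushback_;  // per-port state, never shared
};

// Two ports share one decoder; a pushback on one does not affect the other.
inline void demonstratePerPortBuffer() {
    StatelessDecoder decoder;
    TextPort a("ab", decoder);
    TextPort b("xy", decoder);
    const int c = a.getChar();
    a.unGetChar(c);
    assert(b.getChar() == 'x');  // b never sees a's pushed-back character
    assert(a.getChar() == 'a');
}
```

With the buffer stored in the port, reusing one decoder across ports cannot leak a pending character from one port into another.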

I'm looking forward to the next release.

Thanks!

Original comment by [email protected] on 21 May 2012 at 7:05

from mosh-scheme.
