Comments (5)
Original comment by [email protected]
on 20 May 2012 at 12:17
- Changed state: Accepted
from mosh-scheme.
Our current Transcoder implementation keeps state of its own.
See the bool beginningOfInput_ member in Transcoder.h.
That state is what causes this strange behavior.
We should remove this state from Transcoder.
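To illustrate why a flag stored in the transcoder is a problem, here is a minimal sketch. It is not Mosh code: SimplePort, StatefulTranscoder and readSameInputTwice are hypothetical names, only the beginningOfInput_ flag mirrors the member mentioned above, and the decoder handles nothing but big-endian UTF-16 code units. It shows one transcoder shared by two ports that hold identical bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical in-memory byte source; not the real BinaryInputPort.
struct SimplePort {
    explicit SimplePort(std::vector<std::uint8_t> b) : bytes(std::move(b)), pos(0) {}
    std::vector<std::uint8_t> bytes;
    std::size_t pos;
};

// Simplified model of a transcoder that remembers "beginning of input" itself.
class StatefulTranscoder {
public:
    StatefulTranscoder() : beginningOfInput_(true) {}

    // Returns the next big-endian UTF-16 code unit, or -1 at end of input.
    std::int32_t getChar(SimplePort& port) {
        if (beginningOfInput_) {
            beginningOfInput_ = false;  // never reset, even when a new port arrives
            if (startsWithBom(port)) port.pos += 2;
        }
        return readUnit(port);
    }

private:
    static bool startsWithBom(const SimplePort& port) {
        return port.pos + 1 < port.bytes.size() &&
               port.bytes[port.pos] == 0xFE && port.bytes[port.pos + 1] == 0xFF;
    }

    static std::int32_t readUnit(SimplePort& port) {
        assert(port.pos <= port.bytes.size());
        if (port.pos + 1 >= port.bytes.size()) return -1;
        const std::int32_t unit = (port.bytes[port.pos] << 8) | port.bytes[port.pos + 1];
        port.pos += 2;
        return unit;
    }

    bool beginningOfInput_;
};

struct TwoReads {
    std::int32_t first;
    std::int32_t second;
};

// Reads the first character of identical data through one shared transcoder.
TwoReads readSameInputTwice() {
    const std::vector<std::uint8_t> data = {0xFE, 0xFF, 0x00, 0x61};  // BOM + 'a'
    StatefulTranscoder tr;
    SimplePort p1(data);
    SimplePort p2(data);
    TwoReads r;
    r.first = tr.getChar(p1);   // BOM skipped, yields 'a'
    r.second = tr.getChar(p2);  // flag already cleared, BOM comes back as a character
    return r;
}
```

In this model the two reads disagree even though the input is the same, which is the kind of mismatch a reused transcoder can produce.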
We don't have enough time to fix this right now,
so I am marking it as Milestone-Release0.3.0.
Patches are always welcome.
Thanks,
Higepon
Original comment by [email protected]
on 20 May 2012 at 7:34
- Added labels: Milestone-Release0.3.0
from mosh-scheme.
Hi,
I have made a patch against 0.2.7. I'm not sure whether the coding style is
acceptable.
The basic idea is to let BinaryInputPort hold a position that every subclass
maintains. Then, when Transcoder reads a character, it checks whether the
port's position is 0 and looks for a BOM only in that case.
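The idea can be sketched outside Mosh as follows. This is a simplified illustration, not the patch itself: PositionedPort, StatelessTranscoder, peek and readTwoPortsWithOneTranscoder are hypothetical names, only getMark() is borrowed from the patch, and the decoder handles nothing but big-endian UTF-16 code units.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical in-memory port that tracks how many bytes were consumed.
class PositionedPort {
public:
    explicit PositionedPort(std::vector<std::uint8_t> bytes)
        : bytes_(std::move(bytes)), position_(0) {}

    // Number of bytes consumed so far (named after getMark() in the patch).
    std::int64_t getMark() const { return static_cast<std::int64_t>(position_); }

    // Looks at the byte `offset` places ahead without consuming it; -1 past the end.
    int peek(std::size_t offset) const {
        const std::size_t i = position_ + offset;
        return i < bytes_.size() ? bytes_[i] : -1;
    }

    // Consumes one byte and advances the position; -1 at end of input.
    int getU8() {
        assert(position_ <= bytes_.size());
        if (position_ == bytes_.size()) return -1;
        return bytes_[position_++];
    }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t position_;
};

// Transcoder without any per-input state: it asks the port where it is.
class StatelessTranscoder {
public:
    // Returns the next big-endian UTF-16 code unit, or -1 at end of input.
    std::int32_t getChar(PositionedPort& port) const {
        if (port.getMark() == 0 && port.peek(0) == 0xFE && port.peek(1) == 0xFF) {
            port.getU8();  // skip the two BOM bytes
            port.getU8();
        }
        const int hi = port.getU8();
        const int lo = port.getU8();
        if (hi < 0 || lo < 0) return -1;
        return (hi << 8) | lo;
    }
};

struct FirstChars {
    std::int32_t first;
    std::int32_t second;
};

// Reads the first character of identical data through one shared transcoder.
FirstChars readTwoPortsWithOneTranscoder() {
    const std::vector<std::uint8_t> data = {0xFE, 0xFF, 0x00, 0x61};  // BOM + 'a'
    StatelessTranscoder tr;
    PositionedPort p1(data);
    PositionedPort p2(data);
    FirstChars r;
    r.first = tr.getChar(p1);
    r.second = tr.getChar(p2);
    return r;
}
```

Because the "are we at the start?" question is answered by each port, both reads give the same character, which is what the test added to tests/input-port.scm checks.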
diff -Naru mosh-0.2.7.org/src/BinaryInputPort.h mosh-0.2.7/src/BinaryInputPort.h
--- mosh-0.2.7.org/src/BinaryInputPort.h 2012-05-20 12:55:44.578125000 +0200
+++ mosh-0.2.7/src/BinaryInputPort.h 2012-05-20 13:29:36.593750000 +0200
@@ -41,12 +41,24 @@
class BinaryInputPort : virtual public BinaryPort // for closing port on destructors, we extend gc_cleanup
{
public:
+ BinaryInputPort() : position_(0) {}
virtual ~BinaryInputPort() {}
virtual int getU8() = 0;
virtual int lookaheadU8() = 0;
virtual int64_t readBytes(uint8_t* buf, int64_t reqSize, bool& isErrorOccured) = 0;
virtual int64_t readSome(uint8_t** buf, bool& isErrorOccured) = 0;
virtual int64_t readAll(uint8_t** buf, bool& isErrorOccured) = 0;
+
+ /* Added by Takashi Kato to resolve Transcoder state problem */
+ /**
+ Returns current position of this binary input port.
+ */
+ int64_t getMark() { return position_; }
+protected:
+ void setMark(int64_t pos) { position_ = pos; }
+ void addMark(int64_t offset) { position_ += offset; }
+ /* for FileBinaryInputPort, this must be protected */
+ int64_t position_;
};
} // namespace scheme
diff -Naru mosh-0.2.7.org/src/BufferedFileBinaryInputOutputPort.cpp mosh-0.2.7/src/BufferedFileBinaryInputOutputPort.cpp
--- mosh-0.2.7.org/src/BufferedFileBinaryInputOutputPort.cpp 2012-05-20 12:55:44.625000000 +0200
+++ mosh-0.2.7/src/BufferedFileBinaryInputOutputPort.cpp 2012-05-20 13:20:38.765625000 +0200
@@ -72,7 +72,6 @@
fileName_(file),
buffer_(NULL),
isDirty_(false),
- position_(0),
isClosed_(false),
isPseudoClosed_(false),
bufferSize_(0),
diff -Naru mosh-0.2.7.org/src/BufferedFileBinaryInputOutputPort.h mosh-0.2.7/src/BufferedFileBinaryInputOutputPort.h
--- mosh-0.2.7.org/src/BufferedFileBinaryInputOutputPort.h 2012-05-20 12:55:44.625000000 +0200
+++ mosh-0.2.7/src/BufferedFileBinaryInputOutputPort.h 2012-05-20 13:21:22.156250000 +0200
@@ -92,7 +92,6 @@
ucs4string fileName_;
uint8_t* buffer_;
bool isDirty_;
- int64_t position_;
bool isClosed_;
bool isPseudoClosed_;
int64_t bufferSize_;
diff -Naru mosh-0.2.7.org/src/BufferedFileBinaryInputPort.cpp mosh-0.2.7/src/BufferedFileBinaryInputPort.cpp
--- mosh-0.2.7.org/src/BufferedFileBinaryInputPort.cpp 2012-05-20 12:55:44.640625000 +0200
+++ mosh-0.2.7/src/BufferedFileBinaryInputPort.cpp 2012-05-20 13:21:00.437500000 +0200
@@ -50,18 +50,18 @@
using namespace scheme;
-BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(File* file) : file_(file), fileName_(UC("<unknown file>")), isClosed_(false), isPseudoClosed_(false), position_(0)
+BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(File* file) : file_(file), fileName_(UC("<unknown file>")), isClosed_(false), isPseudoClosed_(false)
{
initializeBuffer();
}
-BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(ucs4string file) : file_(new File), fileName_(file), isClosed_(false), isPseudoClosed_(false), position_(0)
+BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(ucs4string file) : file_(new File), fileName_(file), isClosed_(false), isPseudoClosed_(false)
{
file_->open(fileName_, File::Read);
initializeBuffer();
}
-BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(const char* file) : file_(new File), isClosed_(false), isPseudoClosed_(false), position_(0)
+BufferedFileBinaryInputPort::BufferedFileBinaryInputPort(const char* file) : file_(new File), isClosed_(false), isPseudoClosed_(false)
{
fileName_ = ucs4string::from_c_str(file);
file_->open(fileName_, File::Read);
diff -Naru mosh-0.2.7.org/src/BufferedFileBinaryInputPort.h mosh-0.2.7/src/BufferedFileBinaryInputPort.h
--- mosh-0.2.7.org/src/BufferedFileBinaryInputPort.h 2012-05-20 12:55:44.640625000 +0200
+++ mosh-0.2.7/src/BufferedFileBinaryInputPort.h 2012-05-20 13:20:18.859375000 +0200
@@ -83,7 +83,6 @@
uint8_t* buffer_;
int64_t bufLen_;
int64_t bufIdx_;
- int64_t position_;
};
} // namespace scheme
diff -Naru mosh-0.2.7.org/src/CustomBinaryInputOutputPort.cpp mosh-0.2.7/src/CustomBinaryInputOutputPort.cpp
--- mosh-0.2.7.org/src/CustomBinaryInputOutputPort.cpp 2012-05-20 12:55:44.750000000 +0200
+++ mosh-0.2.7/src/CustomBinaryInputOutputPort.cpp 2012-05-20 13:39:22.531250000 +0200
@@ -109,6 +109,7 @@
if (hasAheadU8()) {
int c = aheadU8_;
aheadU8_ = EOF;
+ addMark(1);
return c;
}
@@ -117,6 +118,7 @@
const Object count = Object::makeFixnum(1);
const Object result = theVM_->callClosure3(readProc_, bv, start, count);
MOSH_ASSERT(result.isFixnum());
+ addMark(1);
if (0 == result.toFixnum()) {
return EOF;
}
@@ -219,6 +221,7 @@
// we need to reset the aheadU8_
aheadU8_ = EOF;
theVM_->callClosure1(setPositionProc_, Bignum::makeIntegerFromS64(position));
+ setMark(position);
return true;
}
diff -Naru mosh-0.2.7.org/src/CustomBinaryInputPort.cpp mosh-0.2.7/src/CustomBinaryInputPort.cpp
--- mosh-0.2.7.org/src/CustomBinaryInputPort.cpp 2012-05-20 12:55:44.750000000 +0200
+++ mosh-0.2.7/src/CustomBinaryInputPort.cpp 2012-05-20 13:39:18.375000000 +0200
@@ -101,6 +101,7 @@
if (hasAheadU8()) {
int c = aheadU8_;
aheadU8_ = EOF;
+ addMark(1);
return c;
}
@@ -109,6 +110,7 @@
const Object count = Object::makeFixnum(1);
const Object result = theVM_->callClosure3(readProc_, bv, start, count);
MOSH_ASSERT(result.isFixnum());
+ addMark(1);
if (0 == result.toFixnum()) {
return EOF;
}
@@ -211,5 +213,6 @@
// we need to reset the aheadU8_
aheadU8_ = EOF;
theVM_->callClosure1(setPositionProc_, Bignum::makeIntegerFromS64(position));
+ setMark(position);
return true;
}
diff -Naru mosh-0.2.7.org/src/FileBinaryInputOutputPort.cpp mosh-0.2.7/src/FileBinaryInputOutputPort.cpp
--- mosh-0.2.7.org/src/FileBinaryInputOutputPort.cpp 2012-05-20 12:55:45.000000000 +0200
+++ mosh-0.2.7/src/FileBinaryInputOutputPort.cpp 2012-05-20 13:53:57.046875000 +0200
@@ -102,6 +102,7 @@
bool FileBinaryInputOutputPort::setPosition(int64_t position)
{
const int64_t currentOffset = file_->seek(position);
+ position_ = currentOffset;
if (position == currentOffset) {
return true;
} else {
@@ -139,6 +140,7 @@
if (0 == file_->read(&c, 1)) {
return EOF;
} else {
+ position_++; // to avoid reading multiple BOM
return c;
}
}
@@ -170,6 +172,7 @@
int64_t FileBinaryInputOutputPort::readBytes(uint8_t* buf, int64_t reqSize, bool& isErrorOccured)
{
const int64_t readSize = file_->read(buf, reqSize);
+ position_ += readSize;
return readSize;
}
@@ -185,6 +188,7 @@
uint8_t* dest = allocatePointerFreeU8Array(restSize);
const int64_t readSize = file_->read(dest, restSize);
*buf = dest;
+ position_ = readSize;
return readSize;
}
diff -Naru mosh-0.2.7.org/src/FileBinaryInputPort.cpp mosh-0.2.7/src/FileBinaryInputPort.cpp
--- mosh-0.2.7.org/src/FileBinaryInputPort.cpp 2012-05-20 12:55:45.000000000 +0200
+++ mosh-0.2.7/src/FileBinaryInputPort.cpp 2012-05-20 13:18:36.031250000 +0200
@@ -55,16 +55,16 @@
using namespace scheme;
-FileBinaryInputPort::FileBinaryInputPort(File* file) : file_(file), fileName_(UC("<unknown file>")), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF), position_(0)
+FileBinaryInputPort::FileBinaryInputPort(File* file) : file_(file), fileName_(UC("<unknown file>")), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF)
{
}
-FileBinaryInputPort::FileBinaryInputPort(const ucs4string& file) : file_(new File), fileName_(file), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF), position_(0)
+FileBinaryInputPort::FileBinaryInputPort(const ucs4string& file) : file_(new File), fileName_(file), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF)
{
file_->open(file, File::Read);
}
-FileBinaryInputPort::FileBinaryInputPort(const char* file) : file_(new File), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF), position_(0)
+FileBinaryInputPort::FileBinaryInputPort(const char* file) : file_(new File), isClosed_(false), isPseudoClosed_(false), aheadU8_(EOF)
{
fileName_ = ucs4string::from_c_str(file);
file_->open(fileName_, File::Read);
diff -Naru mosh-0.2.7.org/src/FileBinaryInputPort.h mosh-0.2.7/src/FileBinaryInputPort.h
--- mosh-0.2.7.org/src/FileBinaryInputPort.h 2012-05-20 12:55:45.015625000 +0200
+++ mosh-0.2.7/src/FileBinaryInputPort.h 2012-05-20 12:58:12.703125000 +0200
@@ -76,7 +76,6 @@
bool isClosed_;
bool isPseudoClosed_;
int aheadU8_;
- int64_t position_;
};
} // namespace scheme
diff -Naru mosh-0.2.7.org/src/SocketBinaryInputOutputPort.cpp mosh-0.2.7/src/SocketBinaryInputOutputPort.cpp
--- mosh-0.2.7.org/src/SocketBinaryInputOutputPort.cpp 2012-05-20 12:55:46.031250000 +0200
+++ mosh-0.2.7/src/SocketBinaryInputOutputPort.cpp 2012-05-20 13:08:38.390625000 +0200
@@ -135,6 +135,7 @@
} else {
uint8_t c;
const int ret = socket_->receive(&c, 1, 0);
+ addMark(ret);
if (0 == ret) {
return EOF;
} else if (-1 == ret) {
@@ -165,6 +166,7 @@
throwIOError2(IOError::READ, socket_->getLastErrorMessage());
return -1;
}
+ addMark(readSize);
return readSize;
}
@@ -186,6 +188,7 @@
}
}
}
+ addMark(data.size());
uint8_t* dest = allocatePointerFreeU8Array(data.size());
for (size_t i = 0; i < data.size(); i++) {
dest[i] = data[i];
diff -Naru mosh-0.2.7.org/src/stamp-h1 mosh-0.2.7/src/stamp-h1
--- mosh-0.2.7.org/src/stamp-h1 1970-01-01 01:00:00.000000000 +0100
+++ mosh-0.2.7/src/stamp-h1 2012-05-20 13:02:44.343750000 +0200
@@ -0,0 +1 @@
+timestamp for src/config.h
diff -Naru mosh-0.2.7.org/src/Transcoder.cpp mosh-0.2.7/src/Transcoder.cpp
--- mosh-0.2.7.org/src/Transcoder.cpp 2012-05-20 12:55:46.421875000 +0200
+++ mosh-0.2.7/src/Transcoder.cpp 2012-05-20 13:14:27.953125000 +0200
@@ -35,6 +35,7 @@
#include "Object-inl.h"
#include "SString.h"
#include "Symbol.h"
+#include "BinaryInputPort.h"
#include "BinaryOutputPort.h"
#include "Transcoder.h"
#include "UTF8Codec.h"
@@ -42,7 +43,6 @@
using namespace scheme;
Transcoder::Transcoder(Codec* codec) :
- beginningOfInput_(true),
codec_(codec),
eolStyle_(EolStyle(LF)), // LF means no convert.
errorHandlingMode_(ErrorHandlingMode(REPLACE_ERROR)),
@@ -51,7 +51,6 @@
}
Transcoder::Transcoder(Codec* codec, EolStyle eolStyle) :
- beginningOfInput_(true),
codec_(codec),
eolStyle_(eolStyle),
errorHandlingMode_(ErrorHandlingMode(REPLACE_ERROR)),
@@ -60,7 +59,6 @@
}
Transcoder::Transcoder(Codec* codec, EolStyle eolStyle, enum ErrorHandlingMode errorHandlingMode) :
- beginningOfInput_(true),
codec_(codec),
eolStyle_(eolStyle),
errorHandlingMode_(errorHandlingMode),
@@ -210,8 +208,8 @@
ucs4char Transcoder::getCharInternal(BinaryInputPort* port)
{
// In the beginning of input, we have to check the BOM.
- if (beginningOfInput_) {
- beginningOfInput_ = false;
+ /* Transcoder must not have the state to check port's beginning. */
+ if (port->getMark() == 0) {
const bool checkBOM = true;
return codec_->getChar(port, errorHandlingMode_, checkBOM);
}
diff -Naru mosh-0.2.7.org/src/Transcoder.h mosh-0.2.7/src/Transcoder.h
--- mosh-0.2.7.org/src/Transcoder.h 2012-05-20 12:55:46.437500000 +0200
+++ mosh-0.2.7/src/Transcoder.h 2012-05-20 13:11:52.656250000 +0200
@@ -70,7 +70,6 @@
static Object eolStyleToSymbol(const enum EolStyle eolstyle);
static Object errorHandlingModeToSymbol(enum ErrorHandlingMode errorHandlingMode);
- bool beginningOfInput_;
Codec* codec_;
enum EolStyle eolStyle_;
enum ErrorHandlingMode errorHandlingMode_;
diff -Naru mosh-0.2.7.org/tests/input-port.scm mosh-0.2.7/tests/input-port.scm
--- mosh-0.2.7.org/tests/input-port.scm 2012-05-20 12:55:46.843750000 +0200
+++ mosh-0.2.7/tests/input-port.scm 2012-05-20 13:42:39.265625000 +0200
@@ -287,4 +287,13 @@
(test-equal "" (get-string-n i2 0))
(test-equal "" (get-string-n i3 0)))
+;; For UTF16 BOM issue. Added by Takashi Kato
+(let ((tr (make-transcoder (utf-16-codec))))
+ (test-equal (call-with-port
+ (open-file-input-port "./tests/utf16.txt" (file-options) 'block tr)
+ (lambda (in) (get-char in)))
+ (call-with-port
+ (open-file-input-port "./tests/utf16.txt" (file-options) 'block tr)
+ (lambda (in) (get-char in)))))
+
(test-results)
Thanks,
Original comment by [email protected]
on 20 May 2012 at 12:11
from mosh-scheme.
Hi, thank you for your patch.
Your approach sounds good to me.
Two requests and one question.
(1) Can you rename the methods so that they reveal their intent better?
setMark() -> setPosition()
getMark() -> getPosition()
addMark() -> forwardPosition()
(2) Please make sure it passes all port related tests.
(3) Do you use GitHub? If so, can you send a pull request?
If not, just attach a new patch here.
Cheers
Original comment by [email protected]
on 21 May 2012 at 11:42
from mosh-scheme.
Hi Higepon,
Sorry, I don't have a GitHub account and don't intend to create one right now.
For (1)
I chose those names because methods with the names you suggest already exist
and are used for their real purpose, so I didn't want to disturb working code.
(Plus, I don't have much time to spend on it either, sorry.)
For (2)
In my main environment (Cygwin), I could not pass all the tests even before my
change, so I only checked as far as the run could continue. input-port.scm,
output-port.scm and input-output-port.scm passed all their tests.
I have also noticed that Transcoder should not own the buffer either. It
causes a similar problem when a transcoder is reused. It might be good to
resolve that together with this issue. I think TextualPort could hold the
buffer instead of Transcoder.
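A rough sketch of that suggestion, under loud assumptions: the thread does not say what the buffer holds, so this assumes it is a pushback (unget) buffer for characters. ByteTranscoder, TextualInputPort, ungetChar and readFromSecondPortAfterUnget are hypothetical names, not Mosh's classes, and the decoder treats one byte as one character for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stateless decoder: one byte is one character.
class ByteTranscoder {
public:
    // Decodes the character at `pos` and advances it; -1 at end of input.
    std::int32_t decode(const std::vector<std::uint8_t>& src, std::size_t& pos) const {
        if (pos >= src.size()) return -1;
        return src[pos++];
    }
};

// Hypothetical textual port that owns the pushback buffer itself.
class TextualInputPort {
public:
    TextualInputPort(const ByteTranscoder& transcoder, std::vector<std::uint8_t> bytes)
        : transcoder_(transcoder), bytes_(std::move(bytes)), pos_(0) {}

    // Returns pushed-back characters first, then decodes fresh input.
    std::int32_t getChar() {
        if (!pushback_.empty()) {
            const std::int32_t c = pushback_.back();
            pushback_.pop_back();
            return c;
        }
        return transcoder_.decode(bytes_, pos_);
    }

    // Pushes a character back so the next getChar() on this port returns it.
    void ungetChar(std::int32_t c) {
        assert(c >= 0);  // end-of-input cannot be pushed back
        pushback_.push_back(c);
    }

private:
    const ByteTranscoder& transcoder_;    // shared, holds no per-port data
    std::vector<std::uint8_t> bytes_;
    std::size_t pos_;
    std::vector<std::int32_t> pushback_;  // per-port state lives here
};

// Pushes a character back on one port, then reads from another port
// that shares the same transcoder.
std::int32_t readFromSecondPortAfterUnget() {
    ByteTranscoder tr;
    TextualInputPort a(tr, {0x61, 0x62});  // "ab"
    TextualInputPort b(tr, {0x78, 0x79});  // "xy"
    const std::int32_t c = a.getChar();
    a.ungetChar(c);
    return b.getChar();
}
```

With the buffer in the port, a character pushed back on one port cannot surface on another port that happens to share the transcoder.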
I'm looking forward to the next release.
Thanks!
Original comment by [email protected]
on 21 May 2012 at 7:05
from mosh-scheme.