Comments (22)
And in that case then maybe it makes sense to keep the internal representation LF-terminated, because it keeps things a bit simpler and easier to work with/less accident prone.
That's fine. Also a smaller patch = less risk.
Sorry this is taking so long, it's unfortunate that we got zero warning about the problem, but I'd rather take a bit more time to get the fix as right as possible.
There is no need to be sorry and there is no need to rush! Thanks a lot for your work.
from chasquid.
I went through the tests once more. They are suitable to cover the tricky cases. I especially appreciate the fact that there are integration tests which exercise a combination of error states (message too large plus invalid newline). Good thinking here.
Thanks a lot @albertito for the quick and at the same time thorough update.
from chasquid.
Oh! Same time different outcome. I have to confess, that my observations are based purely on looking at docs and issues, not at the code itself.
from chasquid.
Looks like most of the parsing is done by net/textproto DotReader. I only found one relevant issue in the Go tracker: golang/go#9781. It was closed with the following remark:
it looks like we're following the strict rules from RFC 5321. I'm going to close this without making any changes. The documentation seems accurate too.
If this is still the case, then DotReader should be safe.
Thanks for finding that!
I think the messages and examples from the issue are not consistent with the message closing it: one person is saying "RFC says this behaviour is invalid, and here are some examples", and then it is closed with "we are compliant to the RFC so closing" (obviously paraphrasing both).
So I think the implementations are too lax and don't enforce the "\r or \n must only appear together", and that original issue was closed incorrectly.
I also did some preliminary integration tests and they seem to confirm this too.
But I will look a bit more in depth after lunch, it is very possible I'm missing something that makes this okay.
from chasquid.
Thanks for tackling this so quick! I'm glad to be using your software :)
from chasquid.
It keeps an LF-terminated internal representation, so the patch is less intrusive, and there are fewer chances of internal problems or mixed termination styles. It enforces CRLF-terminated lines on external endpoints: the SMTP courier already had them, and this patch adds them to the MDA courier.
Comparing the next branch with crlf-everywhere, it is quite obvious that keeping the internal representation LF-only is less risky. If that internal representation should be changed in the future, then it likely would make sense to introduce a new data type for guaranteed-to-be-CRLF-terminated text and use that throughout the project. That way inconsistencies could be detected at compile time.
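A minimal sketch of what such a data type could look like. The names `CRLFText`, `FromLF` and `send` are hypothetical and do not exist in chasquid. The type is a struct rather than a named string, because an untyped string constant would still be assignable to a named string type and slip past the compiler.

```go
package main

import (
	"fmt"
	"strings"
)

// CRLFText is a hypothetical type for text whose lines all end in CRLF.
// Its field is unexported, so code in other packages can only obtain a
// value through FromLF.
type CRLFText struct{ s string }

// FromLF converts text with LF or CRLF line endings into CRLF-terminated
// text. Assumption for this sketch: a lone CR is left untouched.
func FromLF(s string) CRLFText {
	s = strings.ReplaceAll(s, "\r\n", "\n")
	return CRLFText{strings.ReplaceAll(s, "\n", "\r\n")}
}

// String returns the underlying CRLF-terminated text.
func (t CRLFText) String() string { return t.s }

// send stands in for an external endpoint. It only accepts CRLFText, so
// calling send("a\n") with a plain string does not compile.
func send(t CRLFText) string { return t.String() }

func main() {
	fmt.Printf("%q\n", send(FromLF("Subject: hi\n\nbody\n")))
}
```

With a type like this, any function that writes to the network can require `CRLFText` in its signature, and forgetting the conversion becomes a build failure instead of a runtime surprise.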
I went through the patch in next again and left some comments. I'm neither a professional in email protocols nor in golang though, so take them with a grain of salt.
from chasquid.
Everything looks fine to me!
The following is not critical at all and I didn't see any bugs in the code, just thought I'd write it down as "might be nice for the future".
Generally errors could be handled more idiomatically, which would make the code more robust against refactoring and mistakes.
Here's a couple specific points:
- Use errors.Is(err, x) instead of err == x so that introducing a new abstraction that wraps errors doesn't break those checks.
- When using fmt.Errorf to wrap an error, %w should be preferred to %v as it preserves the actual error and not just a string. That way errors.Is/As/Unwrap continue working.
  I know linters can catch this, so maybe worth looking into (golangci is what most use, maybe go vet is enough).
- Avoid returning a special value, such as code < 0 in this patch, as it's easy to miss. Having an explicit error return value instead forces the caller to check it. Using an error also enables returning data.
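Example for the code case again in pseudo-Go:

    type response struct {
        Code int
        Msg  string
    }

    // responseError carries the response to send along with the cause.
    type responseError struct {
        e error
        r response
    }

    // Error and Unwrap make errors.Is/As/Unwrap work on responseError.
    func (e *responseError) Error() string { return e.e.Error() }
    func (e *responseError) Unwrap() error { return e.e }

    func (c *Conn) Handle() error {
        r, err := c.DATA()
        if err != nil {
            var rErr *responseError
            if errors.As(err, &rErr) {
                c.writeResponse(rErr.r)
            }
            return err
        }
        c.writeResponse(r)
        return nil
    }

    func (c *Conn) DATA() (response, error) {
        data, err := read()
        if err != nil {
            return response{}, &responseError{err, response{521, "..."}}
        }
        _ = data // processing of the message is elided
        return response{250, "..."}, nil
    }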
from chasquid.
Everything looks fine to me!
The following is not critical at all and I didn't see any bugs in the code, just thought I'd write it down as "might be nice for the future". Generally errors could be handled more idiomatically, which would make the code more robust against refactoring and mistakes. Here's a couple specific points:
Thanks for taking the time to think about this and write it down!
- Use errors.Is(err, x) instead of err == x so that introducing a new abstraction that wraps errors doesn't break those checks
Thanks, this is a good point. A lot of the chasquid code predates errors.Is and there are many cases where it can be used, and doing a review pass is a great idea.
I think sometimes in tightly coupled code, == can be more practical (since it indicates "this is a specific error"), but that is a very narrow situation and in general I agree with you.
In this particular patch, the error returned could include e.g. the line number and the type of error so it can be added to the trace by the DATA function.
I didn't want to do this right now as it would add more complexity to this patch that's already fairly big and sensitive, but it's definitely something to add later.
- When using fmt.Errorf to wrap an error, %w should be preferred to %v as it preserves the actual error and not just a string. That way errors.Is/As/Unwrap continue working.
  I know linters can catch this, so maybe worth looking into (golangci is what most use, maybe go vet is enough).
This is just an artifact of the code predating %w, and I agree with you. It's something to clean up/bring more up to date alongside the above.
- Avoid returning a special value, such as code < 0 in this patch, as it's easy to miss. Having an explicit error return value instead forces the caller to check it. Using an error also enables returning data.
Totally! This code < 0 thing is a big hack!
Until this point, I thought returning the (code, msg) pair was practical and clear enough, and a custom type wouldn't have added much value.
But now that we want to break the connection this way, I think it definitely merits a custom type in the style you suggest.
I don't want to add that now because it means again a fair amount of complexity on an already intrusive change. If we had time, I would do a patch introducing that type first, and then use it in this one.
But this issue is fairly pressing so I'd rather do the little hack, and then clean it up with more time afterwards.
I have added a couple of TODOs based on your suggestions, so I don't forget :)
Thanks again!
from chasquid.
Looks like most of the parsing is done by net/textproto DotReader. I only found one relevant issue in the Go tracker: golang/go#9781. It was closed with the following remark:
it looks like we're following the strict rules from RFC 5321. I'm going to close this without making any changes. The documentation seems accurate too.
If this is still the case, then DotReader should be safe.
from chasquid.
I think chasquid is vulnerable to the attack because neither net/textproto.Reader.DotReader nor net/mail.ReadMessage enforces the "CR and LF must only appear together" rule.
Some (not polished and not comprehensive) examples: https://go.dev/play/p/pZNkk-FMwp-
This is based on a quick initial analysis; it is possible I'm missing something. I will continue looking at this today.
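As a rough illustration of the DotReader part of this claim (separate from the playground link above), the sketch below feeds the standard library a section whose first terminator is a bare "\n.\n". The helper name `readDot` is made up for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/textproto"
	"strings"
)

// readDot returns the first dot-terminated section of input, as decoded
// by the standard library's DotReader.
func readDot(input string) (string, error) {
	r := textproto.NewReader(bufio.NewReader(strings.NewReader(input)))
	body, err := io.ReadAll(r.DotReader())
	return string(body), err
}

func main() {
	// The first terminator uses bare LF instead of CRLF.
	body, err := readDot("first\n.\nsecond\r\n.\r\n")
	fmt.Printf("%q %v\n", body, err) // "first\n" <nil>
}
```

DotReader ends the section at the bare-LF dot line without reporting an error, so anything after it would be interpreted as further input on the connection.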
from chasquid.
Just to keep folks updated: I've written a bunch of test cases to confirm the issue, and now I'm working on a wrapper that enforces strict \r\n in messages, as per the RFC.
That way we can continue to use the standard library for parsing dot-terminated sections and email messages, but still be strict about newlines in them.
While I don't necessarily love the approach, it is practical, and I hope it is robust enough for now.
I'll post another update once I have something more concrete.
from chasquid.
Commit 606c392 (in the next branch) has a preliminary fix.
It still needs more testing, which I will do tomorrow.
from chasquid.
Thanks a lot @albertito, commit 606c392 looks promising. While reading through the new readUntilDot func, I noticed the following:
1. The new readUntilDot func is much simpler than the stdlib DotReader.Read func. I definitely have less trouble following the code and the state flow.
2. The new readUntilDot func doesn't consume CR characters, it returns any CR-LF sequence unmodified. The stdlib DotReader.Read func does consume CR characters. It only returns LF line endings.
3. The new readUntilDot func does consume a leading period (dot stuffing). Same with the stdlib DotReader.Read func.
If observation 2. is correct, then the internal representation of a message body changes with the fix. Will that have an impact on post-data hook implementations? Will it have an impact on mda-lmtp?
from chasquid.
If observation 2. is correct, then the internal representation of a message body changes with the fix. Will that have an impact on post-data hook implementations? Will it have an impact on mda-lmtp?
I assume that the tools invoked from the post-data hook will expect to be operating on the body of a MIME message. RFC 2046 section 4.1.1 specifies that:
The canonical form of any MIME "text" subtype MUST always represent a line break as a CRLF sequence.
Passing the email body to post-data with CRLF line endings is actually better aligned with standards.
from chasquid.
If observation 2. is correct, then the internal representation of a message body changes with the fix. Will that have an impact on post-data hook implementations? Will it have an impact on mda-lmtp?
mda-lmtp uses stdlib net/textproto DotWriter. That caters for content in both forms (LF as well as CRLF). So that should be good as well.
from chasquid.
Hm, I think addReceivedHeader and envelope.AddHeader might need some attention as well. It looks like these funcs are adding LF-terminated headers instead of CRLF-terminated ones. If that is true, then chasquid instances will have trouble talking to each other after the fix (and to postfix as well, as soon as they roll out smtpd_forbid_bare_newline). This is because the receiver will drop the connection as soon as it encounters a solitary LF.
from chasquid.
Thanks a lot @albertito, commit 606c392 looks promising. While reading through the new readUntilDot func, I noticed the following:
1. The new readUntilDot func is much simpler than the stdlib DotReader.Read func. I definitely have less trouble following the code and the state flow.
Thank you! Some parts are a bit hacky, and I think it's likely that it can be simplified/made more readable, but that can be iterated on later.
2. The new readUntilDot func doesn't consume CR characters, it returns any CR-LF sequence unmodified. The stdlib DotReader.Read func does consume CR characters. It only returns LF line endings.
If observation 2. is correct, then the internal representation of a message body changes with the fix. Will that have an impact on post-data hook implementations? Will it have an impact on mda-lmtp?
This is correct. I thought about keeping the old behaviour, but then if other tools (like MDAs) get more strict in the future, then it's likely we will need to end up doing it anyway.
So my current thinking is to just bite the bullet and update it.
The impact is smaller than it may seem, because both SMTP courier and mda-lmtp already normalize newlines to \r\n, so there shouldn't be externally observable behaviour differences.
Hooks, and other MDAs, will observe a difference in newlines. While the new behaviour is more compliant (as you noted in the other comment), it is still a significant change.
3. The new readUntilDot func does consume a leading period (dot stuffing). Same with the stdlib DotReader.Read func.
Yeah, that is important for preserving the message structure. While this behaviour was tested, the test was a bit on the sides, so now I've added more explicit end-to-end tests for the dot stuffing just in case.
Hm, I think addReceivedHeader and envelope.AddHeader might need some attention as well. It looks like these funcs are adding LF-terminated headers instead of CRLF-terminated ones.
This is true, and also DSNs are internally \n-terminated which now creates some inconsistencies. I am working on clearing these cases too.
If that is true, then chasquid instances will have trouble talking to each other after the fix (and to postfix as well, as soon as they roll out smtpd_forbid_bare_newline). This is because the receiver will drop the connection as soon as it encounters a solitary LF.
That's not the case: even if we internally still have some \n-terminated lines, the SMTP courier will normalize the newlines; because of that, there are no issues communicating with other strict servers (chasquid or postfix).
This is well covered by the integration tests (which have chasquid talk to each other).
I will post an updated patch that improves testing, although there are no significant code changes from the previous one.
My next step is to clear up the known internal \n-terminated lines just to reduce potential friction/confusion. I will look at the hook and the local MDA courier to see if it's worth doing the newline normalization there too, although I am not sure it will be needed.
Thank you!
from chasquid.
My next step is to clear up the known internal \n-terminated lines just to reduce potential friction/confusion. I will look at the hook and the local MDA courier to see if it's worth doing the newline normalization there too, although I am not sure it will be needed.
FYI, I've done a pass of "internal representation uses CRLF-terminated lines" (fixing up DSN generation and those headers), and it works fine.
However, after thinking about this for a bit more, I think doing a CRLF normalization at endpoints with external systems (SMTP courier (preexisting) and MDA courier (missing)) is still the right call.
And in that case then maybe it makes sense to keep the internal representation LF-terminated, because it keeps things a bit simpler and easier to work with/less accident prone.
I'm working on an alternative set of patches so we can compare the two approaches. I will post references once I have something in reasonable shape.
Sorry this is taking so long, it's unfortunate that we got zero warning about the problem, but I'd rather take a bit more time to get the fix as right as possible.
from chasquid.
The next branch has the latest iteration of the fixes.
It keeps an LF-terminated internal representation, so the patch is less intrusive, and there are fewer chances of internal problems or mixed termination styles. It enforces CRLF-terminated lines on external endpoints: the SMTP courier already had them, and this patch adds them to the MDA courier.
Doing the enforcement at endpoints should help prevent any future issues arising from potentially inconsistent internal newlines.
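A sketch of what normalization at an endpoint could look like, assuming the message is held as a byte slice. The function name `toCRLF` is hypothetical, and this is not the code from the next branch.

```go
package main

import (
	"bytes"
	"fmt"
)

// toCRLF returns a copy of msg in which every line ends in CRLF.
// Existing CRLF pairs are kept as they are, and each bare LF becomes CRLF.
func toCRLF(msg []byte) []byte {
	out := make([]byte, 0, len(msg)+bytes.Count(msg, []byte("\n")))
	for i, c := range msg {
		if c == '\n' && (i == 0 || msg[i-1] != '\r') {
			out = append(out, '\r')
		}
		out = append(out, c)
	}
	return out
}

func main() {
	// Mixed input: two LF-terminated lines and one CRLF-terminated line.
	fmt.Printf("%q\n", toCRLF([]byte("a\nb\r\nc\n"))) // "a\r\nb\r\nc\r\n"
}
```

Because the function is idempotent, it is safe to apply right before writing to the external system regardless of which line endings the internal code happened to produce.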
Hooks continue to get LF-terminated messages, so there are fewer chances of regressions in preexisting deployments. Also, output from hooks (which is usually LF-terminated) continues to be handled well.
I'm going to run this version in my development server for a bit, do some external-world tests; and report back.
For comparison, I uploaded a new branch crlf-everywhere which does the internal conversion to CRLF, and skips the enforcement in the MDA courier. I'll eventually remove it, but it can be useful if anyone wants to compare the two approaches and have an opinion about it.
Thank you!
from chasquid.
Thanks @znerol and @ThinkChaos for your comments and thorough review!
I've responded to the comments and uploaded an amended patch 34d52ca.
I really appreciate the review, so if you have a chance, please take another look and we can keep iterating on this.
Thank you!
from chasquid.
Thanks again for the detailed and thorough reviews!
Things look stable, and I haven't identified any issue in my (unfortunately short) tests in the wild, so I am preparing a new release.
from chasquid.
chasquid v1.13 has been released with the fix.
Thanks again for all the help, discussions and reviews through fixing this!
from chasquid.