email-reply-parser's Introduction

Email Reply Parser for Python

A port of GitHub's Email Reply Parser library, by the fine folks at Zapier.

Summary

Email Reply Parser makes it easy to grab only the last reply in an ongoing email thread.

Say you'd like to parse out a user's response to one of your transactional email messages:

Yes that is fine, I will email you in the morning.

On Fri, Nov 16, 2012 at 1:48 PM, Zapier <[email protected]> wrote:

> Our support team just commented on your open Ticket:
> "Hi Royce, can we chat in the morning about your question?"

Email clients format replies differently, which makes parsing a pain. The test suite covers many of these cases. For the message above, the parsed reply is:

Yes that is fine, I will email you in the morning.
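To make the idea concrete, here is a deliberately naive sketch of "keep only the last reply" for the message above. It is not the library's algorithm, only an illustration that handles a single-line "On ... wrote:" header and ">" quote markers. The function name naive_last_reply and the address support@example.com are placeholders invented for this example.

```python
import re

# Matches a single-line "On <date>, <sender> wrote:" quote header.
QUOTE_HEADER = re.compile(r"^On\s.+wrote:\s*$")


def naive_last_reply(text):
    """Keep the lines above the first quote header, skipping '>' lines."""
    kept = []
    for line in text.splitlines():
        if QUOTE_HEADER.match(line):
            break  # everything below belongs to the earlier message
        if line.startswith(">"):
            continue  # quoted line
        kept.append(line)
    return "\n".join(kept).strip()


email_message = (
    "Yes that is fine, I will email you in the morning.\n"
    "\n"
    "On Fri, Nov 16, 2012 at 1:48 PM, Zapier <support@example.com> wrote:\n"
    "\n"
    "> Our support team just commented on your open Ticket:\n"
    '> "Hi Royce, can we chat in the morning about your question?"\n'
)

print(naive_last_reply(email_message))
```

Real mail clients wrap headers, add signatures and omit delimiters, which is why a tested parser is preferable to a few lines like these.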

Installation

Install with pip:

pip install email_reply_parser

Tutorial

How to parse an email message

Step 1: Import email reply parser package

from email_reply_parser import EmailReplyParser

Step 2: Pass the email message as a string to read.

EmailReplyParser.read(email_message)

How to only retrieve the reply message

Step 1: Import email reply parser package

from email_reply_parser import EmailReplyParser

Step 2: Pass the email message as a string to the parse_reply class method.

EmailReplyParser.parse_reply(email_message)

email-reply-parser's People

Contributors

alex, alexei, bryanhelmig, bryanpearson, dcwatson, kageurufu, mattrobenolt, mikeknoop, roycehaynes, wedamija

email-reply-parser's Issues

Replies without a delimiter

When Outlook replies to a reply from Mail.app (for example), there may not be a delimiter between the reply and the From/To/Subject headers from the last reply. For instance:

And another reply!

From: Dan Watson [mailto:[email protected]]
Sent: Monday, November 26, 2012 10:48 AM
To: Watson, Dan
Subject: Re: New Issue

A reply

--
Sent from my iPhone

On Nov 26, 2012, at 10:27 AM, "Watson, Dan" <[email protected]> wrote:
This is a message.
With a second line.

I would expect the last reply to be just "And another reply!", but instead it includes the previous headers plus the "A reply" line. I don't really have a good solution (aside from maybe checking for a "From" header line), and don't fully understand the code, just thought I'd mention it.
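The "check for a From header line" idea could be sketched roughly as follows. This is a standalone illustration, not code from the library: the function name cut_at_header_block, the min_headers threshold and the address dan@example.com are assumptions made for the example. It treats a run of at least three consecutive From/Sent/To/Subject lines as the start of the previous message.

```python
import re

# Lines that look like Outlook-style reply headers.
HEADER_LINE = re.compile(r"^(From|Sent|To|Subject):\s")


def cut_at_header_block(text, min_headers=3):
    """Return the text above the first From/Sent/To/Subject header block."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if not line.startswith("From:"):
            continue
        # Count consecutive header-looking lines starting at the From: line.
        run = 0
        for candidate in lines[i:]:
            if HEADER_LINE.match(candidate):
                run += 1
            else:
                break
        if run >= min_headers:
            return "\n".join(lines[:i]).strip()
    return text.strip()


message = (
    "And another reply!\n"
    "\n"
    "From: Dan Watson [mailto:dan@example.com]\n"
    "Sent: Monday, November 26, 2012 10:48 AM\n"
    "To: Watson, Dan\n"
    "Subject: Re: New Issue\n"
    "\n"
    "A reply\n"
)

print(cut_at_header_block(message))
```

Requiring several header lines in a row reduces the chance of cutting a reply that merely contains a sentence starting with "From:".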

Support getting quotation

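from email_reply_parser import EmailReplyParser

# Assumes r is the parsed message and email_message is the raw email text.
r = EmailReplyParser.read(email_message)

quotes = []
for f in r.fragments:
    if f.quoted:  # fragment holds quoted text from an earlier message
        quotes.append(f.content)

print('\n'.join(quotes))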

Something like an EmailMessage.get_quotations() method would be useful.
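As a sketch of what such a helper might look like, the function below joins the content of every quoted fragment. It is written as a free function so that it runs on its own. The Fragment namedtuple is a hypothetical stand-in that exposes only the two attributes used in this request (content and quoted), and get_quotations does not exist in the library.

```python
from collections import namedtuple

# Hypothetical stand-in for the parser's fragment objects.
Fragment = namedtuple("Fragment", ["content", "quoted"])


def get_quotations(fragments):
    """Join the content of all quoted fragments, in order."""
    return "\n".join(f.content for f in fragments if f.quoted)


fragments = [
    Fragment("Yes that is fine, I will email you in the morning.", False),
    Fragment("> Our support team just commented on your open Ticket:", True),
    Fragment("> Can we chat in the morning?", True),
]

print(get_quotations(fragments))
```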

Quote header is not recognized when the sender address is very long

For example, when I send an email from a long address such as devteam+bf2b64b6e34b11e2aeda1231400178dd@mailit.ciudadanointeligente.org, the quote header at the top of the response gets a newline between the < and the start of the sender address. The email ends up looking like this:

Estimado Felipe:

La opinión de nuestra campaña es que todos deben ser felices y que dios no
existe.

Muchas gracias por tu interés en nuestra campaña.

Saludos


On 2 July 2013 15:21, <
devteam+bf2b64b6e34b11e2aeda1231400178dd@mailit.ciudadanointeligente.org>wrote:

Currently the parser does not recognize that this wrapped header should be removed.
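One possible way to tolerate the wrapped header is a pattern that allows at most one line break between "On" and "wrote:". The sketch below is standalone and is not the library's regex. The names WRAPPED_QUOTE_HEADER and strip_wrapped_quote_header are invented for this illustration, and the sample body is shortened.

```python
import re

# "On ... wrote:" header that may wrap onto one additional line.
WRAPPED_QUOTE_HEADER = re.compile(
    r"^On\s[^\n]*(?:\n[^\n]*)?wrote:[ \t]*$",
    re.MULTILINE,
)


def strip_wrapped_quote_header(text):
    """Drop the quote header (wrapped or not) and everything after it."""
    match = WRAPPED_QUOTE_HEADER.search(text)
    if match is None:
        return text.rstrip()
    return text[:match.start()].rstrip()


reply = (
    "Estimado Felipe:\n"
    "\n"
    "Muchas gracias por tu interés en nuestra campaña.\n"
    "\n"
    "Saludos\n"
    "\n"
    "\n"
    "On 2 July 2013 15:21, <\n"
    "devteam+bf2b64b6e34b11e2aeda1231400178dd"
    "@mailit.ciudadanointeligente.org>wrote:\n"
)

print(strip_wrapped_quote_header(reply))
```

Limiting the pattern to a single optional line break keeps it from swallowing unrelated body text that happens to start with "On".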

HTML emails not working

Hi,

Can someone confirm that HTML-based emails do not work with this method call?
EmailReplyParser.parse_reply(email_message)

It is not extracting any content for me.

Package distribution

I gather this package is no longer maintained by its original authors. I think every user would benefit if either Zapier transferred ownership to another developer or one of the contributors (e.g. @DisruptiveLabs) renamed their fork and published it on PyPI.

Thoughts?

Multiline quote regex

Currently, there is no constraint on the gap between On and wrote: in this regex:

_MULTI_QUOTE_HDR_REGEX = r'(?!On.*On\s.+?wrote:)(On\s(.+?)wrote:)'

So even if there are a million characters between On and wrote:, the text is still detected as a multiline quote header.

I think the gap between On and wrote: needs to be bounded.

I discovered this bug when the library took 1 minute to parse a long email containing multiple occurrences of On and wrote.

I'm creating a PR for this.
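One way to bound the gap, shown here as a simplified sketch rather than the change made in that PR, is to replace the unbounded .+? with a counted repetition. The limit of 200 characters and the names MAX_HEADER_GAP, BOUNDED_QUOTE_HDR and find_quote_header are assumptions made for this example. The negative lookahead from the original pattern is left out to keep the sketch short.

```python
import re

MAX_HEADER_GAP = 200  # assumed limit, not a value taken from the library

# At most MAX_HEADER_GAP characters (newlines included) between the two.
BOUNDED_QUOTE_HDR = re.compile(
    r"On\s(.{1,%d}?)wrote:" % MAX_HEADER_GAP,
    re.DOTALL,
)


def find_quote_header(text):
    """Return the matched quote header, or None if the gap is too large."""
    match = BOUNDED_QUOTE_HDR.search(text)
    return match.group(0) if match else None


short_header = (
    "On Fri, Nov 16, 2012 at 1:48 PM, Zapier\n"
    "<support@example.com> wrote:"
)
far_apart = "On " + "x" * 1000 + " wrote:"

print(find_quote_header(short_header) is not None)
print(find_quote_header(far_apart))
```

With a bounded repetition, each candidate "On" can only be extended a fixed number of characters, so the work per candidate no longer grows with the length of the email.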

Parsing html

Hey there! We've been using email-reply-parser in our project and have reached a point where we need to parse the HTML part of the email. How can I do that?
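One possible approach, assuming the parser is only given plain text, is to convert the HTML part to text first and then pass the result to parse_reply. The sketch below uses only the standard library. The class _TextExtractor and the function html_to_text are names invented for this example. Note that it does not turn HTML blockquote elements into ">" prefixed lines, so quoted content would still need separate handling.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, inserting line breaks around block-level tags."""

    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "blockquote"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML fragment, one block per line."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    lines = [line.strip() for line in "".join(extractor.parts).splitlines()]
    return "\n".join(line for line in lines if line)


html_body = "<div>Yes that is fine.<br>I will email you in the morning.</div>"
print(html_to_text(html_body))
```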

Issue with Signature Regex

There appears to be an issue with the original regex defined for matching signatures:
(--|__|-\w)|(^Sent from my (\w+\s*){1,3})

The above correctly matches the phrase: 'Sent from my iPhone' etc.

But the first group also matches any line containing '--', '__' or a '-' followed by a word character. So you get matches for any line containing a hyphenated word, for instance:
'This is an auto-reply test to see if I receive an automated message'

I've tested this using the online tool at https://regex101.com/ and also using the built-in .NET regular expression matcher.

The fix appears to be to ensure the first group only looks for matches that start with the defined characters:
(^--|^__|^-\w)|(^Sent from my (\w+\s*){1,3})
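The difference between the two patterns can be reproduced in Python with re.search, which, like the tools mentioned in this report, looks for a match anywhere in the line. This is a standalone comparison of the two regexes quoted here, not a test of the library itself. Python's re.match only matches at the start of the string, so code that uses match rather than search would not show the same behaviour. The names ORIGINAL_SIG and ANCHORED_SIG are chosen for this example.

```python
import re

# The two patterns quoted in this report.
ORIGINAL_SIG = re.compile(r"(--|__|-\w)|(^Sent from my (\w+\s*){1,3})")
ANCHORED_SIG = re.compile(r"(^--|^__|^-\w)|(^Sent from my (\w+\s*){1,3})")

samples = [
    "This is an auto-reply test to see if I receive an automated message",
    "Sent from my iPhone",
    "--",
]

for line in samples:
    original_hit = ORIGINAL_SIG.search(line) is not None
    anchored_hit = ANCHORED_SIG.search(line) is not None
    print(f"{line!r}: original={original_hit}, anchored={anchored_hit}")
```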

Email Reply parser

Hi! We are testing EmailReplyParser for use in our project. We ran EmailReplyParser.parse_reply(message) and noticed that, for the message at https://gist.github.com/teeranan/46ccdd3f8dacbc061bde, the parser removed every line after the line "-------------- next part --------------". Does the parser not consider this part to belong to the reply?

Is this package dead?

Hi,

Is this package dead? Is there a fork or other solution which is up to date?

Regards

Tests failing

Hi

I am trying to package python-email_reply_parser in the Fedora repos, but %{__python2} setup.py test fails :(
