spamscanner / spamscanner

Spam Scanner is a Node.js anti-spam, email filtering, and phishing prevention tool and service. Built for @ladjs, @forwardemail, @cabinjs, @breejs, and @lassjs.

Home Page: https://spamscanner.net

License: Other

JavaScript 98.21% HTML 1.59% Shell 0.20%
spam scanner anti-spam service api javascript node spam-filtering spam-detection spam-protection

spamscanner's Issues

v2.0.0

  • When we're parsing tokens, striptags implementation needs to additionally be pre-processed with sanitize-html to remove blocks like <style>, <stylesheet>, <meta>, <head> etc.
  • Modify scanner.getPhishingResults to check against OpenPhish and PhishTank datasets.
  • Tokenize and stem other mail headers (e.g. to, from, cc, bcc, reply-to, in-reply-to, etc.)
  • Determine solution to performance issue with classifier.train() in classifier.js per NaturalNode/natural#520.
  • Headers should NOT get converted; they should be preserved as-is for URL/Received-By purposes. Only content should be converted.
  • Get inspiration from ls /usr/share/spamassassin if needed
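
As a rough illustration of the first item, the sketch below drops whole non-content blocks before tags are stripped, so that CSS rules and head metadata never reach the tokenizer. It is a dependency-free stand-in, not the sanitize-html or striptags API: `dropNonContentBlocks`, `stripTags`, and `htmlToText` are hypothetical names, the regular expressions are an assumption made for brevity, and `<script>` and `<title>` are additions to the tag list from the note. Real-world HTML would be better handled by an actual parser.

```javascript
// Hypothetical helper: drop whole blocks whose text is not message content.
// <style> and <head> come from the v2.0.0 note; <script> and <title> are
// assumptions added for this sketch.
const NON_CONTENT_TAGS = ['style', 'head', 'script', 'title'];

function dropNonContentBlocks(html) {
  let out = html;
  for (const tag of NON_CONTENT_TAGS) {
    // Match <tag ...> ... </tag>, case-insensitively, across newlines.
    const block = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?</${tag}\\s*>`, 'gi');
    out = out.replace(block, ' ');
  }
  return out;
}

// Naive stand-in for striptags: remove remaining tags, collapse whitespace.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
}

// Pre-process first, then strip tags, so block contents never become tokens.
function htmlToText(html) {
  return stripTags(dropNonContentBlocks(html));
}

const sample =
  '<html><head><meta charset="utf-8"><style>p { color: red }</style></head>' +
  '<body><p>Win a <b>free</b> prize</p></body></html>';

console.log(htmlToText(sample)); // prints "Win a free prize"
```

Without the first pass, a plain tag stripper would leave `p { color: red }` in the text and feed it to the classifier as tokens.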

How to train Naive Bayes Classifier?

Is there more information on how to train the classifier?

I see in the source classifier.json is currently private, which explains the broken links on the site.

The source indicates that removing classifier.json and setting SPAM_CATEGORY and SCAN_DIRECTOR should be all that is needed to train. Is that all, and do I then feed it a directory of spam or ham in EML or ARF format?
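
For anyone unfamiliar with what "training" means here, the following is a minimal, dependency-free sketch of a Naive Bayes text classifier trained on labeled examples. It is not Spam Scanner's training script or its classifier, and it does not read EML or ARF files. `TinyNaiveBayes` and the sample sentences are hypothetical, and add-one (Laplace) smoothing is an assumption of this sketch.

```javascript
// Illustrative multinomial Naive Bayes; all names here are hypothetical.
class TinyNaiveBayes {
  constructor() {
    this.docCounts = new Map();  // label -> number of training documents
    this.wordCounts = new Map(); // label -> Map(token -> count)
    this.totalWords = new Map(); // label -> total tokens seen
    this.vocabulary = new Set(); // every distinct token across all labels
  }

  static tokenize(text) {
    return text.toLowerCase().match(/[a-z0-9]+/g) || [];
  }

  // Training only accumulates per-label token counts.
  train(text, label) {
    if (!this.wordCounts.has(label)) {
      this.docCounts.set(label, 0);
      this.wordCounts.set(label, new Map());
      this.totalWords.set(label, 0);
    }
    this.docCounts.set(label, this.docCounts.get(label) + 1);
    const counts = this.wordCounts.get(label);
    for (const token of TinyNaiveBayes.tokenize(text)) {
      counts.set(token, (counts.get(token) || 0) + 1);
      this.totalWords.set(label, this.totalWords.get(label) + 1);
      this.vocabulary.add(token);
    }
  }

  // Pick the label with the highest log prior plus summed log likelihoods.
  classify(text) {
    const totalDocs = [...this.docCounts.values()].reduce((a, b) => a + b, 0);
    let best = null;
    let bestScore = -Infinity;
    for (const [label, docs] of this.docCounts) {
      let score = Math.log(docs / totalDocs);
      const counts = this.wordCounts.get(label);
      const denom = this.totalWords.get(label) + this.vocabulary.size;
      for (const token of TinyNaiveBayes.tokenize(text)) {
        // Add-one smoothing keeps unseen tokens from zeroing the score.
        score += Math.log(((counts.get(token) || 0) + 1) / denom);
      }
      if (score > bestScore) {
        bestScore = score;
        best = label;
      }
    }
    return best;
  }
}

const nb = new TinyNaiveBayes();
nb.train('win a free prize now', 'spam');
nb.train('free money click here', 'spam');
nb.train('meeting notes attached for monday', 'ham');
nb.train('lunch on monday', 'ham');

console.log(nb.classify('claim your free prize')); // prints "spam"
console.log(nb.classify('monday meeting agenda')); // prints "ham"
```

The point of the sketch is that a classifier of this kind needs examples of both categories before it can separate them.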

False malicious classification of hostname

My host parabellum.ga has been classified as malicious

The response from the remote server was:
420 Error for [email protected] of "Link hostname of "parabellum.ga" was detected by Cloudflare's Family DNS to contain adult-related content, phishing, and/or malware. Phishing whitelist requests can be filed at https://github.com/spamscanner/spamscanner/issues." If you need help please forward this email to [email protected] or visit https://forwardemail.net.

Not detecting anything

I've fed thousands of emails to scanner.scan and not even one of them was detected as spam. I've looked at some of the eml files and like 30% of them are definitely the type of spam any spam scanner should be able to detect, so what's the deal? Am I missing a step somewhere? Do I need to provide it with spam to learn, if so what function do I call?

Request to remove about.chatroulette.com

I got this error response when linking the site https://about.chatroulette.com/solutions/. I know https://chatroulette.com/ is categorized as adult content (hey, working on changing this!) but is there any chance subdomains are not detected as adult-related content?

420 Error for [email protected] of "Link hostname of "about.chatroulette.com" was detected by Cloudflare's Family DNS to contain adult-related content, phishing, and/or malware. Phishing whitelist requests can be filed at https://github.com/spamscanner/spamscanner/issues." If you need help please forward this email to [email protected] or visit https://forwardemail.net.
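
To make the subdomain question concrete, here is a minimal sketch of an exact-hostname allowlist check, under which allowing `about.chatroulette.com` would not also allow `chatroulette.com`. It is only an illustration: `ALLOWLIST` and `isAllowlisted` are hypothetical names, and nothing here reflects how Spam Scanner or Cloudflare's Family DNS actually decides.

```javascript
// Hypothetical allowlist keyed by exact hostname (not by parent domain).
const ALLOWLIST = new Set(['about.chatroulette.com']);

function isAllowlisted(link, allowlist = ALLOWLIST) {
  let hostname;
  try {
    // The WHATWG URL parser extracts the hostname from an http(s) link.
    hostname = new URL(link).hostname.toLowerCase();
  } catch {
    // Anything that does not parse as a URL is never allowlisted.
    return false;
  }
  // Exact match only: a parent domain or sibling subdomain does not match.
  return allowlist.has(hostname);
}

console.log(isAllowlisted('https://about.chatroulette.com/solutions/')); // prints true
console.log(isAllowlisted('https://chatroulette.com/'));                 // prints false
```

Matching on the exact hostname keeps an exception for one subdomain from leaking to the rest of the domain.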

ClamAV on Windows?

I want to use this on a Windows host, but there are no instructions on how to set up ClamAV for this on Windows.

white list renti.co

Link hostname of "renti.co" was detected by Cloudflare's Family DNS to contain adult-related content, phishing, and/or malware.

renti.co is safe. It's a property rental application system.

please whitelist renti.co

Hostname is classified incorrectly.

Hi all,

I'm using forwardemail.net service for www.pricedout.org.uk domain, and occasionally when sending emails from this domain I get the following error:

420 Error for [email protected] of "Link hostname of "pricedout.org.uk" was detected by Cloudflare's Family DNS to contain adult-related content, phishing, and/or malware. Phishing whitelist requests can be filed at https://github.com/spamscanner/spamscanner/issues." If you need help please forward this email to [email protected] or visit https://forwardemail.net.

AFAIK, the emails usually go through just fine and the misclassification seems to happen intermittently.

Would it be possible to somehow whitelist this domain?

Many thanks.
