textfilter's Introduction

A very short piece of code that I nevertheless find quite useful,
so I set up a separate project to keep a backup of it.

USAGE:

    >>> f = DFAFilter()
    >>> f.add("sexy")
    >>> f.filter("hello sexy baby")
    hello **** baby

textfilter's People

Contributors

observerss

textfilter's Issues

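Warnings when editing the code under Python 3

1. print -> print()
2. Warnings about unicode

The difference between strings in py3 and py2 comes down to how three data types are handled.

The py2 approach means that a string and a byte stream are the same thing, while a unicode string is a distinct type.

    bytes == str
    unicode != str

The py3 approach means that a string and a unicode string are the same thing, while a byte stream is a distinct type.

    unicode == str
    bytes != str

What is unicode? It is a byte stream in some particular encoding, a subset of bytes. This means that every unicode value can be put into bytes, but some bytes cannot be put into unicode. What C programmers find hardest to accept is that a run of bytes cannot be put into a …

In Python 3, str is unicode, and everything that interacts with people should use str.

Author: pansz
Link: https://www.zhihu.com/question/60231684/answer/173871080
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for permission; for non-commercial reprints, credit the source.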
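
The str/bytes distinction quoted above can be checked directly in a Python 3 interpreter. This is a minimal sketch; the sample text and the variable names are illustrative only.

```python
# Python 3: str holds text, bytes holds raw bytes.
text = "敏感词"                 # str
data = text.encode("utf-8")     # bytes; each of these characters takes 3 bytes

assert isinstance(text, str) and isinstance(data, bytes)
assert len(text) == 3 and len(data) == 9
assert data.decode("utf-8") == text   # text survives the round trip

# Not every byte sequence is valid text in a given encoding.
invalid = b"\xff\xfe\xfa"
try:
    invalid.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
```

Here `decoded_ok` ends up `False`, which matches the point that some bytes cannot be turned into a str.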

Problem when filtering digits

    gfw = DFAFilter()
    gfw.parse("keywords2")  # keywords2 contains the sensitive word: 1989年
    print(gfw.filter("1989", "*"))

Result after filtering:

    989
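
The reported output is consistent with a scan that consumes a character without emitting it when the input ends partway through a keyword ("1989" is a proper prefix of "1989年"). That is a guess about the cause, not a confirmed diagnosis of this repository's code. The sketch below is a hypothetical trie-based filter (the class name `TrieFilterSketch` and its internals are assumptions, not the project's `DFAFilter`) that keeps the character whenever no complete keyword starts at the current position.

```python
class TrieFilterSketch:
    """Hypothetical keyword filter built on a nested-dict trie."""

    END = object()  # sentinel key meaning "a keyword ends at this node"

    def __init__(self):
        self.root = {}

    def add(self, keyword):
        node = self.root
        for ch in keyword.strip():
            node = node.setdefault(ch, {})
        if node is not self.root:  # ignore empty keywords
            node[self.END] = True

    def filter(self, message, repl="*"):
        out = []
        start = 0
        while start < len(message):
            node = self.root
            matched = 0  # length of the longest keyword starting at `start`
            for offset, ch in enumerate(message[start:], 1):
                if ch not in node:
                    break
                node = node[ch]
                if self.END in node:
                    matched = offset
            if matched:
                out.append(repl * matched)
                start += matched
            else:
                # Covers both a mismatch and running out of input in the
                # middle of a keyword: keep the character unchanged.
                out.append(message[start])
                start += 1
        return "".join(out)


f = TrieFilterSketch()
f.add("1989年")
f.add("sexy")
print(f.filter("1989"))             # 1989
print(f.filter("hello sexy baby"))  # hello **** baby
```

The design choice here is to decide what to emit only after the inner scan finishes, so a partial match at the end of the message falls through to the same branch as an ordinary mismatch.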

Module not found

When I run the following code,

    from filter import DFAFilter

it shows:

    Traceback (most recent call last):
    File "", line 1, in 0
    ImportError: No module named filter

If I run this in a terminal

    pip install filter

it then says

    Could not find a version that satisfies the requirement filter (from versions: )
    No matching distribution found for filter

Please tell me how to run it. Thanks!
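
The import line suggests that `DFAFilter` lives in a local file named `filter.py` rather than in a package published on PyPI, which would explain why pip finds nothing. Assuming that is the case, one possible approach is to make the directory holding the file importable. The helper name `import_from_checkout` and the path in the usage note are hypothetical.

```python
import importlib
import sys
from pathlib import Path


def import_from_checkout(repo_dir, module_name="filter"):
    """Import `module_name` from a local directory instead of from PyPI."""
    repo_dir = str(Path(repo_dir).resolve())
    if repo_dir not in sys.path:
        # Put the directory first so its modules are found on import.
        sys.path.insert(0, repo_dir)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

Under that assumption, usage might look like `DFAFilter = import_from_checkout("/path/to/textfilter").DFAFilter`. Starting the interpreter from inside the directory that contains the file should have the same effect.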

Problem loading Chinese word lists

Chinese is not well supported. Python 3 handles UTF-8 well, so I suggest changing the code to open(filename,'r',encoding='utf8')
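
A minimal sketch of the suggested change, assuming the word list is a UTF-8 text file with one keyword per line. The function name `load_keywords` is hypothetical and not part of the project.

```python
def load_keywords(path):
    """Read one keyword per line from a UTF-8 text file, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

Opening the file in text mode with an explicit encoding yields str values directly, so no separate decode step is needed and the result does not depend on the platform's default encoding.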

Modified two functions of NaiveFilter because the old ones no longer work

Parse function

    def parse(self, path):
        with open(path, 'rb') as f:
            self.keywords = [x.decode('utf8').strip() for x in f.readlines()]

Filter function

    def filter(self, message, repl="*"):
        for kw in self.keywords:
            message = message.replace(kw, len(kw)*repl)
        return message

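The previous version was written for Python 2.7 and I could not run it, so I modified NaiveFilter's functions myself for Python 3.7. (I also changed the sensitive-word file into a txt file.)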
