
ips-comment-bot's People

Contributors

btibs, caldeirag, isaiahzs, scohe001, thesecretmaster, z-aki


ips-comment-bot's Issues

Have a new detection type for comments on old posts

As requested by Catija.

It would be useful if comments on posts over a certain age were detected, like the regexes and magic comment. This would probably be most useful as a new type of detection - old_post or something like that - that can be adjusted for posting into child rooms as needed.

It'd also be good if the specific age of the question was able to be adjusted in the config. For a starter, though, two weeks is probably old enough to start detecting.
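
A minimal sketch of the age check, assuming the post's creation time is available as a Unix timestamp in seconds (as the Stack Exchange API's creation_date field provides). The names old_post? and OLD_POST_THRESHOLD_DAYS are hypothetical and not part of the bot:

```ruby
# Hypothetical default; the issue suggests two weeks as a starting point.
OLD_POST_THRESHOLD_DAYS = 14
SECONDS_PER_DAY = 24 * 60 * 60

# Returns true when the post is at least threshold_days old at time `now`.
# post_creation_epoch is a Unix timestamp in seconds.
def old_post?(post_creation_epoch, now: Time.now, threshold_days: OLD_POST_THRESHOLD_DAYS)
  age_seconds = now.to_i - post_creation_epoch
  age_seconds >= threshold_days * SECONDS_PER_DAY
end
```

Keeping the threshold as a keyword argument means the value could later be read from the config without changing the check itself.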

Add validation around adding a new regex

People often forget to specify a reason (which will create a null reason) or accidentally capitalize letters in their regex (resulting in a regex that'll never match anything).

Add some validation around Add that will:

  1. Disallow adding a regex without a valid reason
    a. Where a "valid reason" is any reason with 1 or more characters
  2. Run a weaker version of howgood on the new regex. If it matches less than 10 existing comments, print a warning after the "Added" message that says something like: "Note: this regex only matches X (Y% of total) comments in the database. Please ensure you're adding a useful regex." (or maybe something a little less harsh than that on the second sentence)
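
The two checks above could be sketched roughly as follows. This is not the bot's code: validate_new_regex and MIN_USEFUL_MATCHES are hypothetical names, the comment list is passed in as plain strings instead of being queried from the database, and comments are assumed to be lowercased before matching (which is why capitalized regexes never match).

```ruby
# Hypothetical threshold taken from the issue text ("less than 10 existing comments").
MIN_USEFUL_MATCHES = 10

# Validates a proposed regex before it is added.
# comment_bodies stands in for the stored comments.
# Returns [ok, message]; ok is false when the add should be rejected.
def validate_new_regex(pattern, reason, comment_bodies)
  # Assumption: a whitespace-only reason is treated the same as a missing one.
  return [false, 'Please specify a reason for this regex.'] if reason.nil? || reason.strip.empty?

  begin
    regex = Regexp.new(pattern)
  rescue RegexpError => e
    return [false, "Invalid regex: #{e.message}"]
  end

  # Count how many existing comments the regex matches (a weak "howgood").
  matched = comment_bodies.count { |body| regex.match?(body.downcase) }
  message = 'Added.'
  if matched < MIN_USEFUL_MATCHES
    percent = comment_bodies.empty? ? 0.0 : (matched * 100.0 / comment_bodies.size).round(2)
    message += " Note: this regex only matches #{matched} (#{percent}% of total) comments in the database."
  end
  [true, message]
end
```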

Proposed `howgood` output change

I propose we change the howgood output to be an ASCII table. This would require changing lines 198-206 of comment_scan.rb to:

# Format count as a percentage of denom. Multiplying by 100.0 forces float
# division, and a zero denominator yields '-' rather than NaN or Infinity.
pct = ->(count, denom) { denom.zero? ? '-' : "#{(count * 100.0 / denom).round(8)}%" }

tp_msg = [ # Generate tp line
  'tp'.center(6),
  tps.round(0).to_s.center(11),
  pct.call(tps, total).center(14),
  pct.call(tps, Comment.where("tps >= ?", 1).count).center(15),
  pct.call(tps, Comment.count).center(18),
].join('|')

fp_msg = [ # Generate fp line
  'fp'.center(6),
  fps.round(0).to_s.center(11),
  pct.call(fps, total).center(14),
  pct.call(fps, Comment.where("fps >= ?", 1).count).center(15),
  pct.call(fps, Comment.count).center(18),
].join('|')

total_msg = [ # Generate total line
  'Total'.center(6),
  total.round(0).to_s.center(11),
  '-'.center(14),
  '-'.center(15),
  pct.call(total, Comment.count).center(18),
].join('|')

# Generate header line (column widths 6, 11, 14, 15, 18 plus 4 separators = 68)
header = " Type | # Matched | % of Matched | % of all type | % of ALL comments"

final_output = [ # Add 4 spaces for formatting and newlines
  header, '-' * 68, tp_msg, fp_msg, total_msg
].join("\n    ")
say "    #{final_output}"

# Could also be this depending on programming preference
# puts "    #{header}\n    #{'-' * 68}\n    #{tp_msg}\n    #{fp_msg}\n    #{total_msg}"

I would fork and push but I have no way of testing the bot after I've made the changes. So for fear of breaking things, I'm posting my suggestion here instead :)

Add some gamification to encourage more feedbacks

Normally, I'd just implement this, but I'm not sure if this is a direction we want to take the bot. I'm posting this here to get my thoughts down and have something to point TAS users to.

With the (near) completion of #28, we'll have each new feedback (tp/fp/rude) linked to a chat user. I propose we add some soft gamification in TAS to encourage more feedbacks. This plan has two parts:

  1. On feedback (tp/fp/rude) given, check the total number of feedbacks attributed to that user. Print a congratulatory message (instead of the usual message) if they've reached a milestone. For example:
    a. "Congratulations on your first feedback, @user! Now feed me moooaaaare :D"
    b. "Let's all give @user a pat on the back for their 25th feedback!"
    c. "Holy moly did you really just give me your 100th feedback, @user?!"

  2. Create a scoreboard mention command (similar to cats, ie: "@ips, gimme teh scoreboard"). This command will show the top 10 users with the most feedbacks in a table like
    User | XXX Total | YYY tp's | ZZZ fp's
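
A rough sketch of both parts of the plan, assuming feedbacks are available as [user, type] pairs. MILESTONES, milestone_message and scoreboard are hypothetical names, and the real bot would read these counts from its feedback table:

```ruby
# Hypothetical milestone messages keyed by feedback count, adapted from the examples above.
MILESTONES = {
  1 => 'Congratulations on your first feedback, @%s! Now feed me moooaaaare :D',
  25 => "Let's all give @%s a pat on the back for their 25th feedback!",
  100 => 'Holy moly did you really just give me your 100th feedback, @%s?!'
}.freeze

# Returns the congratulatory message for a milestone count, or nil otherwise.
def milestone_message(user, feedback_count)
  template = MILESTONES[feedback_count]
  template && format(template, user)
end

# Builds scoreboard rows from [user, type] pairs for the top `limit` users by total.
def scoreboard(feedbacks, limit: 10)
  per_user = Hash.new { |hash, user| hash[user] = Hash.new(0) }
  feedbacks.each { |user, type| per_user[user][type] += 1 }
  ranked = per_user.sort_by { |_user, counts| -counts.values.sum }.first(limit)
  ranked.map do |user, counts|
    "#{user} | #{counts.values.sum} Total | #{counts['tp']} tp's | #{counts['fp']} fp's"
  end
end
```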

However, I'm not sure if this is something we want to do. I'll run it by TAS now that it's written up and see what they say.

Disable "Invalid Input" response when reply was >1 parameter long

Right now, when the bot gets a response to a comment report that it can't parse (that doesn't begin with i), it prints

Invalid feedback type. Valid feedback types are tp, fp, rude, and wrongo

However, more and more often people are responding to bot messages with human responses (to draw conversation to a reported comment). Adding a prepended i is difficult to remember and looks ugly. Instead, why don't we ignore any reply that is longer than 1 word (since all replies to comment reports are 1 word commands)?

This FR was run by TAS and seems to have the room's support (see here).

This will probably just involve adding a check on msg.body.split(' ').length right after line 121 of comment_scan.rb.
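
A minimal sketch of that check, assuming the reply body has already had the bot mention stripped. classify_reply and VALID_FEEDBACK are hypothetical names; the feedback types are the ones listed in the error message above:

```ruby
VALID_FEEDBACK = %w[tp fp rude wrongo].freeze

# Classifies a reply to a comment report.
# Returns :ignore for replies longer than one word, :feedback for a known
# command, and :invalid for any other single-word reply.
def classify_reply(body)
  words = body.split(' ')
  return :ignore if words.length > 1

  VALID_FEEDBACK.include?(words.first.to_s.downcase) ? :feedback : :invalid
end
```

With this split, the "Invalid feedback type" message would only be printed for the :invalid case.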

Bonus/Easter Egg: have a 1/5 chance of replying with a random comment from the list

  • Ain't that the truth.
  • You're telling me.
  • Yep. That's about the size of it.
  • That's what I've been saying for $(AGE_OF_BOT)!
  • What else is new?
  • For real?
  • Humans, amirite?

Remove a regex reason if the only regex it has is removed

Right now if I add a new regex reason and then remove the regex under it, the reason sticks around like chewing gum under a high school desk--not really doing any damage, but nobody really wants it there if they have a choice.

Can we add a feature to !!/del where if the regex being deleted is the only one under a reason, the reason is deleted too?
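
As an illustration only, here is the cleanup with reasons held in an in-memory hash; the bot stores these in its database, and delete_regex is a hypothetical helper:

```ruby
# reasons maps a reason name to the list of regexes filed under it.
# Deletes the regex and drops the reason when no regexes remain under it.
# Returns true if the reason itself was removed.
def delete_regex(reasons, reason, regex)
  regexes = reasons[reason]
  return false if regexes.nil?

  regexes.delete(regex)
  return false unless regexes.empty?

  reasons.delete(reason)
  true
end
```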

Have an option to not detect all comments and just report the regex hits

To make this bot a little more useful for using on other sites that may not want a record of all comments ever posted, there should be an option to run the bot and just have the regex hits posted into chat, such as is happening with The Awkward Silence as of right now, but without having the HQ room.

So, for instance, if I want to detect certain comments on Scifi.SE but not to have all the comments, just the regex hits, this would be useful for that.

Don't duplicate post manually reported comments

When a user manually reports a comment, it gets reposted in every chat room that the bot is in, including the room it was reported from. It can be a little confusing seeing the same comment posted twice, so I propose that the bot not repost manually reported comments in the room they were reported from.

Ignore the OP for regexes, with the exception of offensive

A lot of the regex fp detections I see, especially for the possible-aic ones, come from the OP responding to the comments with more comments.

This could be avoided by excluding the OP from the regex, much like moderators. If you've posted the post that this comment was posted on, you should be excluded from non-abusive regexes.
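
A small sketch of the exclusion rule, assuming each regex match carries its reason name and that the commenter and post owner IDs are known. report_match? and ALWAYS_CHECKED_REASONS are hypothetical:

```ruby
# Reasons that should still fire for the OP; 'offensive' comes from the issue title.
ALWAYS_CHECKED_REASONS = %w[offensive].freeze

# Returns true when a regex match should be reported.
# Matches on the OP's own comments are skipped unless the reason is always checked.
def report_match?(reason, commenter_id, post_owner_id)
  return true if ALWAYS_CHECKED_REASONS.include?(reason)

  commenter_id != post_owner_id
end
```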

Add unit tests (that GitHub will auto-trigger when pushing new code)

This is maybe more of a longterm goal, but it'd be really awesome if we could get some unit tests in the project. Bonus points if we could get them to auto-run when someone tries to make a pull request (to make it easy to see if there are build errors/obvious semantic errors that break stuff).

This'd probably require a pretty big refactor to make the core functions callable from a test with test data. We'd have to add a level of abstraction so that we can fake the SE API calls.

I definitely won't be trying this any time soon, but maybe some day...

Add "rescan" reply-to option

A common workflow is to see a bad comment, !!/add some new regex and then rescan the comment to be caught by the regex. This requires finding the comment id and formulating a !!/manscan command.

Instead, it would be easier to reply to a reported comment with "rescan" to have the bot rescan (call scan_comments) the comment. ie:

@IPSCommentBot rescan

Get rid of tp/fp/rude on the Comment table

With the advent of the Feedbacks table, storing tp/fp/rude ints on the Comment is now redundant.

That being said, everything looking at tp/fp/rude numbers is looking at those columns, so this may be a bit of an undertaking.

What will likely need to be done:

  • Remove tp/fp/rude int columns on the Comment table
  • Add some easy way to fetch tp/fp/rude numbers for a comment
    • Maybe a view? Or the SQLite equivalent? Or just a function on the Comment class that returns a map?
  • Update all code fetching/updating those values to ensure that it's fetching/updating Feedbacks instead
  • Write a migration script or SQLite query to create Feedback rows from anonymous users (maybe user id -1) for all of the legacy comments before this db table was added
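
One of the options above, a function that returns a map of counts, might look roughly like this. The Feedback struct is a hypothetical stand-in for a row of the Feedbacks table, and feedback_counts is not an existing method:

```ruby
# Hypothetical stand-in for a Feedbacks row.
Feedback = Struct.new(:comment_id, :user_id, :type)

# Returns a hash of tp/fp/rude counts for one comment,
# always including all three keys (zero when there is no feedback).
def feedback_counts(feedbacks, comment_id)
  counts = { 'tp' => 0, 'fp' => 0, 'rude' => 0 }
  feedbacks.each do |feedback|
    next unless feedback.comment_id == comment_id && counts.key?(feedback.type)

    counts[feedback.type] += 1
  end
  counts
end
```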

Have a "x comments in time y on post z" alert

What would be useful for identifying problematic comment threads would be a certain amount of comments in a certain timeframe triggering a message from the bot to the chatroom.

For instance, 10 comments in 5 minutes, 20 comments in an hour, or 30 comments in the past day.

Such a message would probably look like

Possible argument: 10 comments in 5 minutes on post Title by user.

Posted in both the control and child rooms.
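
A sketch of the threshold check, assuming the comment timestamps for a single post are available as Unix times. BURST_THRESHOLDS and burst_threshold_hit are hypothetical names, with values taken from the examples above:

```ruby
# Hypothetical thresholds: [comment count, window in seconds].
BURST_THRESHOLDS = [[10, 5 * 60], [20, 60 * 60], [30, 24 * 60 * 60]].freeze

# timestamps: Unix times of the comments on one post.
# Returns the first [count, window] threshold met as of `now`, or nil.
def burst_threshold_hit(timestamps, now)
  BURST_THRESHOLDS.find do |count, window|
    timestamps.count { |t| t > now - window && t <= now } >= count
  end
end
```

A real implementation would also need to remember which alerts were already posted for a post so the same threshold is not announced repeatedly.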

Have the DB viewable from the web

It'd be cool, as well as useful for people who want to look at some data, to have the database that the bot is running somehow be available to people on the web. Would there be some way to have the bot automatically put the database into e.g. a DB reader? I don't exactly know how this works ;)

Delete offensive detections from child rooms after a minute

It's probably best not to leave the offensive content in child rooms for long. (This caused an argument in The Closet a bit back.)

If the bot deleted the messages about the offensive detection a minute (or possibly a minute and a half) after reporting, that's long enough for it to be seen and have action taken on it, but within the time limit for deleting your own chat messages.

Alert HQ room when a user passes a threshold of tp comments

It'd be nice to have a warning when a user has been leaving a lot of delete-worthy comments.

I'm thinking a message in the HQ room every time someone passes multiples of 20 tp comments.

E.g.:

**Alert**: User <ID> has left 40 comments marked tp. (@Mithrandir)

This would happen at 20, 40, 60, 80...

This would be without using the username, just the ID.
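
The multiple-of-20 rule could be sketched like this; tp_alert and TP_ALERT_STEP are hypothetical, and the pinged user from the example message is left out:

```ruby
TP_ALERT_STEP = 20

# Returns an alert string when tp_count is a positive multiple of TP_ALERT_STEP, else nil.
# Only the user ID is used, never the username.
def tp_alert(user_id, tp_count)
  return nil unless tp_count.positive? && (tp_count % TP_ALERT_STEP).zero?

  "**Alert**: User #{user_id} has left #{tp_count} comments marked tp."
end
```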

Add a way to tell which regex triggered a report

Add the ability to reply to a reported comment to have the bot respond with which regex triggered the comment. Eg:

 > Hi, and welcome to IPS! Could you provide additional detail to the situation? What is the context of your disagreement with your friend? What have you tried so far? – Jess K. 2 mins ago
 #19255 Jess K. | Q: How to apologize to a friend when you know you did something wrong but can't confess it? (score: 0) | posted 5 minutes ago by user10477618 (1 rep) | edited 1 minutes ago by A J (6679 rep) | Toxicity 0.06513069 | tps/fps: 0/0

 > @IPSCommentBot huh?

 > Comment matched "experimental-aic(@scohe001)" for regex: - q: have\Wyou\Wtried
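
A minimal sketch of the lookup behind such a reply, assuming regexes are grouped by reason and comments are lowercased before matching. matching_regexes is a hypothetical helper, not the bot's API:

```ruby
# regexes_by_reason maps a reason to an array of pattern strings.
# Returns [reason, pattern] pairs for every pattern matching the comment body.
def matching_regexes(body, regexes_by_reason)
  text = body.downcase
  regexes_by_reason.flat_map do |reason, patterns|
    patterns.select { |pattern| Regexp.new(pattern).match?(text) }
            .map { |pattern| [reason, pattern] }
  end
end
```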

Implement Perspective to find rude comments

Implementing Perspective would help in finding comments that are flaggable but do not directly contain any keywords that would trip the regex.

This is not high-priority at all, but it would be useful to have at some point in the future.

Link feedbacks to people

This would just be a handy thing to have. Also we should maybe add some logging around feedback time, location, and the original regexes that caught it. I may go in and do this myself when I get time.

Link to users should just be /u/###, not /u/###/name

Currently, Smelly posts links to users in the format https://interpersonal.stackexchange.com/u/31/arwen-und%c3%b3miel. However, this link format doesn't work. It should just be https://interpersonal.stackexchange.com/u/31, as SE doesn't support /u/<id>/<name> format links.
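
One way to produce the shorter form is to strip everything after the numeric ID. short_user_link is a hypothetical helper, shown only to illustrate the intended output:

```ruby
# Trims a /u/<id>/<name> link down to /u/<id>; other URLs are returned unchanged.
def short_user_link(url)
  url.sub(%r{(/u/\d+)/.*\z}, '\1')
end
```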

Implement a feedback system

As requested by M.A.R..

Currently, there's no way to give feedback on a comment detected by regexes and so no way to evaluate the effectiveness of a single pattern.

Implementing a feedback system that would allow feedback on the individual regexes tripping would be useful for refining the detections.

Move handling of offensive words out of user defined regular expressions and into code

We currently have several regular expressions under the category of "offensive". These regular expressions get printed whenever someone uses the !!/regexes command in chat. This means that the bot is routinely posting messages that contain content we've deemed offensive. In order to not keep putting this into chat, where someone might come across it, we should have these cases handled within the bot's code where they won't be constantly printed in chat. This will make it slightly harder to add new offensive words to the list of things we catch, but should be fairly maintainable moving forward.

Exclude moderators from the regex

Excluding moderators from being posted to the child room when they trip the regex would be useful and lead to less clutter in the child room.
