1712n / dn-institute

Distributed Networks Institute

License: The Unlicense

Dockerfile 1.36% SCSS 9.81% HTML 18.38% JavaScript 8.89% Python 58.74% TypeScript 2.82%

ai attack blockchain bounty challenge crypto-attacks distributed fraud-detection grant llm market-manipulation nlp prize prompt-engineering research security wiki

dn-institute's Issues

Images UX improvement

Right now images in the posts are not presented in the best way.

Problems to fix:

Sometimes a post contains multiple consecutive images, like here. Right now they are stacked one after another, taking up too much space and not looking good.
Images are not clickable and are therefore hard to read (too small).

Improvements:

Ideally, we would have a gallery view when several images are stacked, so they look like an Instagram post and users can swipe through them horizontally.
Enable click-to-open for images so they can be enlarged.

To Do:

Find a Hugo-native image processing solution; avoid JS add-ons.

MVT content

https://docs.google.com/document/d/12rsfyvz4Cl-bS7_rWCEx5K0DBplqEzJ3g03Ul4WeMt8/edit

Contribute to Crypto Attacks Wiki

This challenge aims to capture a wide range of contributions to the Crypto Attacks Wiki. To participate, submit a pull request that adds or modifies files in the attacks directory and request a review from this issue's assignees. All submissions will be reviewed by the wiki maintainers, and you may be asked to make additional changes to your pull request to bring it up to the quality level of the rest of the wiki.

Submission ideas

New pages
Page placeholders with metadata
Additions to existing pages
Meaningful edits to existing content that fix typos, grammar, factual and stylistic errors, etc.

Submission Guidelines

Before committing to the wiki, please ensure your submission meets the following criteria:

The attack is not already covered by existing posts and pending PRs
File name - YYYY-MM-DD-entity-that-was-hacked.md
Headers:

| Header name | Required | Description | Example |
| --- | --- | --- | --- |
| `date` | yes | YYYY-MM-DD | 2012-07-16 |
| `target-entities` | yes | Entities that were targeted by the attackers. Multiple values allowed | `Binance`, `Localbitcoins`, `Ethereum` |
| `entity-types` | yes | General category describing the targeted entity. Check existing ones in the examples and suggest yours if not present. Multiple values allowed | `Custodian`, `DeFi`, `GameFi`, `Exchange`, `Wallet`, `Blockchain`, `Bridge`, `Yield Aggregator`, `Lending Platform`, `Stablecoin`, `Token`, `NFT` |
| `attack-types` | yes | Common hacking technique. Check existing ones in the examples and suggest yours if not present. Multiple values allowed | `51%`, `Wallet Hack`, `Private Key Leak`, `Infrastructure Attack`, `Smart Contract Exploit`, `Flash Loan Attack`, `Phishing`, `Signature Verification Issue`, `Brute Force`, `Race Condition Exploit` |
| `title` | yes | Article Title | `BitGrail Hack Results in $170 Million Loss` |
| `loss` | yes | Loss (in approximate USD equivalent at time of incident) | 50000 |

Focus on facts and numbers instead of vague phrases and value judgments (such as "huge losses", "important lesson"). Facts mostly include named entities (people, companies, places, addresses, etc.) Simply repeating what the attacked entity had to say is not enough. Try finding messages from those who spotted anomalies before any official announcements, 3rd party audits, statements from other entities, sources of structured data that show the impact of the attack on prices, volumes, hashrates, etc.
Add markdown links directly to your text - help our fact-checking bot to verify claims found in your article.
The timeline should use bullet points with dates; no significant events should be missing
Default to bullet point structure with titles - this helps to keep the content concise and focused, and is essential for future attack modeling
Only standard sections are allowed. The attack wiki requires the following sections:
- Summary
- Attackers (focus on the attackers, not what they did)
- Losses
- Timeline
- Security Failure Causes
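
A quick self-check before opening a PR can catch the most common guideline misses. The sketch below is only an illustration of the file name and header rules listed above; `check_submission` is a hypothetical helper, not part of the wiki tooling, and it assumes the headers have already been parsed into a dictionary.

```python
import re
from datetime import date

# Required headers, taken from the table in the guidelines above.
REQUIRED_HEADERS = ("date", "target-entities", "entity-types",
                    "attack-types", "title", "loss")
FILENAME_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})-[^\s/]+\.md$")
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def is_iso_date(value: str) -> bool:
    """True if value is a real calendar date written as YYYY-MM-DD."""
    if not DATE_RE.match(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def check_submission(filename: str, headers: dict) -> list:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    match = FILENAME_RE.match(filename)
    if not match or not is_iso_date(match.group(1)):
        problems.append("file name must follow YYYY-MM-DD-entity-that-was-hacked.md")
    for key in REQUIRED_HEADERS:
        if headers.get(key) in (None, "", []):
            problems.append(f"missing required header: {key}")
    if "date" in headers and not is_iso_date(str(headers["date"])):
        problems.append("date must be YYYY-MM-DD")
    if "loss" in headers and not isinstance(headers["loss"], (int, float)):
        problems.append("loss must be a number (approximate USD)")
    return problems
```

This only covers the mechanical rules; content requirements such as sections, links, and timeline completeness still need a human or LLM review.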

If the changes requested by reviewers are not addressed within a week, the PR will be considered stale and will be closed.

Fix fact check comment

Improve QA bots with GH Actions

Our crowdsourced Crypto Attack Wiki has a number of bots in charge of automated quality checks. Things have changed since we first implemented the quality checks, and now most of the functionality is covered by GPT-like services. It should be possible to refactor existing quality checks to fit the whole pull request (instead of splitting it into chunks) into one API request. Anything that goes beyond the LLM's context window can safely be cut out, with a warning asking the author to submit smaller PRs. Although the OpenAI API still lacks web browsing functionality, there are alternative solutions offered by Google or Azure. Reach out if you need test API keys.
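
The "cut out and warn" idea could look roughly like the sketch below. It is an illustration only: `fit_to_budget` is a hypothetical helper, and the 4-characters-per-token ratio is a rough assumption standing in for a real tokenizer.

```python
def fit_to_budget(pr_text: str, max_tokens: int, chars_per_token: int = 4):
    """Trim pr_text to an approximate token budget.

    Returns (text, warning). warning is None when nothing was cut.
    Token counts are estimated as len(text) / chars_per_token, which is
    an assumption; swap in the provider's tokenizer for exact numbers.
    """
    max_chars = max_tokens * chars_per_token
    if len(pr_text) <= max_chars:
        return pr_text, None
    warning = (
        f"The pull request exceeds the ~{max_tokens}-token budget; "
        f"only the first {max_chars} characters were checked. "
        "Please submit smaller PRs."
    )
    return pr_text[:max_chars], warning
```

The warning string can be posted as a PR comment alongside the check results, so the author knows part of the submission was not reviewed.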

To participate in the challenge, submit a pull request that refactors existing bots into one GPT-like API request and request a review from this issue's assignees. Below are some improvement ideas, but feel free to suggest other functionality.

Tasks

Fact-checking. Azure OpenAI Service and reverse-engineered Bard libraries have that functionality, but there are plenty of alternative solutions.
Spellcheck can safely be moved to GPT with a carefully crafted prompt.
Hugo SSG formatting checker. The fact that an article is a Markdown document with headers for Hugo SSG would probably need to be specified in the prompt anyway for adding the spellcheck functionality. Asking the LLM to check formatting should be an easy addition.
Existing articles' check. Getting the directory listing is the easiest way to do it, and you already implemented it.
Article submission guidelines checks. A good idea might be adding a markdown file with guidelines to load guidelines from there instead of hard-coding them in the prompt.
Plagiarism. Probably requires additional testing of existing LLM services, but there's nothing fundamental that would prevent models with internet access from spotting copy-paste work.
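
Several of the ideas above (formatting, existing-article, and guideline checks) come down to assembling one prompt from a few inputs. A minimal sketch, assuming the guidelines live in a Markdown file; the file location, the `CHECKS` wording, and both function names are hypothetical.

```python
from pathlib import Path

# Check names are illustrative; adjust to whatever the bot actually runs.
CHECKS = (
    "fact-checking",
    "spelling and grammar",
    "Hugo front matter and Markdown formatting",
    "overlap with existing articles",
    "compliance with the submission guidelines",
)


def load_guidelines(path) -> str:
    """Read guidelines from a Markdown file instead of hard-coding them."""
    return Path(path).read_text(encoding="utf-8")


def build_prompt(guidelines: str, existing_articles: list, pr_text: str) -> str:
    """Combine all checks into a single request body."""
    checks = "\n".join(f"{i}. {name}" for i, name in enumerate(CHECKS, start=1))
    listing = "\n".join(existing_articles) or "(none)"
    return (
        "You are reviewing a pull request to a Hugo SSG wiki. "
        "Each article is a Markdown document with front matter headers.\n\n"
        f"Run these checks and report findings per check:\n{checks}\n\n"
        f"Submission guidelines:\n{guidelines}\n\n"
        f"Existing articles:\n{listing}\n\n"
        f"Pull request content:\n{pr_text}"
    )
```

Keeping the guidelines in a file means maintainers can edit them without touching the bot code.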
All submissions will be reviewed by the wiki maintainers, and you may be asked to make additional changes to your pull request to improve code quality, security, and/or efficiency.

How to Claim Bounty

Email your bitcoin or stablecoin payment address to [email protected]. Expect all payouts to be completed by the end of the month.

🎉 @Kseymur managed to figure out most of the needed improvements using Claude and Brave API.

Add first Market Health article

add article-name.md file with the article text to content/market-health
add images to content/market-health/img/article-name
name images properly and insert them into .md file
create PR
assign @sofiasedlova to review

Approval standards

Let's put together a list of things to check in every PR before approval. Valuable notes.
I started here. Please add your notes and let's make it public for submitters.

Fix menus

Hide MVT
Make "Crypto Attacks" collapsible
Delete "External"

Add a Market Manipulations Widget

To participate, submit a pull request that adds a simple and lightweight widget for displaying Wash Trading and Front Running Metrics on the Market Health Metrics Documentation page located here, and add this issue's assignee as a reviewer. Feel free to use any JS/TS libraries and frameworks, but keep in mind that your submissions will be evaluated based on the combination of cost/efficiency, security, and load speed. Below is a rough sketch of the widget position on the page; there is no need to make it look identical:

This time, we offer multiple bounties for the top 3 submissions:

🥇 $1000
🥈 $500
🥉 $250

Anyone can participate and submit their pull request before the deadline. We will also reach out to all challenge participants with successful submissions to talk about more permanent research and development opportunities within our projects. Spread the word - an additional $250 💰 will be paid for referrals of any winning submissions!

See more details on the Challenge Program and check out the success stories of the challenge winners.

Looking forward to your submissions!

Update fact-check

add docs to mvt api schemas

restructure categories

What is the difference between Security incidents and Attacks in the main outline? I would put all attack categories under one parent and make it clickable and leading to the full list

Originally posted by @marina-chibizova in #3 (comment)

Proposed menu structure

Crypto Attacks
- List
- Categories (drop-down)
  - 51% Attacks
  - Custodian Attacks

Notes

Menu titles are generated automatically from markdown titles
There are 2 default page types (docs and posts). Mixing them in the menu is flaky.
Right now default categories are used for attack types, and tags are used for other things like custodian names. It might be worth creating separate fields for a cleaner taxonomy. This could help with adding a "recent posts" block to category articles later on (example).

Front-Running Detection

Front-running is trading stock or any other financial asset by a broker who has inside knowledge of a future transaction that is about to affect its price substantially. A broker may also front-run based on insider knowledge that their firm is about to issue a buy or sell recommendation to clients that will almost certainly affect the price of an asset.

Investopedia

In the crypto industry, front-running is a common occurrence, with even the biggest trading venues involved. What often starts as an in-house bot to provide additional liquidity to a centralized exchange sometimes turns into an illegal tool for earning money by trading ahead of the exchange's own customers.

Please comment below with ideas on detecting front-running trades based on a stream of executed trades. More specifically, describe an algorithm you would use to create a metric capable of automatically flagging suspicious trades. Feel free to support your ideas by adding references, datasets, graphs, and code. Comments with the best ideas will be hidden to allow others to participate. Multiple submission awards are available.

Many of the previous challenge participants focused on investigative approaches that involved manual analysis of specific cases of front-running. This, however, is an engineering challenge, requiring successful submissions to include an algorithm, supported by references, datasets, graphs, and/or code. We have more than enough ingenious ideas on how it can be done, but no solid plans for how to implement it using real-time streaming data.
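
To make the expected shape of a streaming submission concrete, here is a deliberately naive sketch, not a recommended detector. It flags small trades that land shortly before a large trade on the same side. The `Trade` fields, the window length, and the size threshold are all assumptions, and the heuristic will produce false positives in any active market.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    ts: float     # Unix timestamp in seconds
    side: str     # "buy" or "sell" (taker side; assumed to be available)
    size: float   # traded amount in base currency
    price: float


class FrontRunFlagger:
    """Keeps a short look-back window over a stream of executed trades."""

    def __init__(self, window_s: float = 2.0, large_size: float = 100.0):
        self.window_s = window_s      # look-back window, placeholder value
        self.large_size = large_size  # "large trade" cutoff, placeholder value
        self.recent = deque()

    def on_trade(self, trade: Trade) -> list:
        """Feed one trade; return earlier small same-side trades it implicates."""
        # Drop trades that fell out of the look-back window.
        while self.recent and trade.ts - self.recent[0].ts > self.window_s:
            self.recent.popleft()
        flagged = []
        if trade.size >= self.large_size:
            flagged = [t for t in self.recent
                       if t.side == trade.side and t.size < self.large_size]
        self.recent.append(trade)
        return flagged
```

Each call does work proportional to the number of trades in the window, so memory and latency stay bounded on a live feed. A real metric would also need to account for price impact and for how often such patterns occur by chance.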

Fix collapsible

The JS collapsible doesn't work on some devices. Let's switch to the basic Hugo-native collapsible and see where it falls short before adding anything else.

Twitter Home Timeline Scraper

Your task

The goal of this challenge is to create a scraper for the Twitter home "For you" timeline. We are interested in capturing the tweets, liked tweets, and retweets that Twitter's recommendation engine considers relevant to us.

Overview

The scraper will be deployed as a scheduled function that periodically checks timeline updates and sends new tweets to a MongoDB collection. Avoid Twitter's official API, but feel free to leverage their front-end GraphQL API or other reliable ways of obtaining the data. Please keep in mind that the submissions will be evaluated based on the combination of cost/efficiency, security, and stability.
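
Because the function runs on a schedule, consecutive runs will see overlapping timelines, so only unseen tweets should be forwarded. A minimal sketch of that step, assuming each record carries its ID at `message.id` as in the data schema of this issue. The in-memory `seen_ids` set is a stand-in; in a deployment, a unique index on the message ID in the MongoDB collection could play the same role.

```python
def select_new(timeline: list, seen_ids: set) -> list:
    """Return records not seen before and remember their IDs.

    timeline: records shaped like the data schema (message.id required).
    seen_ids: mutable set of already-stored message IDs (stand-in for
              a lookup against the database).
    """
    fresh = []
    for record in timeline:
        tweet_id = record["message"]["id"]
        if tweet_id not in seen_ids:
            seen_ids.add(tweet_id)
            fresh.append(record)
    return fresh
```

The function also drops duplicates within a single timeline fetch, since each ID is added to the set as soon as it is first seen.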

Data Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "timestamp": {
      "type": "number",
      "description": "Unix epoch message timestamp"
    },
    "message": {
      "type": "object",
      "properties": {
        "text": { "type": "string" },
        "id": { "type": "string", "description": "Message ID" },
        "language": {
          "type": "string",
          "description": "Language of the Message"
        },
        "enrichment": {
          "$ref": "./enrichment_fields.json"
        },
        "media": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "type": { "type": "string", "enum": ["image", "audio", "video"] },
              "id": { "type": "string" },
              "title": { "type": "string" },
              "file_name": { "type": "string" },
              "url": {
                "type": "string",
                "description": "URL of the media file"
              }
            }
          }
        }
      },
      "required": ["text", "id"]
    },
    "user": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "id": { "type": "string" }
      },
      "description": "Sender of the Message"
    },
    "geo_coords": {
      "$ref": "https://geojson.org/schema/GeoJSON.json"
    },
    "source": {
      "type": "object",
      "properties": {
        "platform": {
          "type": "string",
          "description": "Telegram/Twitter/Reddit"
        },
        "channel": {
          "type": "object",
          "properties": {
            "name": {
              "type": "string",
              "description": "Name of the message source, such as a Telegram channel"
            },
            "id": { "type": "string" }
          }
        },
        "referenced_post": {
          "type": "object",
          "properties": {
            "url": {
              "type": "string",
              "description": "URL link of the referenced post"
            },
            "id": { "type": "string" }
          }
        },
        "source_specific_fields": {
          "type": "object",
          "additionalProperties": true,
          "description": "Extra source-specific fields"
        }
      },
      "required": ["platform"]
    }
  },
  "required": ["timestamp", "message", "user", "source"]
}
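
For reference, a record that satisfies the `required` lists in the schema above can be quite small. The sketch below checks only those required fields and is not a full JSON Schema validator; `missing_required` is a hypothetical helper and all values in `sample` are made up.

```python
# Required keys, copied from the "required" arrays in the schema.
REQUIRED_TOP = ("timestamp", "message", "user", "source")
REQUIRED_NESTED = {"message": ("text", "id"), "source": ("platform",)}


def missing_required(record: dict) -> list:
    """List required fields that are absent, using dotted paths for nested ones."""
    missing = [key for key in REQUIRED_TOP if key not in record]
    for parent, keys in REQUIRED_NESTED.items():
        child = record.get(parent)
        if isinstance(child, dict):
            missing += [f"{parent}.{key}" for key in keys if key not in child]
    return missing


# Made-up sample record with the minimum required structure.
sample = {
    "timestamp": 1700000000,
    "message": {"text": "example tweet text", "id": "123", "language": "en"},
    "user": {"name": "example_user", "id": "42"},
    "source": {"platform": "Twitter"},
}
```

Running `missing_required(sample)` returns an empty list; type checks and the `$ref` targets are left to a proper schema validator.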

Submission

To participate, submit a pull request with your solution, along with a .json with sample data, and request a review from @jalmonter. Expanding the pull request description with your methodology can help us better understand your reasoning and evaluate your submission faster.

🎉 @Hjklvfr submitted the winning pull request and claimed the bounty!

Improve taxonomies pagination & layout

We have more and more contributions to the content, and now there is pagination for custodians, with only 10 per page.

Could you change this number to 50? Is there any way to show the taxonomy other than buttons?

Milestone: Automated Market Health LLM Reporter V1

Version V1

Improvements to reporter service

Deliverables:

Add data retrieval via API
Improve output formatting
Add data saving in JSON
Add token counting
Add the ability to create a new pull request with a generated report
Add ability to create visualizations

Fix font

After the last PR, the website is acting strangely both in mobile view and on the regular web:

Could you take a look please?

Implement collapsible section capability

creating an issue for this

Fix typos

          Misspelled: for **threat** modeling systems

Originally posted by @Lavriz in #3 (comment)

Fix meta info pulling

Some headers aren't being pulled - https://github.com/1712n/dn-institute/actions/runs/5205723830

Milestone: Automated Market Health LLM Reporter

Version: MVP

Deliverable:

Tool that creates automated weekly reports with metrics spikes and their interpretation.

Functionality:

Gets financial data from DNI API (currently the API provides historical data for a sliding window of one week)
Checks it for anomalies according to docs (isolated deviations from the thresholds can be random noise, but clustered excessive deviations can indicate systematic manipulation)
Generates report according to examples, showcasing the found anomalies and interpreting them according to the theoretical laws
Interface: a GitHub issue. The user inputs the pair, exchange, and time period, and the output is a comment with a report
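
The distinction between isolated and clustered deviations can be sketched as a search for runs of consecutive threshold breaches. This is an illustration only: `threshold` and `min_run` are placeholder parameters, and the actual thresholds should come from the metrics documentation.

```python
def clustered_breaches(values, threshold, min_run=3):
    """Find runs of at least min_run consecutive values above threshold.

    Returns a list of (start, end) index pairs, both inclusive.
    Shorter runs are treated as likely noise and ignored.
    """
    runs = []
    start = None
    for i, value in enumerate(values):
        if value > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the series.
    if start is not None and len(values) - start >= min_run:
        runs.append((start, len(values) - 1))
    return runs
```

For example, with a threshold of 5 and `min_run=3`, the series `[1, 9, 1, 9, 9, 9, 1, 9, 9]` yields one cluster at indices 3 to 5; the single spike at index 1 and the two-point run at the end are skipped.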

Data:

Data source (wash trading metrics): https://rapidapi.com/DNInstitute/api/crypto-market-health/
Article examples https://dn.institute/market-health/posts/
Theory https://dn.institute/market-health/docs/market-health-metrics/

Custom payout rates for the attack wiki

@jhirschkorn wants to incentivize creation of wiki pages that have certain tags/categories. There are a few approaches to this, here are just some implementation ideas:

Get the multiplier from the payout comment, i.e. payout\X2 or something.
Get the rates from a config file that lists all categories/tags that need to have their own per character rates along with the default rate.

Whichever is easier to implement. Feel free to use your own approach if it works for @jhirschkorn.
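
Both ideas fit in a few lines, and they can be combined. A rough sketch, with every number and name hypothetical: the comment format follows the `payout\X2` example above, the rates are placeholders, and picking the highest matching rate when an article has several boosted tags is just one possible policy.

```python
import re

# Matches the illustrative "payout\X2" comment format; decimals allowed.
MULTIPLIER_RE = re.compile(r"payout\\X(\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_multiplier(comment: str) -> float:
    """Extract the multiplier from a payout comment; default to 1.0."""
    match = MULTIPLIER_RE.search(comment)
    return float(match.group(1)) if match else 1.0


def payout(char_count: int, tags: list, rates: dict,
           default_rate: float, multiplier: float = 1.0) -> float:
    """Per-character payout using the highest rate among matching tags."""
    rate = max((rates[tag] for tag in tags if tag in rates), default=default_rate)
    return round(char_count * rate * multiplier, 2)
```

With a config of `{"Bridge": 0.3}` and a default rate of `0.2`, a 1000-character article tagged `Bridge` would pay 300.0, and an untagged one with a `payout\X2` comment would pay 400.0.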

Fix diff splitting in fact-check process

Fact checking workflow doesn't work with PRs from forks

Follow up #24

Pull request
Job console

For a PR, the workflow runs in the context of the repository the branch originates from. This is a GitHub security policy; otherwise, anyone could submit a malicious PR to leak secrets. As a result, the workflow fails on PRs from forked repos because it runs in the wrong context, without GitHub read/write access and without the OpenAI key.

I need to look into alternatives other than denying PRs from forked repos or adopting a self-hosted runner.

Implement gallery view on posts

Requested by #203 (comment)

Contribute to Market Manipulation Wiki

We invite you to enrich our Market Health wiki by identifying, analyzing, and documenting instances and methods of market manipulation, leveraging data and insights gained from various sources, including our free API that provides basic metrics related to wash trading activities. Comprehensive documentation on each metric is available here.

To participate, submit a pull request that either adds a new page or enhances existing pages in the market-health directory and ask for a review by the issue assignees. Ensure that your submissions are thorough, data-backed, and adhere to the submission guidelines detailed below.

Bounty

$500 for any accepted pull request with a single article. We'll prorate your submission up or down according to the pull request stats (20c/character). That said, our advice is to create smaller pull requests to speed up the review process and avoid any potential branch conflicts stemming from other people working on the same things in parallel. We will award additional bounties for fixing factual errors, suggesting accepted structural changes, and adding more advanced content to other parts of the wiki.

Claiming Your Bounty

Email your bitcoin/lightning payment address to [email protected]. Expect all payouts to be completed by the end of the month.

Submission Ideas

New articles discovering instances of manipulation on cryptocurrency exchanges
Improve existing content by:

Fixing factual errors
Adding more crypto data specific metrics, such as market venue orderbook snapshots, or executed order feed
Adding graphs or underlying datasets
Improve metrics documentation by adding real world examples and visualizations
Add learning resources (books, videos, online courses)

Submission Guidelines

Ensure your submission complies with the following:

Create a separate directory containing your article as an .md file along with all supporting images and datasets. Use a clear naming convention for all your files
Headers to be utilized:

| Header name | Required | Description | Example |
| --- | --- | --- | --- |
| `date` | Yes | YYYY-MM-DD | 2023-10-02 |
| `entities` | Yes | Entities implicated in wash trading. | `Huobi`, `HT`, `TRX`, `DOGE` |
| `title` | Yes | Article Title | `Uncovering Wash Trading and Market Manipulation on Huobi` |

Concentrate on data and analysis, minimizing speculations and ambiguous language.
Support your ideas by adding references, datasets, graphs.
If possible, add more crypto data specific metrics, such as market venue orderbook snapshots, or executed order feed.
If required changes suggested by reviewers are not addressed within a week, the PR will be considered inactive and will be closed.

Background material

Update fact-check

Fix OpenAI incomplete answers

Integrate fact-checking action

Here is a solution that we need to integrate on this website. The solution is good, but I would appreciate your optimizations (e.g. he is using his personal token instead of a bot)

Update fact check

Fixes for images UX

Problem 1:

Sometimes a post contains multiple consecutive images, like here. Right now they are stacked one after another, taking up too much space and not looking good.
Images are not clickable and are therefore hard to read (too small).

Fix 1:
Ideally, we would have a carousel/gallery view when several images are stacked, so they look like an Instagram post and users can swipe through them horizontally.

requirements:

something lightweight, straightforward and hugo native is strongly preferable
ability to click on images and open (enlarge) them
if several images are stacked - have a carousel view, that shows one bigger picture to showcase the chart and allows to click/swipe through all of them
captions

Problem 2:
Images inside text are not rendered anywhere on GitHub (markdown, dev mode, or codespace). The only way to see them is after the page is added to the website.

Fix 2:
Not sure how to deal with it. As far as I know, the current way of presenting pictures in the text was implemented so that captions can be formatted (so they look like captions: centered, cursive, etc.). This feature should not be lost.

Topic dataset collection

The goal of this challenge is to collect a minimum of 200 tweets related to crypto custodians and at least one of the topics below.

| Topic | Sentiment | Notes |
| --- | --- | --- |
| hacker attacks | negative | DDoS, hacks, stolen funds, etc. - anything that relates to hacker attacks and security breaches at crypto custodians. See examples of the hacker attacks and attack-related tweets. |
| law enforcement | negative | Anything that relates to law enforcement regarding projects in crypto, or their employees/board (custodians, tokens, protocols, etc.): potential litigation, enforcement actions, court proceedings, etc. See examples of the law enforcement-related tweets. |
| uptime problems | negative | Anything that would affect crypto custodian availability: downtime of any sort, order matching engine issues, freezing website, API lags, planned and unplanned maintenance, service outages, etc. See examples of the uptime-related tweets. |
| withdrawal issues | negative | Anything that prevents/slows transfers of the money out/in: withdrawals/deposits aren't possible, the fees aren't matching, the balance isn't updated, frozen wallets, prolonged system downtime and verification process, etc. See examples of the withdrawal-related tweets. |
| fraud | negative | Anything that implies illegal activity happening at crypto custodians: exit scams, pump-and-dump schemes, front running, wash trading, etc. See examples of the fraud-related tweets. |

Submission

To participate, submit a pull request adding at least 200 tweets related to crypto custodians to at least one of the topic datasets and request a review from this issue assignee. Expanding the pull request description with your methodology can help us better understand your reasoning and evaluate your submission faster.

💡Hints

To avoid bias, do not Twitter-search for topic-related keywords such as "attack", "withdrawal", "police", "fraud", etc.
Find the dates of relevant events through Google News. This will help you narrow down your Twitter search later on.
Use topic modeling and sentiment analysis tools to filter out the most relevant tweets.

Fix pulling process for fact-check

move wiki reviewers to secrets

Fix point error in answer in fact checking

Some short prompts return just a point, and this can be fixed.

problem with payout-calc logic

          @danisztls, I think there is a problem with your payout-calc logic.

Originally posted by @evgenydmitriev in #76 (comment)

Spellchecker fixes

@JediFaust, here's the list of spellchecker fixes needed. Involve @danisztls and figure out LanguageTool paid account if needed.

Personal dictionary. We are running into problems with crypto named entities and technical terms. We can add our custom dictionary, but might need to contact LanguageTool people for that.
Turn off rules that generate false-positives. Examples: 1, 2.

Fix grammar check

Confident Predictions Selection

💸 Confident Predictions Selection is a bounty challenge 💸

Your task

Design a model capable of selecting the top 10% of predictions that will exhibit the smallest mean distance.
You may work with the text data, the raw predictions, or both. Additionally, you may perform any form of aggregation or transformation on the raw predictions as you see fit.

Validation Metrics

The primary metric for this challenge is the mean distance of the selected top 10% predictions. Your objective is to minimize this value.
As a secondary metric, we designed the Class Representation Index (CRI). In essence, CRI compares the class distribution before and after filtering, giving a higher weight to classes that were initially larger. The primary purpose of this metric is to detect cases where a class is significantly less represented after filtering compared to its original size.
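
The primary metric can be sketched as follows, assuming your model produces one confidence score per prediction and that the per-prediction distances are available at evaluation time. Both function names and the score convention (higher means more confident) are assumptions; the CRI is not reproduced here because its exact definition lives in the dedicated repository.

```python
def select_top_fraction(confidences, fraction=0.10):
    """Return indices of the most confident predictions (at least one)."""
    n = len(confidences)
    k = max(1, round(n * fraction))
    # Sort indices by confidence, highest first, and keep the first k.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    return order[:k]


def mean_distance(distances, selected):
    """Mean distance over the selected indices; lower is better."""
    return sum(distances[i] for i in selected) / len(selected)
```

A good selector is one where `mean_distance(distances, select_top_fraction(confidences))` is well below the mean distance over all predictions.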

Please see the dedicated repository for instructions.

Prevent unintentional tagging of users in comments from quality check script

See comment.
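
One possible approach, sketched under the assumption that the unwanted tags come from `@handle` strings in the bot's comment text: wrap each handle in backticks before posting, since GitHub should not turn mentions inside inline code into notifications. `neutralize_mentions` is a hypothetical helper, not the existing script's API.

```python
import re

# An "@" followed by a handle, unless preceded by a word character
# (e.g. an email address) or a backtick (already wrapped).
MENTION_RE = re.compile(r"(?<![\w`])@([A-Za-z0-9][A-Za-z0-9-]*)")


def neutralize_mentions(text: str) -> str:
    """Wrap @handles in backticks so they render as code, not mentions."""
    return MENTION_RE.sub(r"`@\1`", text)
```

Handles that already sit inside a longer code span would end up with nested backticks, so this is best applied to plain prose portions of the comment only.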

Fact-check context sensitive

Fix payout conditions

Wash Trading Detection

Wash trading is a process whereby a trader buys and sells a security for the express purpose of feeding misleading information to the market. In some situations, wash trades are executed by a trader and a broker who are colluding with each other, and other times wash trades are executed by investors acting as both the buyer and the seller of the security.

Investopedia

In the crypto industry, wash trading is a common occurrence, as many traders equate high trading volumes to healthy market liquidity. Trading venues and crypto projects are complacent and often collude with each other to generate fake trades that cost little to place and execute. You can find more technical details on wash trading techniques in this YouTube video.

Please comment below with ideas on detecting wash trading based on a stream of executed trades. More specifically, describe an algorithm you would use to create a metric capable of measuring wash trading in a given market pair volume. Feel free to support your ideas by adding references, datasets, graphs, and code. Comments with the best ideas will be hidden to allow others to participate. Multiple submission awards are available.

Many of the previous challenge participants focused on investigative approaches that involved manual analysis of specific cases of wash trading. This, however, is an engineering challenge, requiring successful submissions to include an algorithm, supported by references, datasets, graphs, and/or code. We have more than enough ingenious ideas on how it can be done, but no solid plans for how to implement it using real-time streaming data.
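
As a starting point for discussion, here is a deliberately naive streaming sketch, not a validated metric. It estimates the share of volume made up of buy/sell pairs with identical size that occur within a short window of each other. The trade tuple layout, the window, and the size tolerance are all assumptions, and identical sizes also occur legitimately, so the result is at best a rough signal.

```python
from collections import deque


def wash_volume_share(trades, window_s=1.0, size_tol=1e-9):
    """Share of total volume in matched opposite-side, equal-size trade pairs.

    trades: iterable of (ts, side, size) tuples sorted by ts, where side
            is "buy" or "sell". Each trade is matched at most once.
    """
    pending = deque()  # recent trades that have not been matched yet
    total = 0.0
    matched = 0.0
    for ts, side, size in trades:
        total += size
        # Forget unmatched trades older than the window.
        while pending and ts - pending[0][0] > window_s:
            pending.popleft()
        match_idx = next(
            (i for i, (_, p_side, p_size) in enumerate(pending)
             if p_side != side and abs(p_size - size) <= size_tol),
            None,
        )
        if match_idx is None:
            pending.append((ts, side, size))
        else:
            matched += pending[match_idx][2] + size
            del pending[match_idx]
    return matched / total if total else 0.0
```

Because the state is limited to the trades inside the window, the same loop can run over a live feed by updating `total` and `matched` incrementally per market pair.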
