cryptocurrency-pump-dump

Identifying and Analyzing Cryptocurrency Manipulations in Social Media

This repository contains the datasets used for the analysis in https://arxiv.org/abs/1902.03110.

Datasets:

  • Twitter: The original dataset, collected from September 2017 to August 2018, includes all tweets mentioning at least one cashtag in coin_list.csv. It contains 30,760,831 tweets and 3,708,176 users in total. To preserve users' privacy, only the timeseries extracted from the dataset are provided here.
    • sentiment-ts: Average sentiment of the tweets mentioning a coin at a timestamp.
    • tweets-ts: Number of tweets mentioning a coin at a timestamp.
    • users-ts: Number of unique users mentioning a coin in their tweets at a timestamp.
  • Telegram:
    • channels: Cryptocurrency related Telegram channels.
    • classified: Telegram messages corresponding to pump attempts, as labeled by the classifier described in Section 4 of the paper.

Format:

  • Twitter timeseries: Each Twitter/x-ts directory consists of $coin_symbol.json files containing the x timeseries of the coin, where x is one of sentiment, tweets, or users. For example, sentiment-ts/$BTC.json contains the average sentiment of the tweets mentioning $BTC at each timestamp, in the following format:
    {"data": [[{"$date": 1524700800000}, 0.1], [{"$date": 1524700860000}, 0.2], ...]}

  • Telegram channels: Telegram/channels consists of channel_id.jl files, where each .jl file includes all the messages posted in a channel. Each line of a .jl file is a JSON object corresponding to a Telegram message object.

  • Pump Attempts: Each record in a Telegram/classified/coin-pump.csv corresponds to a pump attempt, and includes the following fields:

    • Message ID: The ID of the message provoking a pump attempt.
    • Channel ID: The ID of the channel on which the message was posted.
    • Date: The date on which the message was posted on the Telegram channel.
    • Time: The time at which the message was posted on the Telegram channel.
    • Coin: The coin that is the target of the pump operation.
  • Price Extracts: Telegram/classified/coin-pump.csv includes the prices extracted from Telegram messages corresponding to pump attempts. Each record has the following fields:

    • Message ID: The ID of the message provoking a pump attempt.
    • Channel ID: The ID of the channel on which the message was posted.
    • Date: The date on which the message was posted on the Telegram channel.
    • Time: The time at which the message was posted on the Telegram channel.
    • Coin: The coin that is the target of the pump operation.
    • Buy: The price to buy the coin mentioned in the message.
    • Sell: A list of target prices, mentioned in the message, that the scammers wish to achieve with their pump operation.
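
As a rough illustration of the two JSON-based formats above, the following Python sketch reads a Twitter timeseries file and a Telegram channel file. The function names load_timeseries and iter_messages are hypothetical helpers, not part of this repository. The sketch also assumes that each $date value is a Unix timestamp in milliseconds (UTC), which is consistent with the sample values shown above but is not stated explicitly.

```python
import json
from datetime import datetime, timezone


def load_timeseries(path):
    """Return a list of (UTC datetime, value) pairs from an x-ts/$COIN.json file."""
    with open(path, encoding="utf-8") as f:
        doc = json.load(f)
    # Each entry looks like [{"$date": <timestamp>}, <value>].
    # Assumption: the timestamp is in epoch milliseconds, UTC.
    return [
        (datetime.fromtimestamp(stamp["$date"] / 1000, tz=timezone.utc), value)
        for stamp, value in doc["data"]
    ]


def iter_messages(path):
    """Yield one dict per non-empty line of a channel_id.jl (JSON Lines) file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

For example, load_timeseries("Twitter/sentiment-ts/$BTC.json") would return the sentiment series for $BTC, assuming the repository has been cloned into the current directory. The keys available in each message dict depend on the Telegram message object and are not fixed by this sketch.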

Citation

More details on data collection and analysis can be found in the paper "Identifying and Analyzing Cryptocurrency Manipulations in Social Media". Please cite our paper if you use the dataset.

@article{mirtaheri2019identifying,
  title={Identifying and Analyzing Cryptocurrency Manipulations in Social Media},
  author={Mirtaheri, Mehrnoosh and Abu-El-Haija, Sami and Morstatter, Fred and Ver Steeg, Greg and Galstyan, Aram},
  journal={arXiv preprint arXiv:1902.03110},
  year={2019}
}
