TwitterAccountClassification

Classifies Twitter accounts as individual (personal) or superindividual (corporate and character).

The library consists of two main modules: Algorithm and DataExtraction.

Algorithm

This module classifies Twitter accounts by type. A usage example can be found in the Demo module.

To configure the module, place a twitter_classifier.properties file in the project folder. The following parameters can be specified in this file:

debug: boolean; if set to true, the classifier runs in debug mode and may print additional information to the console (default: false)

frequency_analyzer: fully qualified class name of the frequency analyzer to use (default: com.samborskiy.entity.analyzers.frequency.FrequencyDictionary)

grammar_analyzer: fully qualified class name of the grammar analyzer to use (default: com.samborskiy.entity.analyzers.frequency.JLanguageToolGrammarCheckerRu)

morphological_analyzer: fully qualified class name of the morphological analyzer to use (default: com.samborskiy.entity.analyzers.frequency.SimpleMorphologicalAnalyzer)

sentence_analyzer: fully qualified class name of the sentence analyzer to use (default: com.samborskiy.entity.analyzers.frequency.SimpleTweetParser)

attributes: path to the file listing the attribute names used for classification (default: res/random_forest_attributes)
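
As an illustration, a twitter_classifier.properties file that spells out every parameter with its documented default might look like the sketch below. It assumes the standard Java key=value properties syntax; the only value changed from the defaults is debug, which is switched on here as an example.

```properties
# twitter_classifier.properties (sketch, standard key=value syntax assumed)

# Print additional information to the console (documented default: false)
debug=true

# Analyzer implementations, shown with their documented defaults
frequency_analyzer=com.samborskiy.entity.analyzers.frequency.FrequencyDictionary
grammar_analyzer=com.samborskiy.entity.analyzers.frequency.JLanguageToolGrammarCheckerRu
morphological_analyzer=com.samborskiy.entity.analyzers.frequency.SimpleMorphologicalAnalyzer
sentence_analyzer=com.samborskiy.entity.analyzers.frequency.SimpleTweetParser

# File listing the attribute names used for classification
attributes=res/random_forest_attributes
```

Any parameter left out of the file should fall back to the default listed above.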

DataExtraction

A support module for collecting samples. To use it, place a twitter4j.properties file in the project folder.

You also need to create a configuration file of the following form:
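
For reference, a twitter4j.properties file usually carries the OAuth credentials of a Twitter application. The sketch below uses the key names from the Twitter4J configuration documentation; the values are placeholders, and which credentials this module actually requires is an assumption.

```properties
# twitter4j.properties (sketch with placeholder credentials)
debug=false
oauth.consumerKey=YOUR_CONSUMER_KEY
oauth.consumerSecret=YOUR_CONSUMER_SECRET
oauth.accessToken=YOUR_ACCESS_TOKEN
oauth.accessTokenSecret=YOUR_ACCESS_TOKEN_SECRET
```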

{
  "lang": String, language of sample (example: "ru"),
  "databasePath": String, path to database file (example: "res/ru/test.db"),
  "types": array of Type, account types to sample
}

Type:

{
  "id": int, class id used for classification (example: 0),
  "name": String, human-readable name of the type (example: "personal"),
  "tweetPerUser": int, number of tweets to download per user (example: 500),
  "data": Data, screen names or user ids of the accounts to download
}

Data:

{
  "screenNames": String, path to file with list of screen names to download (example: "res/ru/screen_names_0")
  or
  "userIds": String, path to file with list of user ids to download (example: "res/ru/user_ids_0")
}
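
Putting the three structures together, a complete configuration file might look like the sketch below. The first type reuses the example values given in the field descriptions; the second type (id 1, name "corporate", and the path res/ru/user_ids_1) is a hypothetical addition, included only to show the userIds variant of Data.

```json
{
  "lang": "ru",
  "databasePath": "res/ru/test.db",
  "types": [
    {
      "id": 0,
      "name": "personal",
      "tweetPerUser": 500,
      "data": { "screenNames": "res/ru/screen_names_0" }
    },
    {
      "id": 1,
      "name": "corporate",
      "tweetPerUser": 500,
      "data": { "userIds": "res/ru/user_ids_1" }
    }
  ]
}
```

Each Data object here contains either screenNames or userIds, matching the "or" in the description above.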
