Git Product home page Git Product logo

glossary's Introduction

glossary

glossary is a JavaScript module that extracts keywords from text (aka "term extraction" or "auto tagging"). It takes a string of text and returns an array of terms that are relevant to the content:

var glossary = require("glossary");

var keywords = glossary.extract("Her cake shop is the best in the business");

console.log(keywords)  // ["cake", "shop", "cake shop", "business"]

glossary is standalone and uses part-of-speech analysis to extract the relevant terms.

install

For node with npm:

npm install glossary

API

blacklisting

Use blacklist to remove unwanted terms from any extraction:

var glossary = require("glossary")({
   blacklist: ["library", "script", "api", "function"]
});

var keywords = glossary.extract("JavaScript color conversion library");

console.log(keywords); // ["color", "conversion"]

minimum frequency

Use minFreq to limit the terms to only those that occur with a certain frequency:

var glossary = require("glossary")({ minFreq: 2 });

var keywords = glossary.extract("Kasey's pears are the best pears in Canada");

console.log(keywords); // ["pears"]

sub-terms

Use collapse to remove terms that are sub-terms of other terms:

var glossary = require("glossary")({ collapse: true });

var keywords = glossary.extract("The Middle East crisis is getting worse");

console.log(keywords); // ["Middle East crisis"]

verbose output

Use verbose to also get the count of each term:

var glossary = require("./glossary")({ verbose: true });

var keywords = glossary.extract("The pears from the farm are good");

console.log(keywords); // [ { word: 'pears', count: 1 }, { word: 'farm', count: 1 } ]

minSize output

Use minSize to limit the terms to a minimum size

var glossary = require("./glossary")({ minSize: 5 });

var keywords = glossary.extract("I like to eat pears in the garden");

console.log(keywords); // [ 'garden' ]

propers

glossary Uses jspos for POS tagging. It's inspired by the python module topia.termextract.

glossary's People

Contributors

harthur avatar

Watchers

James Cloos avatar Joshua Benson avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.