

Unique Words Counter by Binary Search Tree

Counts unique words in the given txt files using a Binary Search Tree

Content


The program counts the following quantities in the given txt files:

  • Lines
  • Symbols
  • Words
  • Unique Words

The program reads all txt files in the given directory. Symbols are all characters, including punctuation; spaces are not counted. Unique words are not case-sensitive.
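The counting rules above can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: the function names are hypothetical, and it assumes that all whitespace (not only spaces) is excluded from the symbol count, that words are separated by whitespace, that punctuation around a word is ignored when comparing unique words, and that files are UTF-8 encoded.

```python
import string
from pathlib import Path


def normalized_words(text):
    """Lowercased words with surrounding punctuation stripped (assumed rule)."""
    stripped = (w.strip(string.punctuation).lower() for w in text.split())
    return {w for w in stripped if w}


def count_texts(texts):
    """Accumulate totals over an iterable of strings, one per file."""
    lines = symbols = words = 0
    unique = set()
    for text in texts:
        lines += len(text.splitlines())
        # Assumption: every non-whitespace character counts as a symbol.
        symbols += sum(1 for ch in text if not ch.isspace())
        words += len(text.split())
        unique |= normalized_words(text)
    return {
        "lines": lines,
        "symbols": symbols,
        "words": words,
        "unique words": len(unique),
    }


def count_directory(directory):
    """Apply count_texts to every *.txt file directly inside `directory`."""
    paths = sorted(Path(directory).glob("*.txt"))
    return count_texts(p.read_text(encoding="utf-8") for p in paths)
```

Under these assumptions, "Hello" and "hello," count as two words but one unique word.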

Unique words are collected in a Binary Search Tree, which also stores the frequency of each unique word.
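One way such a tree could look is sketched below. The class and method names (`Node`, `WordTree`, `add`, `items`) are illustrative assumptions rather than the names used in the program. Each node holds a word and its frequency; a repeated word increments the counter instead of creating a new node.

```python
class Node:
    """A tree node holding one unique word and how often it was seen."""

    def __init__(self, word):
        self.word = word
        self.count = 1
        self.left = None
        self.right = None


class WordTree:
    """Unbalanced binary search tree keyed by lowercased word."""

    def __init__(self):
        self.root = None
        self.size = 0  # number of unique words

    def add(self, word):
        word = word.lower()  # unique words are not case-sensitive
        if self.root is None:
            self.root = Node(word)
            self.size = 1
            return
        node = self.root
        while True:
            if word == node.word:
                node.count += 1  # already present: bump the frequency
                return
            branch = "left" if word < node.word else "right"
            child = getattr(node, branch)
            if child is None:
                setattr(node, branch, Node(word))
                self.size += 1
                return
            node = child

    def items(self):
        """Yield (word, count) pairs in alphabetical order (in-order walk)."""
        stack, node = [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.word, node.count
            node = node.right
```

Because this sketch does no rebalancing, inserting words that are already sorted degrades the tree to a linked list; that is acceptable for illustration but worth keeping in mind for large inputs.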


  • Run the file in a terminal window or Anaconda Prompt and use the argument -p to specify the path to the directory from which the program reads txt files. By default, the program reads the directory tests located next to the main Python file. Run the following: python words_count_Python.py -p /home/my_name/Downloads/txt_files

  • Another way is to import the program as a utility, for instance in a Jupyter notebook.

  • And of course, any Python IDE is a standard option.
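The -p argument and its default could be handled along the following lines. This is a sketch using the standard argparse module; the function name `build_parser`, the long option `--path`, and the help texts are assumptions made for the example, not taken from the program.

```python
import argparse
from pathlib import Path


def build_parser(script_dir):
    """Build a parser whose -p default is the `tests` folder in script_dir."""
    parser = argparse.ArgumentParser(
        description="Count lines, symbols, words and unique words in txt files."
    )
    parser.add_argument(
        "-p",
        "--path",  # hypothetical long form; the README only mentions -p
        type=Path,
        default=Path(script_dir) / "tests",
        help="directory with txt files to read",
    )
    return parser
```

In the main script, `script_dir` would typically be `Path(__file__).resolve().parent`, so that the default points to the tests directory next to the Python file regardless of the current working directory.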


As a result, the program prints the following (example):
Total lines = 4
Total symbols = 111
Total words = 22
Total unique words = 19

The final results are written to the file results.txt next to the main Python file, together with the unique words and their frequencies.
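Writing such a file might look like the sketch below. The exact layout of results.txt is not specified here, so the format is an assumption: the totals reuse the "Total ... = ..." lines printed by the program, followed by one hypothetical "word: count" line per unique word, most frequent first. The function names are illustrative.

```python
from pathlib import Path


def format_results(totals, frequencies):
    """Return the report text: totals first, then words by frequency."""
    lines = [f"Total {name} = {value}" for name, value in totals.items()]
    lines.append("")  # blank line between totals and word list
    # Sort by descending count, then alphabetically for ties.
    ordered = sorted(frequencies.items(), key=lambda kv: (-kv[1], kv[0]))
    lines.extend(f"{word}: {count}" for word, count in ordered)
    return "\n".join(lines) + "\n"


def write_results(path, totals, frequencies):
    """Write the report to `path` (for example results.txt) as UTF-8."""
    Path(path).write_text(format_results(totals, frequencies), encoding="utf-8")
```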

Plotting results

Bar charts of the results are also plotted.
Note: if there are many unique words, they will not all be legible on the plot.
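One possible workaround for crowded plots, which the program itself does not necessarily implement, is to plot only the most frequent words. The sketch below assumes matplotlib is installed; the function names and the default limit of 20 are arbitrary choices for the example.

```python
from collections import Counter


def top_words(frequencies, limit=20):
    """Return the `limit` most frequent (word, count) pairs."""
    return Counter(frequencies).most_common(limit)


def plot_top_words(frequencies, limit=20):
    """Draw a bar chart of the most frequent words and return the figure."""
    import matplotlib.pyplot as plt  # third-party, imported only when plotting

    pairs = top_words(frequencies, limit)
    words = [word for word, _ in pairs]
    counts = [count for _, count in pairs]

    fig, ax = plt.subplots()
    ax.bar(words, counts)
    ax.set_xlabel("Unique word")
    ax.set_ylabel("Frequency")
    ax.tick_params(axis="x", labelrotation=90)  # keep long labels readable
    fig.tight_layout()
    return fig
```

Calling `plot_top_words(freqs).savefig("top_words.png")` would then save the chart; the file name is only an example.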



MIT License

Copyright (c) 2019 Valentyn N Sichkar

github.com/sichkar-valentyn

Reference to:

Valentyn N Sichkar. Unique Words Counter by Binary Search Tree // GitHub platform. DOI: 10.5281/zenodo.3558757
