BookBot: Text Analysis Tool

Overview

BookBot is a Python tool that analyzes the content of a book or any other text file. It reports the total number of words and the frequency of each alphabetic character (case-insensitive), giving a quick statistical overview of a document.

Features

  • Counts the total number of words in the text.
  • Calculates the frequency of each alphabetic character, ignoring case.
  • Generates a clear and sorted report of character frequencies.
  • Can be used as a standalone script or imported as a module for integration into other Python projects.
  • Includes error handling for common issues like missing or empty files.

Installation

  1. Clone the Repository:

    git clone https://github.com/DarkSideDani/bookbot
    cd bookbot
  2. Ensure Python is Installed: Make sure you have Python 3.x installed on your machine.

Usage

As a Standalone Script

  1. Save your text file (e.g., frankenstein.txt) in the books directory, or specify a path to your file.

  2. Run the script from the command line:

    python book_analyzer.py books/frankenstein.txt

    Replace books/frankenstein.txt with the path to your text file.

  3. Command-line Options: You can get help on how to use the script by running:

    python book_analyzer.py -h

    This will display the help message:

    usage: book_analyzer.py [-h] book_path
    
    Analyze a book and print a statistical report.
    
    positional arguments:
      book_path  Path to the book file
    
    optional arguments:
      -h, --help  show this help message and exit
    

As a Module

You can import the functions into your own Python scripts for further use:

from book_analyzer import get_num_words, get_num_chars

text = "Your text here"
print(get_num_words(text))  # total number of words
print(get_num_chars(text))  # dictionary mapping each letter to its frequency

Functions

get_book_text(path)

  • Description: Reads the content of a book from the specified file path.
  • Parameters: path (str) - Path to the text file.
  • Returns: String containing the text from the file.
  • Error Handling: Raises FileNotFoundError if the file does not exist or ValueError if the file is empty.
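
The behavior described above could be sketched as follows. This is a minimal illustration, not the project's actual source: the UTF-8 encoding and the choice to treat a whitespace-only file as empty are both assumptions.

```python
def get_book_text(path):
    """Return the contents of the text file at `path`."""
    # open() itself raises FileNotFoundError when the file does not exist.
    with open(path, encoding="utf-8") as f:  # assumption: files are UTF-8
        text = f.read()
    # Assumption: a file containing only whitespace counts as empty.
    if not text.strip():
        raise ValueError(f"File is empty: {path}")
    return text
```

Callers can catch both exceptions in one place, for example with `except (FileNotFoundError, ValueError) as err:`, and print a short message instead of a traceback.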

get_num_words(text)

  • Description: Counts the number of words in the given text.
  • Parameters: text (str) - Text to analyze.
  • Returns: Integer representing the number of words.
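
One plausible way to count words is to split on whitespace. The sketch below assumes that a "word" is any run of non-whitespace characters, which may differ from the definition the project uses.

```python
def get_num_words(text):
    # str.split() with no arguments splits on any run of whitespace
    # and ignores leading and trailing whitespace.
    return len(text.split())


print(get_num_words("It was a dreary night of November"))  # 7
```

Under this definition, punctuation attached to a word (as in "night,") does not create an extra word.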

get_num_chars(text)

  • Description: Counts the occurrences of each alphabetic character in the given text (case-insensitive).
  • Parameters: text (str) - Text to analyze.
  • Returns: Dictionary with characters as keys and their frequencies as values.
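
A case-insensitive letter count can be sketched with `collections.Counter`. This is an illustration under the assumption that "alphabetic" means whatever `str.isalpha()` accepts, which includes accented and other non-ASCII letters.

```python
from collections import Counter


def get_num_chars(text):
    # Lowercase first so 'A' and 'a' are counted together,
    # then keep only alphabetic characters.
    return dict(Counter(ch for ch in text.lower() if ch.isalpha()))


print(get_num_chars("Hello, World!"))
# {'h': 1, 'e': 1, 'l': 3, 'o': 2, 'w': 1, 'r': 1, 'd': 1}
```

Digits, punctuation, and whitespace never appear as keys, so the result can be passed directly to a report formatter.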

generate_report(book_path)

  • Description: Generates and prints a statistical report of the text file located at book_path.
  • Parameters: book_path (str) - Path to the text file.
  • Error Handling: Ensures the file is read correctly and outputs a report of the analysis.

main()

  • Description: Handles command-line interface for the script. Parses arguments and generates the report.
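
The help message shown in the Usage section is consistent with a small `argparse` setup like the one below. The `build_parser` helper is a hypothetical name introduced for illustration; the project's real `main()` may be organized differently.

```python
import argparse


def build_parser():
    # Mirrors the usage text: one required positional argument, book_path.
    parser = argparse.ArgumentParser(
        prog="book_analyzer.py",
        description="Analyze a book and print a statistical report.",
    )
    parser.add_argument("book_path", help="Path to the book file")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["books/frankenstein.txt"])
print(args.book_path)  # books/frankenstein.txt
```

On Python 3.10 and later, `argparse` labels the `-h` section `options:` rather than `optional arguments:`, so the help text may differ slightly from the one shown above.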

Example Output

When running the script with a sample text file, you might see output like the following (the counts shown are illustrative placeholders, not real results):

--- Begin report of books/frankenstein.txt ---
1234 words found in the document

The 'a' character was found: 5423 times
The 'b' character was found: 2341 times
The 'c' character was found: 1234 times
...
--- End Report ---
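
A report in this shape could be assembled from a word count and a character-frequency dictionary as sketched below. `format_report` is a hypothetical helper, and the sort order (most frequent first, ties broken alphabetically) is an assumption, since the sample output is consistent with either alphabetical or by-frequency ordering.

```python
def format_report(book_path, num_words, char_counts):
    """Build the report text from a word count and a {char: count} dict."""
    lines = [
        f"--- Begin report of {book_path} ---",
        f"{num_words} words found in the document",
        "",
    ]
    # Assumption: most frequent characters first, ties broken alphabetically.
    ordered = sorted(char_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for ch, count in ordered:
        lines.append(f"The '{ch}' character was found: {count} times")
    lines.append("--- End Report ---")
    return "\n".join(lines)


print(format_report("books/example.txt", 2, {"b": 1, "a": 3, "c": 1}))
```

For a purely alphabetical listing, replace the sort key with `sorted(char_counts.items())`.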

Acknowledgments

This project was inspired by a Boot.dev course project on performing simple text analysis of books and other written content.
