watsonx-doc-to-csv-generator's Introduction

Document to CSV Generator

Document to CSV Generator is an application that transforms document content into structured CSV files. Users can upload multiple documents and define the headers for the CSV columns directly within the application. Each header can be accompanied by a detailed column description that guides the data extraction.
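As an illustration of how headers and their descriptions might be turned into instructions for a language model, here is a minimal sketch. The `ColumnSpec` class, the `build_extraction_prompt` function, and the prompt wording are all hypothetical. They are not taken from the application's source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColumnSpec:
    """One CSV column: its header and an optional description (hypothetical type)."""
    name: str
    description: str = ""


def build_extraction_prompt(columns: List[ColumnSpec], document_text: str) -> str:
    """Assemble a plain-text prompt that lists each column and its description."""
    if not columns:
        raise ValueError("at least one column is required")
    lines = ["Extract the following fields from the document:"]
    for col in columns:
        # Fall back to the bare header when no description was given.
        detail = f": {col.description}" if col.description else ""
        lines.append(f"- {col.name}{detail}")
    lines.append("")
    lines.append("Document:")
    lines.append(document_text)
    return "\n".join(lines)


columns = [
    ColumnSpec("invoice_number", "The unique identifier printed on the invoice"),
    ColumnSpec("total"),
]
prompt = build_extraction_prompt(columns, "Invoice INV-001, total due: 42.00")
```

A description is optional in this sketch, so a column with only a header still produces a valid prompt line.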

Technology Highlights

  • Streamlit Framework: Built on Streamlit to provide a smooth and interactive user experience, facilitating quick setup and real-time data processing.
  • Watsonx AI and Llama 3 by Meta: Integrates Watsonx AI with Llama 3, a large language model from Meta, for text analysis and data extraction.

Key Functionalities

  • Column Customization: Users can define column names and provide detailed descriptions for each, enhancing understanding and control over data extraction.
  • Multiple Document Upload: The app supports uploading several documents simultaneously, allowing for bulk data processing.
  • Intelligent Data Extraction: Using the integrated language model, the application extracts relevant data from the uploaded documents and aligns it under the designated headers in the CSV.
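The step of aligning extracted values under user-defined headers can be sketched with Python's standard `csv` module. This is a minimal sketch and not the application's actual code: the `rows_to_csv` function and the sample data are hypothetical, and it assumes each document yields one dictionary of extracted values.

```python
import csv
import io
from typing import Dict, List


def rows_to_csv(headers: List[str], rows: List[Dict[str, str]]) -> str:
    """Write one CSV row per document, placing values under the given headers."""
    buffer = io.StringIO()
    # restval fills in headers for which a document had no value;
    # extrasaction="ignore" drops keys that are not among the headers.
    writer = csv.DictWriter(
        buffer,
        fieldnames=headers,
        restval="",
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


headers = ["invoice_number", "total"]
extracted = [
    {"invoice_number": "INV-001", "total": "42.00"},
    {"invoice_number": "INV-002", "vendor": "Acme"},  # no total, one extra key
]
csv_text = rows_to_csv(headers, extracted)
```

Missing values become empty cells and unexpected keys are dropped, so every row has the same columns in the order the user defined.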

Document to CSV Generator streamlines data extraction and organization for data analysis and management tasks.

Installation

Prerequisites

  • Python 3.7 or higher
  • pip (Python package installer)
  • virtualenv (for creating isolated Python environments)
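Before continuing, the prerequisites can be checked from Python itself. The following is a small sketch. The `check_prerequisites` helper is hypothetical, and it only tests whether the `pip` and `virtualenv` modules are importable by the current interpreter.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the prerequisites


def check_prerequisites() -> dict:
    """Report whether the interpreter version and required modules are available."""
    return {
        "python": sys.version_info >= MIN_PYTHON,
        "pip": importlib.util.find_spec("pip") is not None,
        "virtualenv": importlib.util.find_spec("virtualenv") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```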

Steps

  1. Clone the repository:

    git clone https://github.com/yourusername/your-repo-name.git
    cd your-repo-name
  2. Install virtualenv if you don't have it:

    pip install virtualenv

Setting Up the Python Environment

  1. Create a virtual environment:

    virtualenv venv
  2. With the virtual environment activated (see Activating the Python Environment below), install the required dependencies:

    pip install -r requirements.txt

Activating the Python Environment

On Windows

```sh
.\venv\Scripts\activate
```

On macOS and Linux

```sh
source venv/bin/activate
```

Running the Streamlit App

  1. Make sure your virtual environment is activated:

    source venv/bin/activate # On macOS and Linux
    .\venv\Scripts\activate # On Windows
  2. Run the Streamlit app:

    streamlit run watsonx-app.py

Contributors

  • artreimus