Topic: tika
Something interesting about tika
tika,📧 Analysis of Cyber Phishing Emails: Fraudulent Emails and Social Engineering.
User: anthonyive
tika,👨🦰 Large Scale Active Social Engineering Defense (ASED): Multimedia and Social Engineering
User: anthonyive
tika,The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Organization: apache
Home Page: https://tika.apache.org/
tika,Convenience Docker images for Apache Tika Server
Organization: apache
Home Page: https://tika.apache.org/
tika,An Elasticsearch engine plugin for Moodle's Global Search
Organization: catalyst
Home Page: https://moodle.org/plugins/search_elastic
tika,The Distributed Release Audit Tool (DRAT) for code analysis and verification.
User: chrismattmann
Home Page: http://chrismattmann.github.io/drat/
tika,ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika, and Apache OODT to ingest tens of millions of files in place (images, but it could be extended to other file types) and to extract metadata and OCR information from those files using Tika and Tesseract OCR.
User: chrismattmann
tika,Code for Machine Learning with TensorFlow: 2nd Edition Published by Manning Publications
User: chrismattmann
Home Page: http://github.com/chrismattmann/MLwithTensorFlow2ed
tika,Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
User: chrismattmann
tika,A dataset downloaded from the deep and scientific web across three major Polar data centers for use in research.
User: chrismattmann
Home Page: http://polar.usc.edu/
tika,Distributed, fault tolerant batch processing for Natural Language Applications and Search, using remote partitioning
Organization: cogstack
Home Page: https://cogstack.atlassian.net/wiki/spaces/COGDOC/overview
tika,Elasticsearch File System Crawler (FS Crawler)
User: dadoonet
Home Page: https://fscrawler.readthedocs.io/
tika,Incremental crawling capabilities for Apache Tika. Crawl content from, for example, file systems, HTTP(S) sources (web crawling), IMAP(S) servers, or your own arbitrary data sources. LeechCrawler offers additional Tika parsers that provide these crawling capabilities.
Organization: dfki
Home Page: https://github.com/DFKI/leechcrawler
tika,Python bindings for Apache Tika
User: fedelemantuano
Home Page: http://tika.apache.org/
tika,TYPO3 Extension: solr_file_indexer
Organization: hmmh
tika,Tika-based link (URL) extractor for httpreserve
Organization: httpreserve
tika,Java web application that takes IPFS hashes and extracts (textual) content and metadata through Apache Tika.
Organization: ipfs-search
tika,Search Engine projects
User: keerthivasan13
tika,Use the Java Tika text extraction library on the .NET platform
User: kevm
Home Page: http://kevm.github.io/tikaondotnet/
tika,git diff settings for Microsoft Office files
User: lagenorhynque
tika,Processing system for the search engine service in Liquid Investigations.
Organization: liquidinvestigations
tika,Tika per page PDF extractor server returning content as JSON.
User: mkalus
tika,A Ruby wrapper for the Tika jar (tika-app.jar) that extracts text from files in many formats, such as PDF, XLS, and DOC
User: mrcsparker
Home Page: https://github.com/mrcsparker/ruby_tika_app
tika,Extract and Visualize location from any file
Organization: nasa-jpl-memex
tika,Clusters geo paths that follow very similar routes
Organization: nasa-jpl-memex
tika,Interactive Image similarity and Visual Search and Retrieval application
Organization: nasa-jpl-memex
tika,Viewers for statistics and dashboarding of Domain Search Engine data
Organization: nasa-jpl-memex
tika,Tesseract OCR wrapper for Apache Tika and/or Open Semantic ETL that caches OCR results, so that Tika-Server or Open Semantic ETL does not have to rerun slow and expensive OCR on the same images
User: opensemanticsearch
tika,Apache Tika Server as Debian GNU/Linux and Ubuntu Linux package
User: opensemanticsearch
Home Page: https://opensemanticsearch.org
tika,Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.
Organization: opensextant
tika,Distill information about amendments to the Oregon Revised Statutes.
Organization: public-law
tika,Quarkus Tika extension
Organization: quarkiverse
Home Page: https://tika.apache.org/
tika,R Interface to Apache Tika
Organization: ropensci
Home Page: https://docs.ropensci.org/rtika
tika,Apache Tika Server as a Background Service in Node.js
User: rse
Home Page: http://npmjs.com/tika-server
tika,spark hdfs tika
User: scotthaleen
tika,A small Pandora's box for prototyping your app with a ready-to-use backend. This is just my compilation of different solutions occasionally applied in hackathons and challenges
User: sergeyt
Home Page: https://tsvbits.com/pandora/
tika,📄🚀 Unleash a powerful Document Search Engine with Apache NiFi for lightning-fast, comprehensive text indexing and search.
User: sergio11
tika,pdf2html is a module that converts PDF files to HTML pages using Apache Tika. It can also generate thumbnail images for PDF files using Apache PDFBox.
User: shebinleo
Home Page: https://www.npmjs.com/package/pdf2html
tika,Extract text from a document by Apache Tika
Organization: shelfio
Home Page: https://www.npmjs.com/package/tika-text-extract
tika,An Examine indexer that uses Apache Tika.
Organization: thecogworks
tika,Apache NiFi Custom Processor Extracting Text From Files with Apache Tika
User: tspannhw
tika,A TYPO3 CMS extension that provides Apache Tika functionality
Organization: typo3-solr
tika,Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Organization: uscdatascience
Home Page: http://irds.usc.edu/sparkler/
tika,A suite of Machine Learning / Deep Learning Dockerfiles to allow Apache Tika to extract objects and to produce textual captions for images and video
Organization: uscdatascience
tika,Apache Tika bindings for PHP: extract text and metadata from documents, images and other formats
User: vaites