Topic: apache-tika
Something interesting about apache-tika
apache-tika,Golang client for Apache Tika
User: alexferl
apache-tika,AWS Lambda code to index S3 buckets into Elasticsearch
User: aswath86
apache-tika,A spatial search website that lets users search documents from the FBI Vault website. It extracts the most frequently occurring location in each document, loads the geo-tagged data into Apache Solr to index the documents, and visualizes search results using the Google Maps API.
User: beccaliu
Home Page: http://youtu.be/s8Y-M0owH2g
apache-tika,PHP application to test loading PDF files, using docker-compose and Apache Tika.
User: bjverde
apache-tika,Information retrieval system for indexing and searching files stored on disk, with support for the Romanian language
User: bogdankandra
apache-tika,Lucee wrapper for Apache Tika
User: cfsimplicity
apache-tika,This API uses Annif as a local server and includes an NER component. It also includes Tesseract, uses Apache Tika for language detection, and has limited multilingual support.
Organization: dalai-project
apache-tika,Extracts the text content of Word (doc, docx), Excel, PDF, PPT, CSV, and TXT files, and can also extract the table of contents from Word and PDF files
User: deep2018530
apache-tika,
User: ergottli
apache-tika,Python bindings for Apache Tika
User: fedelemantuano
Home Page: http://tika.apache.org/
apache-tika,Text extraction from scanned PDF documents in Java
User: fraponyo94
apache-tika,This repository holds everything that is required to run the Apache Solr Engine and its functionality to crawl documents
Organization: gctools-outilsgc
Home Page: https://apache-solr-search.readthedocs.io/en/latest/
apache-tika,microservice web application for uploading and downloading audio files
User: glebshur
apache-tika,tokyo, a REST API that, given any type of document 📄, identifies the MIME type 🧐, suggests an extension 🦔, and also extracts text 💪.
User: greed2411
apache-tika,Visualize unstructured data using Watson NLU
Organization: ibm
Home Page: https://developer.ibm.com/patterns/visualize-unstructured-text/
apache-tika,A file-uploading web app built with security in mind
User: immontilla
Home Page: https://blog.immontilla.eu/a-security-in-mind-file-uploading-web-app/
apache-tika,Secure file uploader web application
User: immontilla
apache-tika,Directory tree metadata parser using Apache Tika
Organization: kairohm
apache-tika,🚴♂️⛷Data Lake: performance tuning for text extraction from a huge number of files.
User: kimtth
apache-tika,a tool set for indexing and searching through documents
User: kubachrabanski
apache-tika,Using Apache Lucene, Tika, and Solr
User: ldkhanh
apache-tika,
User: lu-16
apache-tika,Document management system implemented with microservices
Organization: maxsquared-webcraft
apache-tika,Tika detector for MKV and WebM
User: omarassadi
apache-tika,Apache Tika adapter in Go
Organization: orijtech
apache-tika,Indexes and searches most types of documents, replaces currencies and normalizes date and time formats, supports semi-structured documents by giving a higher weight to the more important tags, and expands queries with synonyms of their terms using the WordNet library
User: raeedfarhan9
apache-tika,Apache Tika integration built in Scala for indexing OneDrive files into Elasticsearch.
User: ryanquey
apache-tika,Apache Tika - a toolkit that detects and extracts metadata
User: saidsef
apache-tika,A vanilla PHP wrapper for Apache Tika and Google Cloud Translate to help them work in harmony.
Organization: selesti
apache-tika,AWS Lambda layer containing latest version of Apache Tika
Organization: shelfio
apache-tika,Extract text from a document by Apache Tika
Organization: shelfio
Home Page: https://www.npmjs.com/package/tika-text-extract
apache-tika,PDF parsing and extraction utility using Apache Tika
User: sidmishraw
apache-tika,[SLOW][WIP] Broodmother is a high-performance, distributed search engine using Apache Tika, Apache Solr, Akka, Neo4j, and Spring.
User: sidmishraw
apache-tika,Analysis of PixStory social media data combined with Snapchat, COVID-19, and YouTube data. This project uses the Apache Tika Clustering software to cluster certain social media posts together.
User: todd-gavin
apache-tika,Extraction analysis of the PixStory social media dataset using language detection, language translation, the Tika GeoTopic parser, Tika image object recognition and image caption generation, and PyTorch Detoxify.
User: todd-gavin
apache-tika,ApacheDeepLearning101
User: tspannhw
apache-tika,Apache NiFi + Apache Tika + OptimaizeLangDetector
User: tspannhw
apache-tika,All my processors (NARs) in one place
User: tspannhw
apache-tika,Open Source Computer Vision with TensorFlow, MiniFi, Apache NiFi, OpenCV, Apache Tika and Python. For processing images from IoT devices like Raspberry Pis, NVIDIA Jetson TX1, NanoPi Duos and more, which are equipped with attached cameras or external USB webcams, we use Python to interface via OpenCV and PiCamera. From there we run image processing at the edge on these IoT devices using OpenCV and TensorFlow to determine attributes and image analytics. Apache MiniFi coordinates running these Python scripts and decides when and what to send from that analysis and the image to a remote Apache NiFi server for additional processing. The Apache NiFi cluster routes the images to one processing path and the JSON-encoded metadata to another flow. The JSON data (with its schema referenced from a central Schema Registry) is routed using Record Processing and SQL; this data is enriched and augmented before conversion to Avro to be sent via Apache Kafka to SAM. Streaming Analytics Manager then does deeper processing on this stream and others, including weather and Twitter, to determine what should be done with this data. References: https://community.hortonworks.com/articles/103863/using-an-asus-tinkerboard-with-tensorflow-and-pyth.html https://community.hortonworks.com/articles/118132/minifi-capturing-converting-tensorflow-inception-t.html https://github.com/tspannhw/rpi-noir-screen https://community.hortonworks.com/articles/77988/ingest-remote-camera-images-from-raspberry-pi-via.html https://community.hortonworks.com/articles/107379/minifi-for-image-capture-and-ingestion-from-raspbe.html https://community.hortonworks.com/articles/58265/analyzing-images-in-hdf-20-using-tensorflow.html
User: tspannhw
apache-tika,A place to release saved machine learning models for tika-dl
Organization: uscdatascience
apache-tika,A suite of Machine Learning / Deep Learning Dockerfiles to allow Apache Tika to extract objects and to produce textual captions for images and video
Organization: uscdatascience
apache-tika,A simple information retrieval system, a PDF Search Engine for UN agencies and NGOs.
User: yashajoshi