cul-it / archival-storage-interface
Search and discovery interface for CUL archival storage
A long-term goal for the AS-IF project may be to support discovery of archival resources on archival systems other than Overflow.
For now, we are only looking at Overflow, and will not be attempting to integrate other systems.
One possible integration method would be to write indexers for the manifests from the other sources and to update which fields we search on and display.
See Use Case 1 at https://confluence.cornell.edu/pages/viewpage.action?pageId=340904277
See Use Case 1 of https://confluence.cornell.edu/pages/viewpage.action?pageId=340904277
The original use case document was uncertain about the need for this. Offhand, it does not fit well with the Overflow storage model, where the collection label is embedded in the storage system.
We shouldn't open this up to the world.
See Use Case 1 of https://confluence.cornell.edu/pages/viewpage.action?pageId=340904277
Configure Travis to automatically deploy every successful build to bb233-dev for testing and feedback.
"Colllection" should be changed to "Collection"
If bibid = 5991856 then catalog link is https://newcatalog.library.cornell.edu/catalog/5991856
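The mapping from bibid to catalog link could be sketched as below. This is a minimal illustration, not code from the project: the function name `catalog_link` and the constant `CATALOG_BASE` are hypothetical, and the check that a bibid is purely numeric is an assumption drawn only from the single example above.

```python
# Base URL taken from the example in the issue text.
CATALOG_BASE = "https://newcatalog.library.cornell.edu/catalog"


def catalog_link(bibid):
    """Return the catalog URL for a bibid given as an int or a digit string.

    Assumption: bibids are always numeric, as in the one example shown.
    """
    bibid_text = str(bibid).strip()
    if not bibid_text.isdigit():
        raise ValueError(f"bibid must be numeric, got {bibid!r}")
    return f"{CATALOG_BASE}/{bibid_text}"


print(catalog_link(5991856))
```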
Existing collections in archival storage have a manifest file in the "old" format. Once the "new" format is finalized, new collections will follow that format.
We need a utility that will convert old manifest files to the new format, suitable for ingestion into AS-IF.
For example, I might want to find a file by searching for the sha1 digest da39a3ee5e6b4b0d3255bfef95601890afd80709
or perhaps just the start da39a3ee5e6b4b
See Use Case 2 of https://confluence.cornell.edu/pages/viewpage.action?pageId=340904277
This is basically to support the workflow used by the current CULAR ingests, where Michelle/Dianne/Mira/Erin/etc look at the tool tips in the CULAR admin interface to determine if everything was ingested.
For collections in archival storage, we need to be able to import the manifest into AS-IF so that the archived material can be discovered.
This needs to be done for each collection in archival storage after the ingest process is done (or part of the ingest process).
Unfortunately, the manifest does not contain type information about archived objects, so the importation process will have to walk the collection storage itself and test the type of each object to get that information.
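A rough sketch of such a walk is shown below. It is an illustration under stated assumptions, not the project's importer: the signature table, the type labels, and the function names `detect_type` and `walk_collection` are all hypothetical, and a real implementation would likely rely on a fuller detection tool than three hard-coded magic numbers.

```python
import os

# Hypothetical signature table: leading bytes of a file -> type label.
SIGNATURES = [
    (b"PK\x03\x04", "ZIP archive"),
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
]


def detect_type(path):
    """Guess a file's type from its first bytes; 'unknown' if no match."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "unknown"


def walk_collection(root):
    """Map each file path (relative to root) to its detected type."""
    types = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            types[os.path.relpath(full_path, root)] = detect_type(full_path)
    return types
```

The resulting mapping could then be merged with the manifest entries before indexing, so that each record carries a type alongside its path and checksum.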
Some potential collections will have visibility constraints because of rights management. We need to handle this appropriately.
Instead of having to know the whole SHA1 checksum, we should be able to search for just the prefix, similar to how it is handled in git.
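The intended behaviour can be sketched in a few lines. This is only an illustration of prefix matching over an in-memory index: the function name `find_by_sha1_prefix`, the minimum prefix length, and the sample file names are assumptions. In the real application the equivalent would presumably be a prefix or wildcard query against the indexed checksum field.

```python
import hashlib

# Assumption: require a few characters so a prefix is reasonably selective,
# similar in spirit to git's abbreviated object names.
MIN_PREFIX_LENGTH = 4


def find_by_sha1_prefix(index, prefix):
    """Return the sorted names of files whose SHA1 digest starts with prefix.

    `index` maps file name -> lowercase hex SHA1 digest.
    """
    prefix = prefix.strip().lower()
    if len(prefix) < MIN_PREFIX_LENGTH:
        raise ValueError(f"prefix must be at least {MIN_PREFIX_LENGTH} characters")
    return sorted(name for name, digest in index.items() if digest.startswith(prefix))


# Hypothetical sample index; the empty file has the well-known digest da39a3ee...
index = {
    "empty.txt": hashlib.sha1(b"").hexdigest(),
    "hello.txt": hashlib.sha1(b"hello").hexdigest(),
}

print(find_by_sha1_prefix(index, "da39a3ee5e6b4b"))
```

A full 40-character digest is just the longest possible prefix, so the same lookup covers both the exact and the abbreviated search.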
Code coverage definitely isn't working as expected.
I want it to look at Rails code I have written, and right now I'm not sure it's running at all.
Looks like there is a Solr datatype issue with file sizes that results in negative numbers showing
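One plausible cause, not confirmed by the issue itself, is that file sizes are stored in a signed 32-bit integer field, so any file larger than 2,147,483,647 bytes wraps around to a negative value. The sketch below reproduces that wraparound; the function name `as_signed_int32` is hypothetical.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def as_signed_int32(value):
    """Return value as it would appear if truncated to a signed 32-bit int."""
    wrapped = value & 0xFFFFFFFF
    return wrapped - 0x100000000 if wrapped >= 0x80000000 else wrapped


print(as_signed_int32(INT32_MAX))      # still fits
print(as_signed_int32(INT32_MAX + 1))  # wraps to the most negative value
print(as_signed_int32(3_000_000_000))  # a roughly 3 GB file shows up as negative
```

If this is indeed the cause, storing the size in a 64-bit field type and reindexing should remove the negative values.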
How much occupied space, and how much free space?
Do we also want to know what collections are on it?
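If the storage is mounted as an ordinary filesystem (an assumption; the issue does not say how it is attached), occupied and free space could be read with the standard library, as in this minimal sketch. The function name `storage_summary` and the returned keys are hypothetical.

```python
import shutil


def storage_summary(path):
    """Return total, used, and free bytes for the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return {
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "free_bytes": usage.free,
    }


print(storage_summary("."))
```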
See use case 2 in https://confluence.cornell.edu/pages/viewpage.action?pageId=340904277
per discussion in meeting
Currently, search results show only a list of item IDs and no other information. Because the item IDs are MD5-style hash strings, they tell the user nothing useful.
The index page should not show the ID, but rather the filename, collection name, and perhaps the type as well.
Need to add sha1 checksum field to searchable fields.
This is in support of use case #1. That use case also wants to be able to search on SHA1 prefix.
See Use Case 1 of https://confluence.cornell.edu/pages/viewpage.action?pageId=340904277
Searching for "ZIP" or "ZIP archive" yields no results, despite two items of that type.
File type is not being tokenized in indexing, and needs to be.
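The difference between an untokenized and a tokenized field can be shown with a small sketch. It only models the matching behaviour and is not Solr's actual analysis chain; the stored type string is a hypothetical example of what a type detector might emit, and the function names are illustrative.

```python
import re


def tokenize(value):
    """Lowercase the value and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", value.lower())


def matches_exact(stored, query):
    """Untokenized field: the query must equal the whole stored value."""
    return stored == query


def matches_tokenized(stored, query):
    """Tokenized field: every query token must appear among the stored tokens."""
    stored_tokens = set(tokenize(stored))
    query_tokens = tokenize(query)
    return bool(query_tokens) and all(token in stored_tokens for token in query_tokens)


# Hypothetical stored value for the file type field.
stored_type = "Zip archive data, at least v2.0 to extract"

print(matches_exact(stored_type, "ZIP archive"))
print(matches_tokenized(stored_type, "ZIP archive"))
```

With exact matching, neither "ZIP" nor "ZIP archive" equals the stored string, which matches the behaviour reported above; once the value is tokenized and lowercased, both queries match.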
See Use Case 1 of https://confluence.cornell.edu/pages/viewpage.action?pageId=340904277