datawakedepot's People

Contributors

bmcdougald, bwhiteman, jreeme, justinlueders, lukewendling

datawakedepot's Issues

Keep info panel open

Once the information panel is opened, it should stay open.

Steps:

  • Visit a site
  • Toggle the info panel
  • Click on a link
    The panel closes and the user must reopen it.

Recursive Delete is broken for Trails and Domains

Deleting a domain should also delete its domain entity types and domain items.

Deleting a trail should also delete trail urls and trail url extractions.

This is probably most easily accomplished by modifying the existing recursive delete in the service. However, the best design moving forward is to modify the model.js files for dw-domain and dw-trail to have an on-delete action. There is an existing LoopBack ticket regarding this issue (loopbackio/loopback-datasource-juggler#88 (comment)).
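
The cascade described above can be sketched roughly as follows. This is a minimal illustration, not the project's service code: the `CASCADES` table, the `MemoryModel` class, and the foreign key names (`domainId`, `trailId`, `trailUrlId`) are all hypothetical stand-ins, and a real implementation would call the persisted models instead of an in-memory array.

```javascript
// Hypothetical child relations; names follow the issue text, not the real schema.
const CASCADES = {
  domain: [
    { model: 'domainEntityType', foreignKey: 'domainId' },
    { model: 'domainItem', foreignKey: 'domainId' },
  ],
  trail: [{ model: 'trailUrl', foreignKey: 'trailId' }],
  trailUrl: [{ model: 'urlExtraction', foreignKey: 'trailUrlId' }],
};

// Minimal in-memory stand-in for a persisted model (an assumption, not LoopBack).
class MemoryModel {
  constructor(rows) {
    this.rows = rows.slice();
  }
  async find(where) {
    return this.rows.filter(r =>
      Object.keys(where).every(k => r[k] === where[k]));
  }
  async destroyById(id) {
    this.rows = this.rows.filter(r => r.id !== id);
  }
}

// Delete every descendant depth-first, then the parent record itself.
async function cascadeDelete(models, modelName, id) {
  for (const { model, foreignKey } of CASCADES[modelName] || []) {
    const children = await models[model].find({ [foreignKey]: id });
    for (const child of children) {
      await cascadeDelete(models, model, child.id);
    }
  }
  await models[modelName].destroyById(id);
}
```

Keeping the relations in one table means a trail delete automatically reaches the extractions under each trail url, which is the part the issue reports as broken.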

Refactor Domain Item and Domain Entity Types

Domain Items and Domain Entity Types should be chosen by the user. To do so, we need to modify the Domain Item and Entity Type models.

This includes the method to add an item and type from an Extracted Entity in the Depot. The first step is to modify the models. The second step is to modify the Depot extracted entity page so that the user can click an Extracted Item to add it or its Type to a Domain. The third step will be a separate issue to enable this functionality from the Extracted Entity Panel Widget (#43).

TrailUrl Search terms should be persisted

Forensic currently calculates the searchTerms for a url each time it runs. Instead we need to determine and persist the search terms when the url is added during trailing.

I've added a searchTerms property to the dw-trail-url.json model but nothing currently populates it.
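
One way to populate the property at trailing time is sketched below. It is an assumption-laden illustration: the list of query parameter names in `SEARCH_PARAMS` and the helper names `extractSearchTerms` and `withSearchTerms` are hypothetical, and Forensic's existing calculation may use different rules.

```javascript
// Query parameter names that commonly carry a search string (an assumed list).
const SEARCH_PARAMS = ['q', 'query', 'search', 'keyword', 'p'];

// Pull search terms out of a URL's query string; returns [] if none are found.
function extractSearchTerms(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch (err) {
    return []; // not an absolute URL, so there is nothing to extract
  }
  const terms = [];
  for (const name of SEARCH_PARAMS) {
    for (const value of parsed.searchParams.getAll(name)) {
      const term = value.trim().toLowerCase();
      if (term && !terms.includes(term)) terms.push(term);
    }
  }
  return terms;
}

// Hypothetical helper: compute the terms once, when the trail url is first saved.
function withSearchTerms(trailUrl) {
  return Object.assign({}, trailUrl, {
    searchTerms: extractSearchTerms(trailUrl.url),
  });
}
```

Calling `withSearchTerms` on the record before it is saved would let Forensic read the stored `searchTerms` instead of recomputing them on every run.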

Memex Domain Export.

We need to be able to export a domain for use by the crawling teams. The export should be the aggregate of all of the trails within a domain. (At some point we may want to be able to choose specific trails.)

The format should be similar to:

{
  "urls": ["http://la.backpage.com/1234", "http://la.backpage.com/1234", "http://www.wikipedia.com"], //all visited urls
  "topLevelDomains": ["http://la.backpage.com"], //common top level domains
  "searchTerms": ["escorts","los angeles","massage"], //all search terms
  "domainEntities": ["cherry","mimi","pasadena"], //entities added by a user
  "domainEntityTypes": ["person","place","bitcoin address"], //entity types added by a user
  "commonEntities": ["massage","parlor","los angeles","pasadena"] //top 20 most extracted entities
}
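
A rough sketch of how such an export could be assembled from a domain and its trails follows. The input shape (`domain.trails[].trailUrls[]` with `url`, `searchTerms`, and `extractions`, plus `domain.domainItems` and `domain.domainEntityTypes`) is assumed for illustration and does not reflect the actual models. "Top level domain" is interpreted here as the URL origin, matching the sample above.

```javascript
// Return the `limit` most frequent values, most frequent first.
function topValues(values, limit) {
  const counts = new Map();
  for (const v of values) counts.set(v, (counts.get(v) || 0) + 1);
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([value]) => value);
}

// Remove duplicates while keeping first-seen order.
function unique(values) {
  return Array.from(new Set(values));
}

// Aggregate every trail in the domain into the export format (assumed input shape).
function buildDomainExport(domain) {
  const trailUrls = domain.trails.flatMap(t => t.trailUrls);
  const extracted = trailUrls.flatMap(u => u.extractions || []);
  return {
    urls: trailUrls.map(u => u.url), // every visit, duplicates kept as in the sample
    topLevelDomains: unique(trailUrls.map(u => new URL(u.url).origin)),
    searchTerms: unique(trailUrls.flatMap(u => u.searchTerms || [])),
    domainEntities: unique(domain.domainItems),
    domainEntityTypes: unique(domain.domainEntityTypes),
    commonEntities: topValues(extracted, 20),
  };
}
```

Restricting the export to specific trails later would only require filtering `domain.trails` before calling `buildDomainExport`.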

Word cloud

Build a wordcloud in the forensic view showing the extracted entities for a trail.

Create Suggested Link Widget

Create a widget for incorporating search results from other services. This will most likely include a model change to relate suggested URLs to a given TrailUrl.

Forensic Entity Grid Urls

The entity grid in Forensic displays only a single URL for entities that were extracted from multiple URLs.
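
The expected grouping could look something like the sketch below, which collects every source URL per entity before the rows reach the grid. The row shape (`type`, `value`, `url`) and the function name `groupEntityUrls` are hypothetical.

```javascript
// Group extraction rows by (type, value) and collect every URL for each entity.
function groupEntityUrls(extractions) {
  const byEntity = new Map();
  for (const { type, value, url } of extractions) {
    const key = JSON.stringify([type, value]);
    if (!byEntity.has(key)) {
      byEntity.set(key, { type, value, urls: new Set() });
    }
    byEntity.get(key).urls.add(url); // the Set drops repeated URLs
  }
  return Array.from(byEntity.values(), e => ({
    type: e.type,
    value: e.value,
    urls: Array.from(e.urls),
  }));
}
```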

Refresh loses the user

When clicking refresh in the browser, the current user is lost throughout the app, requiring the user to sign back in.

Adding a team to a User gives an error on save

If you try to add a Team to a User, you get an error on Save and the change is not shown in the table. The error says something about the Amino User instance not being valid (422).

You can go to the Teams page and add Users to the Team from there as a workaround.

TrailUrls should be filtered by Trail

The Trail URLs page should have a drop-down with the user's trails. Once a trail is selected, the list should populate with the URLs for that trail.

User dashboard

A normal user should only have access to Domains, Trails, and Forensic.

URLExtractions list should be filtered by domain, trail, and url

The URLExtractions list view needs to be filtered by domain, trail, and URL. Only the current user's domains and trails should be available.

The current behavior causes OOM errors and can take a long time to load while not providing any value to the user.

As a workaround, we could remove this page and use only the entities grid in Forensic.
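
A bounded query is one way to avoid loading every extraction at once. The sketch below builds a filter object using the `where`, `limit`, and `skip` keys that LoopBack filters accept. The property names (`dwDomainId`, `dwTrailId`, `dwTrailUrlId`), the page size cap, and the function name are assumptions for illustration only.

```javascript
const MAX_PAGE_SIZE = 100; // assumed upper bound on rows per request

// Build a filter that narrows by domain, trail, and url, and always pages results.
function buildExtractionFilter({ domainId, trailId, trailUrlId, page = 0, pageSize = 25 } = {}) {
  const where = {};
  if (domainId) where.dwDomainId = domainId;       // hypothetical property name
  if (trailId) where.dwTrailId = trailId;          // hypothetical property name
  if (trailUrlId) where.dwTrailUrlId = trailUrlId; // hypothetical property name
  const limit = Math.min(Math.max(1, pageSize), MAX_PAGE_SIZE);
  return { where, limit, skip: Math.max(0, page) * limit };
}
```

Because `limit` is always set, the list view never asks the server to serialize the whole collection in a single response.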

Add extractor source to extractions

For analysis, especially when using multiple extractors that do the same task (e.g. NER), we need to add an extractor source field to the URL extraction and have every extractor add its name.

Configurable context menu in plugin

The plugin should have a context menu that is configurable to add integration with other memex tools such as search in dig, tellfinder, or imagespace. It should also be context specific for selected text or image.

Add new values to the User

We need to update the user account whenever a new value is added or a change is made for that user involving Teams, Domains, or Trails. Currently the user has to log out and back in again to see the change reflected.

Page importance ("Page Rank" panel widget)

The user should be able to mark a page as either relevant or irrelevant. This could be as simple as a check/x button in the plugin or a div. The default should be unspecified.

Forensic page Entities & Visited Links tabs have table issues

Two issues are immediately apparent on both of these tabs. First, if one of the columns (e.g. URL) is long, for example having multiple URLs listed, it makes the column very large. This pushes subsequent columns to the right and can easily make the table wider than your screen. This would be only a minor annoyance if there were horizontal scroll capability, but there isn't, so anything pushed beyond your window view cannot be seen. You cannot drag or resize the column widths to work around this either.

The second issue with the tables is that, even though they appear to have sorting capability at each column heading, it does not appear to work.

Depot Scrolling can break

I've noticed that if you have "scrolling" set for the depot and view the URLs visited tab when it has enough content to scroll off the browser window, you may scroll to the bottom and then be unable to get back to the top.

Not sure what triggers this because it doesn't happen all the time. I thought it was triggered by starting a trail on another tab and then coming back to the depot where you were scrolled to the bottom, but that doesn't cause it reliably either. When the error occurs, a white band for the title of the page (i.e. Forensic) extends through the sidebar as well. When in that state, scrolling is broken. I suspect this is a bug with the admin template.

Deleting an extractor from depot list hangs depot page

This is an odd bug, but easy to repro. Branch being tested is 32-extractor-source-from-develop (but that may not be important to this issue specifically).

If you have multiple extractors in the extractors management page of the depot and click the icon to delete one of them, you get a confirmation message, to which you select yes. The list goes blank as if it deleted all the other extractors too, and the loading bar at the top of the page starts scrolling. It gets almost to the end and just hangs there. The only way to resolve this is to log out and then back in again. When you do, you'll see that the one you deleted is gone and the others were not deleted inadvertently.

User can view all users

A user is able to view all users on the system. This is a security concern as users should not know what other users and teams are on the system.

Selecting Trail Extractions crashes node server

On branch 32-extractor-source-from-develop, after trailing and getting extractions, if I click on the Trail Extractions link in the sidebar, the node server crashes with the following error every time:

/home/ubuntu/src/DatawakeDepot/node_modules/loopback/node_modules/express/lib/response.js:242
var body = JSON.stringify(val, replacer, spaces);
^
RangeError: Invalid string length
at join (native)
at Object.stringify (native)
at ServerResponse.json (/home/ubuntu/src/DatawakeDepot/node_modules/loopback/node_modules/express/lib/response.js:242:19)
at Object.sendBodyJson as sendBody
at HttpContext.done (/home/ubuntu/src/DatawakeDepot/node_modules/loopback/node_modules/strong-remoting/lib/http-context.js:632:22)
at /home/ubuntu/src/DatawakeDepot/node_modules/loopback/node_modules/strong-remoting/lib/rest-adapter.js:459:11
at /home/ubuntu/src/DatawakeDepot/node_modules/loopback/node_modules/async/lib/async.js:251:17
at /home/ubuntu/src/DatawakeDepot/node_modules/loopback/node_modules/async/lib/async.js:154:25
at /home/ubuntu/src/DatawakeDepot/node_modules/loopback/node_modules/async/lib/async.js:248:21
at /home/ubuntu/src/DatawakeDepot/node_modules/loopback/node_modules/async/lib/async.js:612:34
