
bogrep's People

Contributors

quambene

Forkers

hbcbh1999 cocowan

bogrep's Issues

Spacing errors in cached files

If the text contains a link or formatting such as italics, the spaces between the formatted text and the adjacent words are omitted in the cached file.

In the example below, in "popularCookie Clicker", "Cookie Clicker" is a link, but in the cached text there is no space between it and the preceding word "popular".

In "apaperclip maximizer", "paperclip maximizer" is italicized, and in the cached text there is no space between it and the preceding word "a".

In "Thissoundslike", the italicized "sounds" is run together with the words before and after it.

Also, the paragraphs are run together instead of having a newline between them.

The example URL is active so you can see the original document.

$ bogrep paperclips
Match in bookmark: https://www.vice.com/en/article/xwgnxq/this-game-about-paperclips-will-make-you-ponder-the-apocalypse Ever since the wildly popularCookie Clicker,idle clicker games have been about hockey stick curves, about exponential growth unleashed by multiplicative advances in productivity. InCookie Clicker,that was employed in service of an absurdist joke about cookies.Universal Paperclips,a new free game from designer Frank Lantz, instead takes this to its darkly literalistic conclusion.It's a clicker game where you play as apaperclip maximizer,an AI that, once tasked with making paperclips, proceeds to turn the entire universe into paperclips.Thissoundslike a premise arrived at specifically to spoof clicker games, but it harkens back to a thought experiment proposed by Nick Bostrom, an Oxford philosophy professor, in a2003 paper:The risks in developing superintelligence include the risk of failure to give it the supergoal of philanthropy. […] Another way for it to happen is that a well-meaning team of programmers make a big mistake in designing its goal system. This could resul

Improve line breaks for plaintext parsing

The plaintext of each fetched bookmark is cached as a single line. Instead, each paragraph in the HTML should translate to a new line in the parsed plaintext. This will lead to better grepability.
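A minimal sketch of the requested behavior, which also covers the spacing problem reported above. The `Node` type and both function names are hypothetical stand-ins for a parsed HTML tree, not bogrep's actual types. The idea is to concatenate text nodes verbatim, so the spaces around links and italics survive, and to emit one line per top-level block element.

```rust
/// A tiny stand-in for a parsed HTML tree (hypothetical; not bogrep's types).
enum Node {
    /// Raw text exactly as it appears in the document, whitespace included.
    Text(String),
    /// Inline element such as <a> or <em>.
    Inline(Vec<Node>),
    /// Block element such as <p>.
    Block(Vec<Node>),
}

/// Append the text of `node` to `out`, leaving text-node whitespace untouched.
fn push_inline_text(node: &Node, out: &mut String) {
    match node {
        Node::Text(text) => out.push_str(text),
        Node::Inline(children) | Node::Block(children) => {
            for child in children {
                push_inline_text(child, out);
            }
        }
    }
}

/// Render each top-level block as one line of plaintext.
fn to_plaintext(blocks: &[Node]) -> String {
    blocks
        .iter()
        .map(|block| {
            let mut line = String::new();
            push_inline_text(block, &mut line);
            // Collapse whitespace runs only after concatenation, so the
            // spaces around inline elements are not lost.
            line.split_whitespace().collect::<Vec<_>>().join(" ")
        })
        .filter(|line| !line.is_empty())
        .collect::<Vec<_>>()
        .join("\n")
}
```

Under these assumptions, two paragraphs "wildly popular [Cookie Clicker], idle games" and "This [sounds] like a premise" come out as two lines with their spaces intact. Nested block elements are flattened into their parent's line in this sketch.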

Feature Request: Flag to Only list URLs

As in grep or ripgrep, there should be a -l flag that suppresses the text of the match and lists only the URLs that have a match. This can be approximated with bogrep <SEARCH> | grep 'Match in bookmark'
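One way the requested output mode could look, as a sketch only. `BookmarkMatch`, `format_matches`, and the `urls_only` parameter are hypothetical names; the "Match in bookmark:" prefix is taken from the output shown in the reports.

```rust
/// Hypothetical search result: one bookmark URL plus its matching lines.
struct BookmarkMatch {
    url: String,
    lines: Vec<String>,
}

/// Format search results. With `urls_only` set (the proposed `-l` behavior),
/// emit one URL per matching bookmark and nothing else.
fn format_matches(matches: &[BookmarkMatch], urls_only: bool) -> String {
    let mut out = String::new();
    // Bookmarks without any matching line are skipped in both modes.
    for m in matches.iter().filter(|m| !m.lines.is_empty()) {
        if urls_only {
            out.push_str(&m.url);
            out.push('\n');
        } else {
            out.push_str(&format!("Match in bookmark: {}\n", m.url));
            for line in &m.lines {
                out.push_str(line);
                out.push('\n');
            }
        }
    }
    out
}
```

Printing bare URLs, one per line, keeps the output easy to pipe into other tools, which is the main use of -l in grep.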

Format not supported for bookmark file

When I try to import my bookmarks I get this error message:
Error: Format not supported for bookmark file

I tried using a source that points to my Firefox profile directory, my Chrome bookmarks, a JSON export of my Firefox bookmarks, and an export file of my Chrome bookmarks. I deleted settings.json between each bogrep config --source command.

I am running macOS 11.7.10 (Big Sur). The unit and integration tests all pass.

How can I troubleshoot this issue further?

Cache Mode ignored in settings

With a clean import (and the cache cleaned), running bogrep import & bogrep fetch fills the cache with .txt files. After cleaning again, bogrep import & bogrep fetch --mode markdown fills the cache with .md files. Both are the expected behavior.

After cleaning again, setting "cache_mode": "markdown" in settings.json, and running bogrep import & bogrep fetch, the cache is filled with .txt files. The expected behavior is for the cache to be filled with .md files.
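The expected precedence can be sketched as follows: an explicit --mode flag wins, otherwise the value from settings.json applies, otherwise the default. All names here are hypothetical, and "text" as the name of the default mode is an assumption; only "markdown" and the .txt/.md extensions come from the report.

```rust
/// Cache formats mentioned in the report: .txt and .md files.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CacheMode {
    Text,
    Markdown,
}

impl CacheMode {
    /// Parse a settings or CLI value. "markdown" appears in the report;
    /// "text" is an assumed name for the default mode.
    fn parse(value: &str) -> Option<CacheMode> {
        match value {
            "text" => Some(CacheMode::Text),
            "markdown" => Some(CacheMode::Markdown),
            _ => None,
        }
    }

    /// File extension used for cached files in this mode.
    fn extension(self) -> &'static str {
        match self {
            CacheMode::Text => "txt",
            CacheMode::Markdown => "md",
        }
    }
}

/// CLI flag first, then settings.json, then the default.
fn resolve_cache_mode(cli: Option<CacheMode>, settings: Option<CacheMode>) -> CacheMode {
    cli.or(settings).unwrap_or(CacheMode::Text)
}
```

With no flag given and "cache_mode": "markdown" in the settings, `resolve_cache_mode(None, CacheMode::parse("markdown"))` yields the Markdown mode, which is the behavior the report expects.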

GIF files fetched

It seems that GIF files are not excluded when fetching and are put into the cache.
I had http://sirocco.accuweather.com/sat_mosaic_640x480_public/rs/isarNE.gif in my bookmarks, and when I ran a bogrep search I got a "Match in bookmark" with that URL followed by binary junk. The query I made happened to match some of the binary data.
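A possible guard, sketched under the assumption that the fetched body is available as raw bytes before it is written to the cache. `looks_like_text` is a hypothetical helper, and the signature list is illustrative rather than complete. Checking the response's Content-Type header would be a complementary approach.

```rust
/// Return true if `body` looks like text worth caching.
/// Hypothetical helper; the signature list is illustrative, not exhaustive.
fn looks_like_text(body: &[u8]) -> bool {
    // Well-known file signatures: GIF (both versions), PDF, PNG.
    const BINARY_MAGIC: [&[u8]; 4] = [b"GIF87a", b"GIF89a", b"%PDF-", b"\x89PNG"];
    if BINARY_MAGIC.iter().any(|magic| body.starts_with(*magic)) {
        return false;
    }
    // Binary payloads are rarely valid UTF-8 and often contain NUL bytes.
    std::str::from_utf8(body).is_ok() && !body.contains(&0)
}
```

A body that fails this check would be skipped instead of being written to the cache, so a later search cannot match inside binary data.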

Fetch error: Too many open files

When trying to do a bogrep fetch I am getting the error below.

$ bogrep fetch
Error: Can't create file at /Users/USERNAME/Library/Application Support/bogrep/cache/78aa542f-52c1-4b5e-b475-15293854996a.txt: Too many open files (os error 24)
$ (140/8005)

I tried setting "max_concurrent_requests": 50, and still get this issue.

OS: Darwin 23.1.0 - macOS 14.1.1 (Sonoma)
version: bogrep 0.5.0
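If the error comes from too many cache files or sockets being open at the same time, one general remedy is to put a hard upper bound on the number of jobs in flight. The sketch below shows the idea with plain threads from the standard library, processing items in batches. It is not bogrep's implementation (the backtraces in other reports show bogrep running on tokio), and `run_bounded` is a hypothetical name.

```rust
use std::thread;

/// Run `job` over `items` with at most `limit` jobs in flight at once.
/// Each batch finishes completely before the next one starts, so resources
/// opened inside `job` (files, sockets) never exceed `limit` at a time.
fn run_bounded<T, R, F>(items: &[T], limit: usize, job: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let job = &job;
    let mut results = Vec::with_capacity(items.len());
    for batch in items.chunks(limit.max(1)) {
        let batch_results: Vec<R> = thread::scope(|scope| {
            // Spawn one scoped thread per item in this batch.
            let handles: Vec<_> = batch
                .iter()
                .map(|item| scope.spawn(move || job(item)))
                .collect();
            // Join in order, so results keep the order of `items`.
            handles
                .into_iter()
                .map(|handle| handle.join().expect("job panicked"))
                .collect()
        });
        results.extend(batch_results);
    }
    results
}
```

Batching is the simplest way to enforce the bound; a semaphore would keep the pipeline fuller, at the cost of more machinery. Either way, the limit has to stay below the per-process open-file limit of the operating system.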

thread 'main' panicked

A search panicked the main thread. Possibly a PDF was included in the cache?

RUST_BACKTRACE=1 bogrep  fugitive
thread 'main' panicked at 'byte index 7369 is not a char boundary; it is inside '\u{a0}' (bytes 7368..7370) of `#REF!Far From The Maddening Crowdby David Nicholls (Based on the novel by Thomas Hardy)   2015september 2013 final shooting     x kb   pdf formatimdbFargoby Joel & Ethan Coen   1996undated, unspecified draft   106 kb   html formatimdbFarg`[...]', src/cmd/search.rs:86:35
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::str::slice_error_fail_rt
   3: core::str::slice_error_fail
   4: bogrep::cmd::search::search_bookmarks
   5: bogrep::cmd::search::search
   6: bogrep::main::{{closure}}
   7: tokio::runtime::park::CachedParkThread::block_on
   8: tokio::runtime::context::runtime::enter_runtime
   9: tokio::runtime::runtime::Runtime::block_on
  10: bogrep::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Here is the full backtrace:

RUST_BACKTRACE=full bogrep fugitive
thread 'main' panicked at 'byte index 7369 is not a char boundary; it is inside '\u{a0}' (bytes 7368..7370) of `#REF!Far From The Maddening Crowdby David Nicholls (Based on the novel by Thomas Hardy)   2015september 2013 final shooting     x kb   pdf formatimdbFargoby Joel & Ethan Coen   1996undated, unspecified draft   106 kb   html formatimdbFarg`[...]', src/cmd/search.rs:86:35
stack backtrace:
   0:        0x10224f8c8 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he69c0e17cb41f255
   1:        0x10227a90b - core::fmt::write::h66293df4c7dd941a
   2:        0x102233206 - std::io::Write::write_fmt::h2f5a7ea5f48a0b56
   3:        0x10224f690 - std::sys_common::backtrace::print::h71fd332624ce1826
   4:        0x1022506f5 - std::panicking::default_hook::{{closure}}::ha2a0e70fb3678142
   5:        0x102250471 - std::panicking::default_hook::hb166cd42dec7ff92
   6:        0x102250cb8 - std::panicking::rust_panic_with_hook::h2b924837648ff0c0
   7:        0x102250bf3 - std::panicking::begin_panic_handler::{{closure}}::h04e24a68d30d9f5c
   8:        0x10224faf9 - std::sys_common::backtrace::__rust_end_short_backtrace::hd45b5152c8265971
   9:        0x10225096d - _rust_begin_unwind
  10:        0x1022a2003 - core::panicking::panic_fmt::h9302663e63786640
  11:        0x102275f22 - core::str::slice_error_fail_rt::h16947361fdce3fc4
  12:        0x1022a1ee9 - core::str::slice_error_fail::hc7dbb20721e2925b
  13:        0x101df3cee - bogrep::cmd::search::search_bookmarks::he87011c56d994e70
  14:        0x101df10bb - bogrep::cmd::search::search::h7c9b200747f586e2
  15:        0x101d9f7f0 - bogrep::main::{{closure}}::h9bf999662de91d64
  16:        0x101d9f08b - tokio::runtime::park::CachedParkThread::block_on::h475f5be8938b1cdf
  17:        0x101d832ef - tokio::runtime::context::runtime::enter_runtime::h0c39744fbd9d979e
  18:        0x101d8ea91 - tokio::runtime::runtime::Runtime::block_on::hcd0b6f794fbc0b1a
  19:        0x101d7ff7b - bogrep::main::h2c2d8feaaf0c8766
  20:        0x101d6da36 - std::sys_common::backtrace::__rust_begin_short_backtrace::h3044435b1b36dee6
  21:        0x101d6da51 - std::rt::lang_start::{{closure}}::h849919968bfedf8b
  22:        0x102250854 - std::panicking::try::hb5cb29dbfee1dcfc
  23:        0x10223bd6e - std::rt::lang_start_internal::h634e63ff6023f727
  24:        0x101d8005c - _main
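The panic message indicates that a string was sliced at a byte index (7369) that falls inside a multi-byte character, here the two-byte non-breaking space '\u{a0}'. Rust string slices must start and end on char boundaries. A minimal sketch of a boundary-safe truncation follows; `prev_char_boundary` and `safe_prefix` are hypothetical helpers, not the code at src/cmd/search.rs.

```rust
/// Largest index <= `index` that lies on a char boundary of `s`.
fn prev_char_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Prefix of `s` that is at most `max_bytes` long and never splits a char.
fn safe_prefix(s: &str, max_bytes: usize) -> &str {
    &s[..prev_char_boundary(s, max_bytes)]
}
```

For the string "a\u{a0}b", `safe_prefix(s, 2)` returns "a" instead of panicking, because byte 2 lies inside the non-breaking space. Iterating with `char_indices` is an alternative when the limit should count characters rather than bytes.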
