ble version : 0.4.0-devel2+420c933 Bash version :

Long processing times when pasting large amounts of text (akinomyoga/ble.sh)

Comments (18)

dylankb commented on May 20, 2024

For future reference if others run into the same issue - when pasting large amounts of text that I don't need to edit before pasting, I've just been using pbpaste | .... This gets around the pasting into an editor latency issue entirely, hehe (should have thought of this before).
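A minimal sketch of that workaround, wrapped in a small helper. The function name clip_to and the CLIP_READER override are hypothetical and not part of ble.sh; pbpaste is the macOS clipboard reader mentioned above, and on other systems a different reader would have to be substituted.

```shell
# clip_to CMD [ARGS...]: run CMD with the clipboard contents on stdin,
# so the text never passes through the line editor.
# CLIP_READER is a hypothetical override; it defaults to pbpaste (macOS).
clip_to() {
  "${CLIP_READER:-pbpaste}" | "$@"
}
```

For example, clip_to jq . would format copied JSON without pasting it into the prompt.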

from ble.sh.

dylankb commented on May 20, 2024

I'm on macOS and your suggestion seems to work perfectly :)

Possible values are e.g. "vim", "emacs -nw" or "nano".

When I read this I overlooked the "e.g." bit and thought that only these three editors were supported. I also tried bleopt editor=code --wait on a whim and got an error, but that was a shell quoting mistake on my part.
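The quoting mistake can be illustrated without ble.sh. The sketch below uses a hypothetical helper, count_args, to show how the shell splits the two spellings before any command sees them; it stands in for bleopt only in that respect.

```shell
# count_args: print how many arguments the shell passed in.
count_args() {
  echo "$#"
}

# Unquoted: the shell splits on the space, so "--wait" is a separate word.
count_args editor=code --wait      # prints 2

# Quoted: the whole editor command stays inside one name=value word.
count_args editor='code --wait'    # prints 1
```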


dylankb commented on May 20, 2024

The example paste size was ~500 lines, but anecdotally I've noticed if it's around 2000 it becomes very difficult to tell if the terminal is processing the input at all. There's a quick "Pasting" message and then nothing for quite a long time. I initially thought the session had crashed, but it does start to decode/process eventually.


akinomyoga commented on May 20, 2024

Thank you for the report!

Maybe this issue is unavoidable given the syntax highlighting tools, but I wanted to check first.

I recognize this performance issue. If you really want to edit the pasted contents, unfortunately it's unavoidable, since all the command line contents are processed by shell scripts, which are slow. I have been thinking about this issue for a long time, but have not yet found a perfect solution. The following are some ideas that might partially solve the issue.

Abort Processing

If you just mistakenly pasted a large amount of text and want to cancel the processing, you can press C-\ to abort the process and discard the inputs. You need to press C-\ as soon as possible once you notice the mistake. (The key used to abort the processing can be configured by bleopt decode_abort_char=INT, where INT is an integer in the range 0-31 or 127, corresponding to C-@..C-_ and DEL.)

Set upper limit (not implemented)

For this issue, I once thought about setting an upper limit on the number of lines or the number of characters in a command line. But I don't know the appropriate value for the upper limit, as it depends on the computational power of each host. This is a matter of degree. For example, even raw Bash (without ble.sh) should become slow if you paste 1 M lines, though I haven't tried it. Also, I didn't think there would be a case where users want to edit so many lines of commands. In fact, line editing in the original Bash doesn't work properly with more lines than the terminal height. So I haven't set any upper limit so far.

Store the pasted contents into a shell variable (not implemented)

Another solution might be to store the pasted text in a shell variable (e.g., ble_pasted_1) and insert a parameter expansion $ble_pasted_1 into the command line instead of the raw text when the length of the pasted contents exceeds some limit. But again, I don't know the appropriate value for such a limit, as it's a matter of degree. Also, I'm not sure if it matches the use case of people who want to paste large amounts of text. With this approach, one cannot edit the pasted text. So I haven't implemented this so far.
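The idea can be sketched roughly as follows. This is not ble.sh code and was not implemented at this point in the thread; the function insert_paste, the threshold paste_limit, and the result variable inserted are hypothetical names, and only the ble_pasted_N naming follows the comment above.

```shell
paste_limit=20   # hypothetical threshold, in characters
paste_count=0    # number of pastes stashed so far

# insert_paste TEXT: set $inserted to what would go on the command line.
# Short text is inserted as-is; long text is stored in ble_pasted_N and
# replaced by a quoted parameter expansion.
insert_paste() {
  if [ "${#1}" -gt "$paste_limit" ]; then
    paste_count=$((paste_count + 1))
    # eval assigns to a variable whose name is built at run time;
    # \$1 is expanded by eval, so the text itself is never re-parsed.
    eval "ble_pasted_$paste_count=\$1"
    inserted="\"\$ble_pasted_$paste_count\""
  else
    inserted=$1
  fi
}
```

The trade-off noted above still applies: the command line then shows only the expansion, so the pasted text cannot be edited in place.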

Launch real text editor (not implemented)

Or if you want to edit so many lines before executing commands, maybe we can launch a real text editor (written in a real programming language rather than shell scripts), just like the binding C-x C-e, when the length of the pasted text exceeds some limit.

I would like to ask for your comments. What is the use case for pasting such a large amount of texts? Is it just a mistake? Or do you want to pass long texts to commands without editing the pasted contents? Or do you want to edit the pasted contents? Do you have any other ideas to deal with this issue?


dylankb commented on May 20, 2024

Thanks for the response!

If you just mistakenly pasted the large amount of texts and want to cancel the processing, you can press C-\ to abort the process and discard the inputs. ... bleopt decode_abort_char=INT where INT is an integer in the range 0-31 or 127 (corresponding to C-@..C-_ and DEL).

I think I'd like to change the command to be C-c. Whenever I try and find this information through searches I usually come up short, though. Any references you can share for what keys map to what integers?

Or if you want to edit so many lines before executing commands, maybe we can launch a real text editor (written in a real programming language rather than shell scripts), just like the binding C-x C-e, when the length of the pasted text exceeds some limit.

This sounds reasonable. Although it's hard to set arbitrary limits because each host is different like you said, maybe if only a few lines are pasted in it's fine and the normal multi-line editor kicks in. Otherwise an editor boots up. I'm not positive if I understand the ble_pasted_1 solution, but I think this sounds better than the other two.

What is the use case for pasting such a large amount of texts?

Usually something has been printed to standard out that I don't really want to run again, and I need to reuse the output. For example, I might have performed a curl against an API that returns many lines of JSON that I did not capture in a shell variable, and I want to parse that output, improve its visual formatting, etc. by feeding it into a tool like jq. In that case I might copy the output and paste it into the terminal to be echoed and piped into jq for formatting or processing. Does that make sense?


akinomyoga commented on May 20, 2024

Thank you for your feedback! I have implemented the above ideas. Sorry, it took some time to design the detailed behavior, and to implement and test them. But still the performance issue is not completely solved.

Value of bleopt decode_abort_char

Any references you can share for what keys map to what integers?

I think you can see the two leftmost columns in the following table in Wikipedia:

C0 and C1 control codes / Basic ASCII control codes - Wikipedia

For example, if you want to know the code for C-c, you can look for the row with ^C (in the first column) to find that its decimal representation is 3 (in the second column). So the setting is

bleopt decode_abort_char=3
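The same value can be computed instead of looked up, since a control character's code is the code of its caret-notation character with bit 6 flipped (XOR 64). A small sketch, where ctrl_code is a hypothetical helper name and not a ble.sh function:

```shell
# ctrl_code CHAR: print the decimal code of the control character C-CHAR.
ctrl_code() {
  local code
  # printf '%d' with a leading quote yields the character's numeric code.
  code=$(printf '%d' "'$1")
  # Fold lowercase letters to uppercase so that C-c and C-C agree.
  if [ "$code" -ge 97 ] && [ "$code" -le 122 ]; then
    code=$((code - 32))
  fi
  # Flipping bit 6 maps @ A ... _ to 0..31 and ? to 127 (DEL).
  echo $((code ^ 64))
}

ctrl_code c     # prints 3
ctrl_code '\'   # prints 28
ctrl_code '?'   # prints 127
```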

By the way I found that this decode_abort_char does not work for some terminals, so I added a fix ad98416.

Upper limit & truncate / discard / editor

This sounds reasonable.

I implemented an upper limit setting and settings that control the behavior when the command line length exceeds the limit. The following options are newly implemented in 2f9a000:

Please refer to each link for details. The upper limit is turned off by default for now. For example, if you want to switch to vim when the number of characters of the command line is more than 2000, you can write the following setting:

bleopt editor=vim
bleopt line_limit_type=editor line_limit_length=2000

However the performance issue has not yet been solved even if a real text editor is used. ble.sh needs to pass the command line contents to the text editor, so ble.sh needs to construct the command line contents before launching the text editor, which takes an unreasonably long time. I think I have to figure out why it takes so much time and investigate whether there is a solution.
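To see why the contents must exist as a complete string first, here is a rough sketch of a generic hand-off to an external editor through a temporary file. It is not ble.sh's implementation; edit_text is a hypothetical helper, and it assumes the editor command is given in EDITOR (falling back to vi).

```shell
# edit_text TEXT: write TEXT to a temporary file, open it in the editor,
# and print the edited result once the editor exits successfully.
edit_text() {
  local tmp status
  tmp=$(mktemp) || return 1
  # The full text has to be available before the editor can start.
  printf '%s' "$1" > "$tmp"
  # EDITOR is left unquoted on purpose so that a value such as
  # "code --wait" is split into a command and its argument.
  ${EDITOR:-vi} "$tmp" && cat "$tmp"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```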


dylankb commented on May 20, 2024

Value of bleopt decode_abort_char

Thanks! Tested and this works great.

Upper limit & truncate / discard / editor
For example, if you want to switch to vim when the number of characters of the command line is more than 2000, you can write the following setting:

Tested this out and it works as well. The performance issue still does seem to be pretty prohibitive, so I'm interested to hear what you come up with. Feel free to close this issue or leave it open.


akinomyoga commented on May 20, 2024

Thank you for testing!

The performance issue still does seem to be pretty prohibitive,

I added another optimization 0d9d867.

I've noticed if it's around 2000 it becomes very difficult to tell if the terminal is processing the input at all.

For this one, I also added progress bars for the input reading phase and character insertion phase. Now there are four progress bars corresponding to the four phases: violet, blue, green and pink phases.

There's a quick "Pasting" message and then nothing for quite a long time.

Maybe the bottleneck phase depends on the environment. When I pasted 2000 lines (obtained by repeating JSON from your link four times) in my environment, each phase equally took about ten seconds with the latest ble.sh (see below).

Actually, it should depend on whether your terminal supports Bracketed Paste Mode (Mode ?2004) or not. If your terminal doesn't support Bracketed Paste Mode, the green phase will take much more time and there will be no pink phase. Which phase is your bottleneck?

But I noticed that if one intentionally pastes a large amount of text, actually one could launch the text editor by C-x C-e before pasting the text. So now I'm kind of thinking this issue is not so serious.


dylankb commented on May 20, 2024

Cool! I did notice faster paste performance after pulling in the latest update.

I tried pasting the ~500 lines in the link I sent you in iTerm2 and kitty. Apparently kitty is more performant than iTerm2, but I didn't notice much of a difference in this case. I confirmed that my iTerm paste settings do have bracketed paste selected. Here are the approximate stages of the paste job:

Stages 1 - 3 were ~5 seconds each
Stage 4 ("constructing text") took about 10 seconds.

I tested out 2k lines in iTerm:

Stages 1 - 3 were ~15 seconds each
Stage 4 ("constructing text") took more than 90 seconds. I stopped counting.

Maybe the bottleneck phase depends on the environment.

Yes, that's probably true. Here's some hardware specs for my machine in case it's helpful.

  Processor Name:	Intel Core i9
  Processor Speed:	2.9 GHz
  Number of Processors:	1
  Total Number of Cores:	6
  L2 Cache (per Core):	256 KB
  L3 Cache:	12 MB
  Memory:	16 GB

But I noticed that if one intentionally pastes a large amount of text, actually one could launch the text editor by C-x C-e before pasting the text. So now I'm kind of thinking this issue is not so serious.

Huh, yeah that's a good point. That's probably the workflow I'll use if I'm pasting in a large amount of text and can remember to do so. There may be some small hacks to speed up detecting whether line_limit_length was met and pasting into an editor, like maybe basing line_limit_length on the number of bytes rather than characters, so you could throw a switch to dump into the editor if the number of bytes read in passes a certain limit. This way you wouldn't have to go through the decode stage at least.

Small aside question. Some paste operations out there are very quick, say opening a text editor with C-x C-e and pasting in text like you mentioned, or git opening an editor and filling it with prefilled content (e.g. git config --global core.editor vim; git commit). Out of curiosity, if all paste operations in ble.sh opened an editor by default, would that be as fast as a user doing C-x C-e and pasting in text manually, or would there still need to be some additional processing time?


akinomyoga commented on May 20, 2024

Thank you very much for providing me the results of such detailed measurements!

Stages 1 - 3 were ~5 seconds each
Stage 4 ("constructing text") took about 10 seconds.

Stages 1 - 3 were ~15 seconds each
Stage 4 ("constructing text") more than 90 seconds. I stopped counting.

It's interesting to see that Stage 4 suddenly becomes slower when going from 500 lines to 2k lines. Because you are seeing Stage 4, Bracketed Paste Mode is turned on. In this case, Stage 4 performs the conversion of Unicode code points to strings, which is done entirely in the Bash process. So the terminal is irrelevant for Stage 4.

Yes, that's probably true. Here's some hardware specs for my machine in case it's helpful.

Hm, it looks like a cache effect (though my guesses about performance are usually incorrect...). Usually, a sudden change in performance is a sign of a cache overflow. Also, the L2 cache size differs by a factor of four between your environment and mine (see below). The size of the 2k-line test data was about 53 kB, which results in 53k elements of an array in ble.sh, which would be of the same order as these L2 cache sizes. In addition, the heap algorithms of the standard C libraries in macOS and GNU/Linux may also cause a difference in the cache hit rate.

Property	Value
OS kernel	Linux 5.1.20-300.fc30.x86_64
Number of processors	1
Model name	Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
L1 / L2 / L3 cache size	256 kB / 1024 kB / 8192 kB
Number of cores	4x2 (HT)
MemTotal	8070336 kB

maybe basing line_limit_length on the number of bytes rather than ... This way you wouldn't have to go through the decode stage at least.

Yes, it would be more efficient if it were possible. Actually, this is the difficult point. The problem is that the input stream from the terminal is a mixture of pasted text and terminal sequences, so someone has to decode the input stream and detect the end of the paste. Generally, the program that detected the start of the paste takes care of detecting the end of the paste. Unless there is a text editor that can start in the middle of a paste and handle only the end of the paste, ble.sh, which detected the start of the paste, unfortunately needs to handle it until the end of the paste.
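A rough illustration of that scanning step. In Bracketed Paste Mode the terminal wraps pasted text in the markers ESC [ 200 ~ and ESC [ 201 ~, and whoever reads the stream has to find both. The sketch below works on a complete string for simplicity, whereas a real line editor has to do this as the bytes arrive; extract_paste is a hypothetical name and not a ble.sh function.

```shell
# extract_paste STREAM: print the text between the bracketed-paste
# start marker (ESC [ 200 ~) and end marker (ESC [ 201 ~).
extract_paste() {
  local esc start end s
  esc=$(printf '\033')
  start="${esc}[200~"
  end="${esc}[201~"
  s=$1
  case $s in
    *"$start"*"$end"*) ;;   # both markers present, start before end
    *) return 1 ;;          # no complete paste in this input
  esac
  s=${s#*"$start"}          # drop everything up to the start marker
  s=${s%%"$end"*}           # drop the end marker and what follows
  printf '%s\n' "$s"
}
```

Nothing before the end marker tells the reader how long the paste will be, which is why the work cannot simply be handed over to another program halfway through.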

Out of curiosity, if all paste operations in ble.sh opened an editor by default, would that be as fast as a user doing C-x C-e and pasting in text manually, or would there still need to be some additional processing time?

For the same reason as in the previous paragraph, once ble.sh has detected the start of a paste, ble.sh needs to handle it until the end of the paste. And it is impossible to predict the start of a paste without detecting it in the input stream.

But even if ble.sh needs to handle the paste until the end, it is in principle possible to make the text construction as fast as real text editors (to the same order) if I don't care about the maintainability of ble.sh. It's a trade-off: if we want more performance, we need to give up the organized structure of ble.sh and add many exceptional and ad hoc treatments here and there in the codebase. If we want to keep the ble.sh codebase maintainable, we need to give up the performance. For now, I don't think it's worth pursuing performance at the expense of the current structure of ble.sh.

It is difficult to reach performance similar to real text editors, but nevertheless I think it is still possible to improve the performance by several times while keeping the current structure of ble.sh. You gave me a big hint on the bottleneck. I'm going to think about it...


akinomyoga commented on May 20, 2024

I added several optimizations 3f33dab. I think the performance has been significantly improved by this commit (yet it decodes the input and constructs texts inside ble.sh).


dylankb commented on May 20, 2024

That's great stuff! I'd say since opening this thread the time it takes to perform one of these large copy paste jobs has been reduced by about 1/2.


akinomyoga commented on May 20, 2024

Thank you for all your help! Actually I was thinking of further optimizing the processing by specially bypassing the decode/encode phases for bracketed pastes, but it needs refactoring. Maybe I will implement it someday but not now. Anyway, thank you!


akinomyoga commented on May 20, 2024

Thank you for the information! Yeah, that is one way to circumvent the problem.


dylankb commented on May 20, 2024

On a related topic: I don't often need to edit text before pasting, but if I do I'm actually a bit more comfortable in a GUI code editor when working with larger amounts of text. Do you think it would be difficult to extend the bleopt editor options beyond vim, nano, and emacs? For example, vim is my default editor for Git, but Git can read in text written in an editor like VSCode during operations like git rebase, etc.
https://code.visualstudio.com/docs/editor/versioncontrol#_vs-code-as-git-editor

If this sounds feasible/desirable on your end I can create a separate issue/feature request.


akinomyoga commented on May 20, 2024

Do you think it would be difficult to extend the bleopt editor options beyond vim, nano, and emacs?

Thank you for the suggestion. I think you can basically assign bleopt editor='code --wait' just like the git core.editor setting described in the above link. Does it cause a problem? Which operating system do you use? I have now tested VSCode (Windows) called from ble.sh in a Cygwin environment, and I found that the differences in file paths between Windows and Cygwin seem to cause a problem, but I'm not sure if the same issue exists in macOS.


akinomyoga commented on May 20, 2024

Oh, thank you! I think I should spell out "for example" instead of "e.g.".


akinomyoga commented on May 20, 2024

I have updated blerc 21d636a. Thank you!

