Tested large files for performance about lemminx HOT 21 CLOSED

eclipse commented on May 27, 2024

Tested large files for performance

from lemminx.

Comments (21)

angelozerr commented on May 27, 2024

Have you some trouble with large file?

from lemminx.

fbricon commented on May 27, 2024

not necessarily yet, but we need to think about setting up performance tests, once all the major features are complete

from lemminx.

angelozerr commented on May 27, 2024

not necessarily yet, but we need to think about setting up performance tests, once all the major features are complete

If there is performance problem with big file, I think vscode html language service will have the same problem since XMLDocument is rebuild each time you type some content inside editor (TextDocumentSyncKind.Full).

I think to improve performance, TextDocumentSyncKind.Incremental should be used, but I think it's an hard thing to do.

from lemminx.

fbricon commented on May 27, 2024

@angelozerr whatever performance the vscode html has, since the xml parser starts to deviate from it (in many ways), we might see a different behaviour, better or worse.
I'm not saying we need to improve the performance now, just that we need to keep track of it eventually. What matters now is to provide correct results. Worry about performance later.

from lemminx.

angelozerr commented on May 27, 2024

What matters now is to provide correct results. Worry about performance later.

Totally agree with you. Thanks for creating this issue.

from lemminx.

NikolasKomonen commented on May 27, 2024

After testing largeFile.txt
I deleted the 2nd last tag </a> in Intellij it took ~2.8 seconds and in lsp4xml ~4.7 seconds till a missing end tag response was received.

from lemminx.

fbricon commented on May 27, 2024

15000L is not what I call large :-) I was thinking about 10's of MB (even that's not large, depending on the context)
Anyways, I think we're getting murdered by TextDocumentSyncKind.Full, as we're sending the full document over the connection on each document change, the whole document is rebuilt entirely on every keystroke, it's not scaling well.
For now, we'll call it a known limitation, but we'll have to work on improving performance once we get the initial features right.

from lemminx.

angelozerr commented on May 27, 2024

@NikolasKomonen thanks for testing that and attached your large file.

Anyways, I think we're getting murdered by TextDocumentSyncKind.Full, as we're sending the full document over the connection on each document change, the whole document is rebuilt entirely on every keystroke, it's not scaling well.

Indeed I think it can be a problem, and it was my fear, because we need to manage "incremental" parser which is an hard task I think.

BUT I have started to study the problem, and building the XMLDocument directly from the given file takes 1590 ms which is too big. It seems the problem comes from the regexp. I will give you feedback and try to fix the problem.

from lemminx.

angelozerr commented on May 27, 2024

See test at https://github.com/angelozerr/lsp4xml/blob/master/org.eclipse.lsp4xml/src/test/java/org/eclipse/lsp4xml/internal/parser/XMLDocumentTest.java

from lemminx.

angelozerr commented on May 27, 2024

@NikolasKomonen I have improved performance, your large file takes 195 ms instead of 1590 ms!

Please give me feedback.

from lemminx.

fbricon commented on May 27, 2024

@angelozerr awesome work! This makes the extension much snappier!!!
did you use a profiler or just noticed that substring hanging there?
Are there other places where substring/string concatenation might be hurting us?

from lemminx.

angelozerr commented on May 27, 2024

@angelozerr awesome work! This makes the extension much snappier!!!

Glad it pleases you:) I think now we have the same performance than HTML Language Server.

did you use a profiler or just noticed that substring hanging there?

To be honnest with you, I'm not very familiar with profiler, I have just noticed this hang (at first I though it was because of regexp, but it was about String#substring)

Are there other places where substring/string concatenation might be hurting us?

I have tried to check that, but I think it's OK. The main problem is to do a substring from a position to string length (it creates a very large String each time, using Matcher#region locate just the matcher. We could use the same Matcher too, but it seems that it doesn't improve performance.

from lemminx.

NikolasKomonen commented on May 27, 2024

@angelozerr This is awesome, great find.

from lemminx.

angelozerr commented on May 27, 2024

Thanks guys!

from lemminx.

fbricon commented on May 27, 2024

Here's a site to find xml documents of various sizes, some pretty big: http://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/

Tried with a 30MB doc in vscode. Never reached the point where I got error reported. I'll try with incremental support later

from lemminx.

angelozerr commented on May 27, 2024

Here's a site to find xml documents of various sizes, some pretty big: http://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/

Cool!

Tried with a 30MB doc in vscode. Never reached the point where I got error reported. I'll try with incremental support later

Could you give me the link of your xml that you are testing please.

from lemminx.

fbricon commented on May 27, 2024

Seems the server chokes on the nasa.xml from http://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/www/repository.html#nasa

Tried with xmx2GB

from lemminx.

angelozerr commented on May 27, 2024

@fbricon I have added the nasa.xml in the test:

for largeFile.xml:

Parsed 'largeFile.xml' with XMLScanner in 31 ms.
Parsed 'largeFile.xml' with XMLParser in 25 ms.

for nasa.xml:

Parsed 'nasa.xml' with XMLParser in 731 ms.
Parsed 'nasa.xml' with XMLScanner in 371 ms.

from lemminx.

fbricon commented on May 27, 2024

try validation, formatting, hover...

from lemminx.

angelozerr commented on May 27, 2024

try validation, formatting, hover...

Yes sure we need too add thoses tests. But if creation of XMLDocument takes 731ms, I think we will have slow problem.

WTP XML Editor cannot open it too your nasa.xml.

I fear that it will very hard to support very large file. I think a problem is because it's not incremental. Have you tried with "experimental" incremental support?

from lemminx.

angelozerr commented on May 27, 2024

I close this issue since 0.8.0 improves performance and memories and gives the capability to disable outline.

from lemminx.

Tested large files for performance about lemminx HOT 21 CLOSED

Comments (21)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent