Comments (4)
Personally, I use Modest, because I need to work with arbitrary pages from different websites.
If you scrape specific websites and can validate the results, I think it's safe to use lexbor.
The reason I don't use lexbor yet is that it has not been tested by users as much as Modest has.
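One way to act on the "validate the results" advice is to run the same selector through both engines and compare what comes back. The sketch below is only an illustration, not something from the selectolax docs: `extract_text`, `compare_engines`, `engines_agree`, and the `ENGINES` mapping are hypothetical helper names. The only selectolax calls it assumes are `HTMLParser` (Modest), `LexborHTMLParser` (lexbor), `.css(selector)`, and `node.text(strip=True)`.

```python
try:
    from selectolax.parser import HTMLParser        # Modest engine
    from selectolax.lexbor import LexborHTMLParser  # lexbor engine
    ENGINES = {"modest": HTMLParser, "lexbor": LexborHTMLParser}
except ImportError:
    # selectolax is not installed; callers can still pass their own parser classes.
    ENGINES = {}


def extract_text(parser_cls, html, selector):
    """Parse html with parser_cls and return the stripped text of every match."""
    tree = parser_cls(html)
    return [node.text(strip=True) for node in tree.css(selector)]


def compare_engines(html, selector, engines=None):
    """Return {engine_name: extracted_texts} for each engine (hypothetical helper)."""
    engines = ENGINES if engines is None else engines
    return {name: extract_text(cls, html, selector) for name, cls in engines.items()}


def engines_agree(html, selector, engines=None):
    """True if every engine extracted the same list of texts."""
    results = list(compare_engines(html, selector, engines).values())
    return all(result == results[0] for result in results[1:])


if __name__ == "__main__":
    sample = "<ul><li>one</li><li>two</li></ul>"
    print(compare_engines(sample, "li"))
    print(engines_agree(sample, "li"))
```

For a known set of sites, a check like this could run over a sample of saved pages before switching engines. For arbitrary pages it only tells you where the two engines differ, not which one is right.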
from selectolax.
Both engines will be supported in selectolax for at least a few years.
Thanks @rushter for the quick reply :)
I have one more question. Since @lexborisov suggests using Lexbor over Modest, do you suggest the same for selectolax, that is, using the lexbor engine rather than the modest engine? In the documentation you mention that the lexbor engine is the same as modest, but that lexbor lacks some features.
Thanks a lot. That's the answer I was looking for. I have thousands of pages to scrape not knowing their structure in advance. Greatly appreciate your work.