If I feed a sitemap.xml link into Abot, the parsedlinks is null. Now a lot of web

Sitemap.xml parsedlinks is empty (abot issue, closed)

JoshTango commented on June 7, 2024

Sitemap.xml parsedlinks is empty

Comments (5)

sjdirect commented on June 7, 2024

Have you tried creating your own IHyperLinkParser, or extending the AnglesharpHyperlinkParser, to implement this logic? It wouldn't be hard to do. You would also need to change the following setting so that the crawler downloads the content of the sitemap URL:

        config.DownloadableContentTypes = "text/html, application/xml";
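For anyone attempting this, the link-extraction half might look roughly like the sketch below. It uses only `System.Xml.Linq` and assumes the downloaded sitemap text is already in hand. `SitemapLinkExtractor` and `ExtractLinks` are hypothetical names, not part of Abot, and wiring the result into a custom IHyperLinkParser is left out because that depends on the Abot version in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Abot: pulls absolute http(s) URLs
// out of the <loc> elements of a sitemap or sitemap index document.
public static class SitemapLinkExtractor
{
    public static IReadOnlyList<Uri> ExtractLinks(string sitemapXml)
    {
        // Throws XmlException if the text is not well-formed XML.
        var document = XDocument.Parse(sitemapXml);
        var links = new List<Uri>();

        // Match on the local name so the lookup works regardless of the
        // namespace prefix; this covers <urlset> and <sitemapindex> files,
        // and also picks up extension entries such as <image:loc>.
        foreach (var loc in document.Descendants().Where(e => e.Name.LocalName == "loc"))
        {
            // Skip entries that are not absolute http or https URLs.
            if (Uri.TryCreate(loc.Value.Trim(), UriKind.Absolute, out var uri)
                && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps))
            {
                links.Add(uri);
            }
        }

        return links;
    }
}
```

A custom parser could call `ExtractLinks` when the response content type is XML and fall back to the normal HTML parsing otherwise.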

JoshTango commented on June 7, 2024

I might one day.
But sitemap.xml is such a generalized, standard thing these days that I thought you might want to build it into Abot.

winzig commented on June 7, 2024

Abot doesn't use sitemaps to help discover pages to crawl?

sjdirect commented on June 7, 2024

Its default behavior is to crawl the site based on real, navigable links. The sitemap can be completely out of sync with the real site, so it was never part of the original design. However, you can implement your own IHyperLinkParser, as mentioned above, that uses the sitemap.

winzig commented on June 7, 2024

In my experience, we have used sitemaps extensively to help search engines index pages of our sites that they might otherwise have trouble finding. So yeah, we'll have to implement this internally, I guess.
