
Comments (6)

jbellanca commented on May 28, 2024

Yeah, they're being tricky about it, with some sort of filter or something. I went to that example URL above and, using the SaveAllResources plug-in with the option "Include all assets by XHR requests", I was able to save the manifest, but they're definitely doing something to make it harder. For that example, the manifest is in the "dam-antenati.san.beniculturali" folder in this zip:
https://www.dropbox.com/s/004yjx06omtnf1q/www.antenati.san.beniculturali.it.zip
Btw, thanks for all your hard work on this tool, it's awesome and has been a lifesaver for me in my research!

from antenati.

jbellanca commented on May 28, 2024

Works perfectly again - thanks!!! You're the best!


gmalcolms commented on May 28, 2024

Thanks for finding a workaround for their new security. Many years ago I wrote a similar program in Excel: you just list comuni, record types, and years, and the program goes through the website to find each collection and download it, without even having to open a browser. It stopped working when they added password protection to the manifest. Adding the headers you found still does not work in VBA, because XHR uses JavaScript, so I wrote a Python program that downloads the HTML of the manifest given its URL, and then wrapped that in a C++ DLL. My Excel program is working perfectly again. Thank you! Now I'm off to download all the record collections in several provinces before they fix the current security hole.


gcerretani commented on May 28, 2024

This effort to make our job harder is very curious. I'll have a look at it; we have to figure out why the manifest URL returns a 403. A few months ago they added geographic and user agent filters. Let's see what's going on now.


gcerretani commented on May 28, 2024

I compared the headers sent by the browser during a standard session with the headers sent when entering the manifest URL directly in the URL bar. The main difference was the presence of these two headers in the standard case:

	referer: https://www.antenati.san.beniculturali.it/
	origin: https://www.antenati.san.beniculturali.it

It seems that "referer" alone is enough to fix this issue. Thanks a lot, @jbellanca.
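
For anyone who wants to try the same thing outside the tool, here is a minimal sketch using only the Python standard library. It is not the code from this repository: the function names `build_manifest_request` and `fetch_manifest` are made up for illustration, and the manifest URL is whatever you copy from the browser's network tab. The only values taken from this thread are the two header values listed above.

```python
import urllib.request

# Site root, as it appears in the "origin" header quoted above.
SITE = "https://www.antenati.san.beniculturali.it"


def build_manifest_request(manifest_url: str) -> urllib.request.Request:
    """Build a GET request carrying the Referer and Origin headers.

    Hypothetical helper: it only prepares the request, it sends nothing.
    """
    return urllib.request.Request(
        manifest_url,
        headers={
            "Referer": SITE + "/",  # reported above as sufficient on its own
            "Origin": SITE,         # extra header, in case it matters later
        },
    )


def fetch_manifest(manifest_url: str, timeout: float = 30.0) -> bytes:
    """Download the manifest body as raw bytes.

    urlopen raises urllib.error.HTTPError if the server still answers 403.
    """
    request = build_manifest_request(manifest_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `fetch_manifest(url)` with a manifest URL should return the response body if the server accepts the headers. Whether these headers keep working depends on the site's filters, which may change.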


gcerretani commented on May 28, 2024

Interestingly, User-Agent no longer seems to be needed. I've kept it and also added Origin, just in case new filters are added in the future. See fa965f2

