Comments (9)

Ilapides commented on May 21, 2024

This is a big issue for me. Is there some way to route TumblThree through a personal OAuth token?

from tumblthree.

johanneszab commented on May 21, 2024

I've written a bit here about it. The newer Tumblr API v2 works with OAuth. The code section in question would be in here.

I'll probably have some time over the next few weeks, so I might look into it too. I've never used OAuth, though, so I'm new to it as well. Maybe there is already an official or similar implementation around that we could look at.

johanneszab commented on May 21, 2024

Check out this wiki page. It's certainly far from the optimal solution, but it's everything we have so far.

johanneszab commented on May 21, 2024

Well, that didn't turn out to be a good idea.

It's probably easier than I thought to do it the manual way and parse the whole website using a library like Html Agility Pack or AngleSharp. We could use the internal .NET browser to open the login page, grab all the cookies with the cookie container, then use them in the WebRequests and parse the results with Html Agility Pack/AngleSharp.

It's probably worth a try. If no one else is willing, I'll try it out once I have some spare time again.
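
A minimal sketch of the cookie part of that idea, assuming the login cookies have already been copied into a `CookieContainer`. The cookie name and value, the URL, and the `CookieSketch` class are placeholders for illustration, not anything TumblThree actually uses. The parsing step with Html Agility Pack/AngleSharp is left out.

```csharp
using System;
using System.Net;

public static class CookieSketch
{
    // Builds a GET request that carries whatever cookies are in the jar.
    // Nothing is sent until GetResponse() is called on the result.
    public static HttpWebRequest CreateRequest(string url, CookieContainer jar)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.CookieContainer = jar;
        return request;
    }

    public static void Main()
    {
        var jar = new CookieContainer();
        // Hypothetical cookie; in practice it would come from the browser
        // control after the user has logged in.
        jar.Add(new Cookie("session", "abc123", "/", "www.tumblr.com"));

        HttpWebRequest request =
            CreateRequest("https://www.tumblr.com/dashboard", jar);

        // Shows the Cookie header the request would send for this URL.
        Console.WriteLine(jar.GetCookieHeader(request.RequestUri));
    }
}
```

The same `CookieContainer` instance can be shared across all requests, so cookies the server sets or refreshes later are picked up automatically. On newer .NET versions `WebRequest.Create` compiles with an obsolescence warning; the thread predates that.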

kingbode commented on May 21, 2024

I already worked on code for a website that requires a login to download its pages correctly. It worked fine for me using a cookie container and WebClient to download the pages, parse them, get the image URLs, and download the images, as shown below. The issue is that it looks custom to this website. Maybe there are other, easier ways that could be formalized into a universal process that fits all websites. Or you could offer a manual login option: open the WebBrowser control and do all your work from within it after you log in to your site. The NewDownloader application works the same way, but IDM is smarter, and I think it has a better process to achieve this.

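```csharp
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;

// Assumed definition: CookieAwareWebClient is not a framework class. This is the
// usual pattern, a WebClient that attaches one CookieContainer to every request.
public class CookieAwareWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest http) http.CookieContainer = Cookies;
        return request;
    }
}

public static class LoginExample
{
    // Usage: bool ok = LoginExample.LogIntoThePage(new CookieAwareWebClient());
    public static bool LogIntoThePage(CookieAwareWebClient client)
    {
        client.Encoding = Encoding.UTF8;
        var loginData = new NameValueCollection();
        loginData.Add("login", "YourUserName");
        loginData.Add("password", "YourPassWord");
        loginData.Add("submit", "Войти"); // value of the site's submit button ("Log in")
        try
        {
            client.UploadValues("http://gde-fon.com/user/login", loginData);
            return true; // no HTTP error; does not prove the credentials were accepted
        }
        catch (WebException) { return false; }
    }
}
```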

johanneszab commented on May 21, 2024

So, we could also use a WebDriver, which is essentially Chrome/Firefox without the UI. That would also simplify the private blog access. It probably uses lots of memory and fattens the application a lot, but it sure is the simplest implementation for now.

johanneszab commented on May 21, 2024

I've actually played around with these several times. From what I've seen in the Chrome developer tools, the pagination (i.e. getting the next images) needs an AJAX/XMLHttpRequest call sent as a POST request. But I couldn't figure out how to do this properly with C#'s HttpWebRequest: which headers/cookies need to be sent along, how to wait for the answer, etc. Does this even work without any JavaScript?
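
For what it's worth, here is a rough sketch of how an XMLHttpRequest-style POST is commonly reproduced with `HttpWebRequest`. It is not the request Tumblr actually expects: the URL, the form body, and the `AjaxPostSketch` class are hypothetical, and the real header set would have to be copied from the developer tools. `X-Requested-With: XMLHttpRequest` is only the conventional marker many sites check for.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class AjaxPostSketch
{
    // Prepares a POST that looks like a browser XMLHttpRequest.
    // Nothing is sent here; Send() performs the actual round trip.
    public static HttpWebRequest CreateAjaxPost(string url, CookieContainer jar)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        // Content-Type and Accept must be set through properties, not Headers[].
        request.ContentType = "application/x-www-form-urlencoded; charset=UTF-8";
        request.Accept = "application/json, text/javascript, */*; q=0.01";
        request.Headers["X-Requested-With"] = "XMLHttpRequest";
        request.CookieContainer = jar; // login cookies travel with the request
        return request;
    }

    // Writes the form body and blocks until the server has answered.
    public static string Send(HttpWebRequest request, string formBody)
    {
        byte[] payload = Encoding.UTF8.GetBytes(formBody);
        request.ContentLength = payload.Length;
        using (Stream body = request.GetRequestStream())
            body.Write(payload, 0, payload.Length);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
            return reader.ReadToEnd();
    }

    public static void Main()
    {
        // Hypothetical endpoint; only builds the request, does not send it.
        var request = CreateAjaxPost("https://www.example.com/next-page", new CookieContainer());
        Console.WriteLine(request.Method + " " + request.Headers["X-Requested-With"]);
    }
}
```

No JavaScript is involved on the client side: `GetResponse()` simply blocks until the answer arrives, which covers the "how to wait" part. Whether the server accepts the request depends on the cookies and headers matching what the browser sends.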

johanneszab commented on May 21, 2024

Okay, I've figured it out. It will probably work with the upcoming safe mode too, and we might even drop the whole API. That would be beautiful.

johanneszab commented on May 21, 2024

This release can download private blogs that require a login. It also sends a flag in every request to get NSFW content. To do this, it uses a different service from the tumblr.com network and a cookie for authentication.

Until this part of the branch has been merged into the main branch, it will stay here as a separate download. It would be nice to have proper detection that switches between the different crawl methods depending on what is required. Say you don't want to download a private/explicit blog, then you don't have to log in. If the request fails, however, try again using the login/cookie.
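
The detection described above could look roughly like the following sketch. It assumes a failed anonymous request surfaces as a `WebException`; the `fetch` delegate and the `CrawlFallbackSketch` class are hypothetical stand-ins for the real crawlers, which are not shown in this thread.

```csharp
using System;
using System.Net;

public static class CrawlFallbackSketch
{
    // fetch(useLogin) is a hypothetical delegate: it downloads a page,
    // sending the login cookie only when useLogin is true.
    public static string FetchWithFallback(Func<bool, string> fetch)
    {
        try
        {
            // First attempt without authentication.
            return fetch(false);
        }
        catch (WebException)
        {
            // The anonymous request failed, so retry once with the cookie.
            return fetch(true);
        }
    }

    public static void Main()
    {
        // Stand-in crawler: the anonymous attempt fails, the authenticated one succeeds.
        string result = FetchWithFallback(useLogin =>
        {
            if (!useLogin) throw new WebException("login required");
            return "private blog page";
        });
        Console.WriteLine(result); // prints "private blog page"
    }
}
```

A second failure is deliberately not caught, so the caller still sees the error when the blog cannot be reached even with the login.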

  • Should make it possible to bypass safe mode and download NSFW/explicit content after July 5.
  • Allows downloading private blogs (#4).
  • Can grab more meta information. Someone has to parse the relevant parts; I probably won't do this.
  • Unlike previous releases, you must log in to do anything, even to add a blog. Go to Settings->Authenticate and enter your username and password in the Internet Explorer window. It closes automatically after the login process and stores the cookie. Alternatively, you can open Internet Explorer and log in there.
  • You can increase or remove the rate limit for crawling the website, since it doesn't use the Tumblr API. I personally wouldn't without a solid reason, though.
  • The maximum number of posts per page is now 100 instead of 50, as it was with the Tumblr API. I would therefore increase the posts-per-page count, since it reduces the number of connections needed to crawl the blog.
  • Pre-release for a reason: it's not well tested at all. I would personally use the latest stable release, unless you want to download a private blog or the stable release of TumblThree stops working after July 5.

TumblThree-v1.0.7.55-Application.zip
TumblThree-v1.0.7.55-Translations.zip
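
To put a number on the posts-per-page point from the list above, here is a small back-of-the-envelope sketch. The 1,000-post blog and the `PageCountSketch` class are made up for illustration, and one request per page is assumed.

```csharp
using System;

public static class PageCountSketch
{
    // Number of page requests needed to list all posts (ceiling division).
    public static int RequestsNeeded(int totalPosts, int postsPerPage)
    {
        if (postsPerPage <= 0)
            throw new ArgumentOutOfRangeException(nameof(postsPerPage));
        return (totalPosts + postsPerPage - 1) / postsPerPage;
    }

    public static void Main()
    {
        Console.WriteLine(RequestsNeeded(1000, 50));  // prints 20
        Console.WriteLine(RequestsNeeded(1000, 100)); // prints 10
    }
}
```

Doubling the page size halves the number of listing requests; the downloads of the images themselves are unaffected.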
