Comments (4)

mauriciocolli commented on August 16, 2024

@theScrabi I was thinking: does the page number in the channel extractor make sense?

It brings some problems (apart from the fact that it currently doesn't work at all):

  • You don't know how many pages there are.
  • Some sites, like YouTube, don't have pages (I don't know of one that does). More videos are loaded as the user demands them (by scrolling all the way down).
  • I don't see a very clear use case for the page number. Maybe continuing where you stopped, but that is more complex to implement for services that have no concept of pages, because you would have to keep loading the on-demand URLs until you reach the "page".

So, wouldn't it make more sense to implement it that way, on demand?
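
A minimal sketch of what "on demand" could look like, under stated assumptions: `OnDemandChannel`, `loadMore()`, `hasMore()` and the `?continuation=` URL format are hypothetical names made up for illustration, and the network request is replaced by three fake batches. It is not the extractor's actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these names are illustrative, not the extractor's real API.
class OnDemandChannel {
    private static final String TOKEN = "?continuation=";

    private final List<String> loaded = new ArrayList<>();
    private String nextUrl; // null once the service has nothing more to offer

    OnDemandChannel(String channelUrl) {
        this.nextUrl = channelUrl;
    }

    boolean hasMore() {
        return nextUrl != null;
    }

    // Fetches the next batch only when asked, e.g. when the user scrolls down.
    List<String> loadMore() {
        if (nextUrl == null) {
            return new ArrayList<>();
        }
        int at = nextUrl.indexOf(TOKEN);
        String base = at < 0 ? nextUrl : nextUrl.substring(0, at);
        int batch = at < 0 ? 1 : Integer.parseInt(nextUrl.substring(at + TOKEN.length()));

        // Stand-in for a network request: three fake batches of two videos each.
        List<String> videos = Arrays.asList("video-" + batch + "a", "video-" + batch + "b");
        nextUrl = batch < 3 ? base + TOKEN + (batch + 1) : null;

        loaded.addAll(videos);
        return videos;
    }

    // Everything fetched so far, in order.
    List<String> loadedSoFar() {
        return new ArrayList<>(loaded);
    }
}
```

The caller never needs a page number or a page count: it calls `loadMore()` while `hasMore()` is true, and the object itself remembers where to continue.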

from newpipeextractor.

theScrabi commented on August 16, 2024

Introduction to the problem

That page thing is indeed a problem. I've tried to implement the NewPipe back end like a single-state machine, the way regular REST APIs are set up. However, a website is not a single-state machine, as you can see in this example: making one and the same request can give you a different output.
This might be a flaw in NewPipe, and it might also cause some trouble later on when we implement different services.

Concrete Problem

At the moment it works like this: every request is a new instance of the back end. So the back end does not remember anything from the last request (it does not remember its state). This is why the page parameter has to be given to the back end, so that the front end can remember the state instead.

In our case the data sent by YouTube is not the same for every URL (sometimes it's HTML, sometimes it's AJAX with less information). That means NewPipe has to simulate a stateless machine, and that's bad, you are right.
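
To make the cost of that simulation concrete, here is a rough sketch under assumptions: `StatelessChannelExtractor`, `getPage` and the `?continuation=` URL format are hypothetical, and the network is replaced by a counter and three fake batches. It only illustrates why a page parameter is awkward on a service that hands out continuation URLs instead of pages.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these names are illustrative, not the extractor's real API.
class StatelessChannelExtractor {
    // Counts simulated network requests so the cost is visible.
    static int requestCount = 0;

    // Every call starts from scratch, so reaching "page" n on a service
    // without real pages means replaying n chained requests.
    static List<String> getPage(String channelUrl, int page) {
        if (page < 1) {
            throw new IllegalArgumentException("page must be >= 1");
        }
        String url = channelUrl;
        List<String> videos = null;
        for (int i = 1; i <= page; i++) {
            if (url == null) {
                throw new IllegalArgumentException("no such page: " + page);
            }
            requestCount++; // stand-in for one network request to `url`
            videos = Arrays.asList("video-" + i + "a", "video-" + i + "b");
            // The fake service offers three batches; the last has no continuation.
            url = i < 3 ? channelUrl + "?continuation=" + (i + 1) : null;
        }
        return videos;
    }
}
```

In this sketch, asking for page 3 costs three requests, and asking for page 3 again costs three more, because nothing is remembered between calls.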

What we need

Well, I don't want to say it, but the back end as it is designed at the moment might cause trouble. The extractor/collector thing

  1. is bad for parallelization
  2. requires all information to be given at every point of time

Solution

I see two solutions:

  1. Make an instance of the back end that lives as long as the channel/video/etc. is displayed.
  2. Do not collect all the data up front, but let the front end decide what it needs.

What do you think?

mauriciocolli commented on August 16, 2024

I think an instance that lives as long as it's used is fine with me (at least for now; I'll think more about it).

I'm also doing a big refactor of the extractor (I implemented the "on-demand" approach I discussed in the previous post). Take a look at: https://github.com/mauriciocolli/NewPipeExtractor/tree/refactor-extractor

I also want to implement playlists ASAP, but first I want to at least get rid of these issues, since the playlist code suffers from exactly the same problem as the ChannelExtractor (it copies the same code).

PS: Please take a look at those open pull requests.

theScrabi commented on August 16, 2024

"On-demand" means you are not pulling all the data out into a struct up front, but calling the required function when the data is needed, right?
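
One possible reading of that question, sketched under assumptions: `LazyChannelInfo` and its `nameLoader` are hypothetical names, and the loader stands in for whatever parsing or network work would really be needed. The value is computed on first access and cached, instead of being copied into a struct ahead of time.

```java
import java.util.function.Supplier;

// Hypothetical sketch; these names are illustrative, not the extractor's real API.
class LazyChannelInfo {
    private final Supplier<String> nameLoader; // stand-in for the real extraction work
    private String name;
    private boolean nameLoaded;

    LazyChannelInfo(Supplier<String> nameLoader) {
        this.nameLoader = nameLoader;
    }

    // Runs the loader the first time the name is requested, then reuses the result.
    String getName() {
        if (!nameLoaded) {
            name = nameLoader.get();
            nameLoaded = true;
        }
        return name;
    }
}
```

If the front end never calls `getName()`, the loader never runs, so nothing is extracted that nobody asked for.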
