Git Product home page Git Product logo

Comments (14)

muyangye avatar muyangye commented on July 20, 2024 2

Hey @muyangye,

thank you very much for bringing your ideas in here!

I really like the idea of providing the option to delete the existing file. As you already said, we would need to check carefully if this file is used anywhere. In addition, access to files is exposed via the API so someone might rely on this file which we can't detect. But for this case, I would consider a friendly reminder in the UI as sufficient.

If we want to add the deletion feature (and I don't see a real show-stopper here), I'd propose doing it as a second step. This way the changes are smaller and more comprehensible.

In case you are interested in contributing here, this would be awesome 🤩

Nice idea! I also agree that we should implement step by step. And yes I am willing to contribute. But I may or may not be available until December. If time isn't too much of an issue, let me take care of it!

from streampipes.

muyangye avatar muyangye commented on July 20, 2024 2

I like the proposal of #2300. How do want to handle internalFilename and originalFilename in this case? I agree that #2300 is a good compromise between this feature and stability for the end user. In the middle term, I'd like to remove the distinction code-wise as well.

#2300 is just a temporary PR so that everyone can see my current approach, for merging the internalFilename and originalFilename, I plan to achieve this by the migration too. I am not sure what will happen if I modify the FileMetadata class structure and then commit the update to CouchDB. I will check it out later. I propose #2300 first because

  1. That at least allows users to see no duplicate originalFilename and the REST and UI change are necessary even after merging internalFilename and originalFilename.
  2. I will be traveling around in the upcoming 3 weeks. So I might not be able to work until early January. I don't think I can finish the merging before my travel so I submitted #2300 as a temporary solution. I will finish the entire migration later 🥇

from streampipes.

bossenti avatar bossenti commented on July 20, 2024 1

No we are Not in a hurry here
Nice, feel free to take it up whenever you have time for it 🙂

from streampipes.

tenthe avatar tenthe commented on July 20, 2024 1

Hi @muyangye,
thank you for outlining the different options and discussing the pros and cons.
Since they all have a disadvantage, I would suggest a 4th option, with a combination of your solutions.

Perhaps you can provide a migration script (see [1]). With this we can update all entries to the new version. This should (hopefully) solve the backward compatibility issue.

What do you think?

[1] https://github.com/apache/streampipes/tree/dev/streampipes-service-core/src/main/java/org/apache/streampipes/service/core/migrations

from streampipes.

bossenti avatar bossenti commented on July 20, 2024 1

I would also suggest to define a migration as @tenthe proposed.
This should allow us to go with option 1).
We could even fully switch to the new identifier one release later, right?
Then all files should have been migrated and we can safely remove the old identifier.

from streampipes.

bossenti avatar bossenti commented on July 20, 2024 1

@muyangye Thanks a lot for the detailed proposal 🙏

from streampipes.

bossenti avatar bossenti commented on July 20, 2024 1

Sounds like a solid plan!

No worries, just catch up on your great work whenever you find time for it.
There is no need to hurry.

Enjoy your traveling 🏜️

from streampipes.

muyangye avatar muyangye commented on July 20, 2024

Hey @bossenti, I think this is definitely a great improvement for users. It's really annoying to see lots of files with the same name. I think in addition to opening a dialog for users to enter a new name, from a usability and convenience standpoint, we should also allow users to delete the already existing files with the same name (though we need to investigate the effect of this for adapters using the file) via one click. Often times I just want to change one line of data in the file but not rename it. Do you see any implications of deleting the original file?

from streampipes.

bossenti avatar bossenti commented on July 20, 2024

Hey @muyangye,

thank you very much for bringing your ideas in here!

I really like the idea of providing the option to delete the existing file.
As you already said, we would need to check carefully if this file is used anywhere.
In addition, access to files is exposed via the API so someone might rely on this file which we can't detect. But for this case, I would consider a friendly reminder in the UI as sufficient.

If we want to add the deletion feature (and I don't see a real show-stopper here), I'd propose doing it as a second step. This way the changes are smaller and more comprehensible.

In case you are interested in contributing here, this would be awesome 🤩

from streampipes.

muyangye avatar muyangye commented on July 20, 2024

Hi @bossenti @tenthe, just to let you know I am actively working on this issue. I have one question on backward compatibility though:

Currently the FileMetadata class has an internal file name and an original file name and I plan to just merge them into one file name. However, this may cause negative effects on current processes because of backward compatibility. For example, the delete functionality relies on the internal file name. After letting this delete function use the merged single file name, users will not be able to delete the old already existing files because they are not found since they use internal file names.

I propose 3 solutions:

  1. Keep the internal file name, also let fetching/deleting/other processes keep relying on internal file name. Changes are 1) when storing new files, just set internal file name to be the same as uploaded file name instead of random UUID and 2) delete original file name and change all places using it (this is safe because no modification actions rely on original file name, but some import/export/preview that relies on original file name is going to fail) to use the new internal file name. This approach solves the issue of backward compatibility and later uploads will follow the expected behavior outlined in this issue. But this causes the storage to have some (already existing) files' internal file name set to random UUID and some (after this change) files' internal file name set to uploaded file name. Also, for those existing files, users will lose original file names and see a bunch of random UUIDs in the front end which is confusing.
  2. Keep both internal file name, original file name, and don't touch any places using them. Only change is setting internal file name to original file name instead of random UUID. This causes no problems for users but the codebase looks really weird. Later developers may wonder why there's both an internal file name and an original file name since they are always the same (only looking at the code).
  3. Make an announcement to users to let them delete existing files before a release. This is better technically but is going to cause troubles to users.

Let me know which one do you prefer and I am definitely willing to discuss other solutions too!

from streampipes.

muyangye avatar muyangye commented on July 20, 2024

@tenthe @bossenti Ah yes a migration is definitely the solution here! After the migration we can simply use the merged file name as Tim said. I will split my PR into 2 PRs, 1 that contains the migration+modify necessary classes and 1 that contains the upload conflict frontend and additional REST APIs (which I have already done but I will submit it after the first PR).

There is only one last question remaining: during the migration there may be multiple files with the same original file name, but we can only have one such file. Do you have any thoughts how to deal with those files? I was thinking of renaming but I am afraid this will break other processes using the file such as an adapter.

from streampipes.

tenthe avatar tenthe commented on July 20, 2024

That's a good point. I'm not sure what the best option is here that won't break anything.
Maybe we can change the names as long as we keep the internal file ID, but we should check this first.

To do this, I would suggest creating a short list using this endpoint to see how we can deal with the problem of identical names. How do we solve this problem in the current implementation?

(The FileManager is used in the REST endpoint in PipelineElementFile)

from streampipes.

muyangye avatar muyangye commented on July 20, 2024

That's a good point. I'm not sure what the best option is here that won't break anything. Maybe we can change the names as long as we keep the internal file ID, but we should check this first.

To do this, I would suggest creating a short list using this endpoint to see how we can deal with the problem of identical names. How do we solve this problem in the current implementation?

(The FileManager is used in the REST endpoint in PipelineElementFile)

Thanks for the reply! It sounds like we have a little bit different definition of migration here. Before this quoted reply, I thought by migration you mean in FileMetadata class, remove attributes internalFilename and originalFilename, then add a new attribute filename which represents the merged filename. However, after this reply, if I understand correctly, I think you might want to keep internalFilename and originalFilename. Then to avoid users to see/upload files of duplicate originalFilename,

  1. Modify REST endpoints and ui to disallow the upload of files with duplicate originalFilename.
  2. For files already having duplicate originalFilename, initiate a migration to change their originalFilename to something else.

Either way, I think there are a lot of vague terms. Probably code explains better. I’ll submit a PR with my current understanding and we can always iterate on that :-)

from streampipes.

bossenti avatar bossenti commented on July 20, 2024

I like the proposal of #2300. How do want to handle internalFilename and originalFilename in this case? I agree that #2300 is a good compromise between this feature and stability for the end user. In the middle term, I'd like to remove the distinction code-wise as well.

from streampipes.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.