Git Product home page Git Product logo

Comments (4)

tlasica avatar tlasica commented on June 10, 2024

I see two approaches possible:
a) using smart structure instead of ArrayBuffer[Int] to do dedup during adding neighbours
b) do post processing for sorting or duplicate elimination

I assume that the signature of below function should remain unchanged to makes things easier

def readEdgesBySource(): (Int2ObjectMap[ArrayBuffer[Int]], Int2IntArrayMap) = {

Solution (a) introduces additional memory overhead for keeping both set and list in case of unsorted with duplicates elimination plus some extra memory copying at the end of the processing. It should come in zero cost if neither deduplication nor sorting is applied.

Solution (b) can do the postprocessing on each value in the edgesBySource map. It will cost additional memory per neighbours list (much less than solution (a)). Memory copying will happen also in this case for some cases e.g. dedup with original order.

Time complexity ibn both cases is similar of n * O(klogk) for sorting (when n is number of nodes and k is #of neighbours). Duplicate elimination in both cases is feasible in additional O(m) cost (m = #edges).

Do you have any suggestions which approach to follow?
I have (a) implemented and tested - as it seemed to be more elegant one :-)

from cassovary.

pankajgupta avatar pankajgupta commented on June 10, 2024

IIUC you are saying that (a) requires at least 2x memory in your implementation as you would keep both a set and a list? If so, that is just too much. You could keep only a SortedSet in (a) and then convert to Array at the end. Alternatively, post-processing as in (b) looks good too.

from cassovary.

tlasica avatar tlasica commented on June 10, 2024

Yes, in solution (a) there is twice as much memory used. Unfortunately it is impossible to use only SortedSet(a) if you want to keep the order of neighbours as given in the file. It is possible if we decide that deduplication=true and sort=false is not acceptable.

from cassovary.

pankajgupta avatar pankajgupta commented on June 10, 2024

Makes sense -- (b) sounds more flexible then, doesn't it?

On Wed, Jul 15, 2015 at 8:23 PM, Tomek Łasica [email protected]
wrote:

Yes, in solution (a) there is twice as much memory used. Unfortunately it
is impossible to use only SortedSet(a) if you want to keep the order of
neighbours as given in the file. It is possible if we decide that
deduplication=true and sort=false is not acceptable.

β€”
Reply to this email directly or view it on GitHub
#185 (comment).

from cassovary.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.