datdot-node-rust's People

Contributors

andresilva, arkpar, bkchr, cheme, chevdor, cmichi, demi-marie, expenses, gabreal, gavofyork, gnunicorn, jimpo, joshorndorff, kianenigma, kigawas, ltfschoen, marcio-diaz, mxinden, nikvolf, pepyakin, rphmeier, shawntabrizi, sorpaas, stanislav-tkach, svyatonik, thiolliere, tomaka, tomusdrw, tripleight, xlc

datdot-node-rust's Issues

bit shuffling

@todo

  • scrub out all the metadata, and reverse the bits of just the numbers in an array of indeterminate length

The issue is that we have SCALE-encoded bytes that contain a small amount of metadata, and all numbers are little-endian (LE).
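
A rough sketch of the bit-reversal part (plain JS; how the SCALE metadata gets scrubbed out first depends on the type, so that step is left out here):

// reverse the bit order of one little-endian number given as a byte array
function reverseBits (bytes) {
  const out = Buffer.alloc(bytes.length)
  for (let i = 0; i < bytes.length * 8; i++) {
    const bit = (bytes[i >> 3] >> (i & 7)) & 1             // i-th bit, LSB first
    out[out.length - 1 - (i >> 3)] |= bit << (7 - (i & 7)) // mirrored position
  }
  return out
}
// e.g. reverseBits(Buffer.from([0x01])) -> <Buffer 80>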

Issues from running substrate & service

newPin event

It seems like it doesn't log the correct user or hypercore address. All event logs look the same (same user and same key, not matching the actual hypercore key). The user account logged in the event always seems to be the one that gets finalized first (we're also creating users).

VIDEO 1 (watch first) https://www.loom.com/share/78305a4036804a14af3f67fa0c36e2f0


VIDEO 2 (watch next) https://www.loom.com/share/4bb8ab5e497c47ceb7b09734122d2688

Merkle Tree Proof deep dive.

The merkle tree (and proofs) expected in the dat_verify.rs pallet should match the tree used in hypercore-crypto/hypercore with the exception that the merkle root passed to substrate is the checksum used to calculate the signature, not the roots used to calculate it.
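
A rough sketch of where that checksum comes from on the hypercore side (assuming hypercore-crypto's tree() helper; roots are the feed's current root nodes):

const crypto = require('hypercore-crypto')

// roots: [{ index, size, hash }, ...] - the feed's current merkle roots
// crypto.tree(roots) hashes them into the single checksum that gets signed,
// and it is this checksum (not the individual roots) that substrate receives
function rootChecksum (roots) {
  return crypto.tree(roots)
}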

How it's verified in datdot-substrate

Here is the function signature of the submit_proof function:

https://github.com/playproject-io/datdot-substrate/blob/ac0e44e02c34c454c7bda58eee855de2054e34a4/bin/node/runtime/src/dat_verify.rs#L490

where the Proof type is a struct defined here:
https://github.com/playproject-io/datdot-substrate/blob/ac0e44e02c34c454c7bda58eee855de2054e34a4/bin/node/runtime/src/dat_verify.rs#L192-L196

This should match the merkle proofs returned by hypercore: https://github.com/mafintosh/hypercore/blob/1082cc5f8803f5bce65686f799784920d1426088/index.js#L537

First we verify that the proof is being submitted by the correct user:

https://github.com/playproject-io/datdot-substrate/blob/ac0e44e02c34c454c7bda58eee855de2054e34a4/bin/node/runtime/src/dat_verify.rs#L493-L496

(I am considering removing this check)

We verify that the signature provided matches the merkle root (checksum) provided and is signed by the public key associated with the challenge (currently PUBLISHER, should be ENCODER):

https://github.com/playproject-io/datdot-substrate/blob/ac0e44e02c34c454c7bda58eee855de2054e34a4/bin/node/runtime/src/dat_verify.rs#L503-L509

We verify that the chunk hash matches the one provided in the Proof by recalculating it and looking up the node with the chunk's index in the proof:

https://github.com/playproject-io/datdot-substrate/blob/ac0e44e02c34c454c7bda58eee855de2054e34a4/bin/node/runtime/src/dat_verify.rs#L520-L543

Finally, based on the index being proved, we calculate the merkle roots (using a hacky linear-time calculation to get the expected indices the roots should contain), and use them to rebuild the merkle root checksum:

https://github.com/playproject-io/datdot-substrate/blob/ac0e44e02c34c454c7bda58eee855de2054e34a4/bin/node/runtime/src/dat_verify.rs#L544-L568
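
For reference, a rough JS equivalent of that root-index calculation, using the flat-tree addressing scheme hypercore uses (flat-tree's fullRoots is also linear):

const flat = require('flat-tree')

// root indices covering the first `length` chunks, e.g. expectedRootIndices(5) -> [ 3, 8 ]
function expectedRootIndices (length) {
  return flat.fullRoots(2 * length)
}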

There is currently an oversight: the intermediary nodes of the merkle path are not verified. This would be the final step.

substrate encodings

@TODO

  • SUBSTRATE ENCODING (see the sketch below)
    hash_number - should be named: index
    hash - should be named: hash.data
    total_length - should be named: size
  • get rid of the hashtype field entirely and keep it internal to substrate, because it can be calculated/derived from the other params
    // the hash of a parent, when used in the calculation of the root hash, doesn't contain hashtype
    // => because that's part of the hash payload in the first place
    // ==> hardcode hashtype to 2
    // ==> the hashtype of the merkle root is always 2
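
A rough sketch of the mapping these renames describe (the left-hand names are the current substrate encoding; the comments show the proposed names, matching hypercore's node shape):

// node: { index, hash, size } as produced by hypercore's merkle tree
function toSubstrateNode (node) {
  return {
    hash_number: node.index,  // proposal: rename to index
    hash: node.hash,          // proposal: expose as hash.data
    total_length: node.size   // proposal: rename to size
    // hashtype deliberately omitted - derived inside substrate, hardcoded to 2 for roots
  }
}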

[runtime/idea] take advantage of instantiable modules.

I'm considering implementing #6 by turning dat_verify into an instantiable module, and making each "parallel" track a separate instance of the module - they would all call into a separate "dat_store" module.

It may also be a good idea to have the different challenge types be in different modules (maybe sharing a challenge trait).

I need to think about the internal api here though so it isn't too complex.

Phase 1 logic

@todo

  • rename stuff (see parameter naming below)
  • change stuff (see change proposals below)
  • other fixes (see #23 )

1. publisher registers new data

We get the merkle root like this: https://pastebin.com/QH7egWUX and then submit it to the chain:

await API.tx.datVerify.registerData(merkleRoot)

2. chain emits SomethingStored event

3. after the event is emitted, users can register for different roles: encoder, hoster, attestor

await API.tx.datVerify.registerEncoder()
await API.tx.datVerify.registerSeeder()
await API.tx.datVerify.registerAttestor()

4. When a user registers for any of the roles, the chain checks whether there is any data that needs hosting; if there is at least one hoster, one encoder and one attestor, 'NewPin' is emitted

5. A new event (NewPin) is emitted, notifying the encoder and hoster which feed needs hosting/encoding

const [encoderID, hosterID, datID] = event.data
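
A rough sketch of how a service might pick this event up with the polkadot-js API (section/method names are the ones used in this flow):

API.query.system.events((records) => {
  records.forEach(({ event }) => {
    if (event.section === 'datVerify' && event.method === 'NewPin') {
      const [encoderID, hosterID, datID] = event.data
      // start encoding / hosting the feed identified by datID
    }
  })
})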

6. We pair the hoster and the encoder: the encoder compresses the data and passes it over to the hoster

7. When the encoder finishes its job, it notifies the chain (registerEncoding):

const args = [hosterID, datID, start, end]
// if there are more ranges, send the same tx for each range separately
await API.tx.datVerify.registerEncoding(...args)

8. When the hoster has received all the data, it also notifies the chain (confirmHosting):

await API.tx.datVerify.confirmHosting(datID, index) // index = HostedMap.encoded[index] (get encoderID and then loop through to get position in the array)
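
A rough sketch of resolving that index first (hostedMap as a storage query is an assumption based on the comment above):

const hosted = await API.query.datVerify.hostedMap(datID)
const index = hosted.encoded.indexOf(encoderID) // position of our encoder in the array (may need decoding first)
await API.tx.datVerify.confirmHosting(datID, index)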

9. chain emits event: HostingStarted

const data = event.data // [hosterID, datID]

10. Publisher can now submitChallenge

await API.tx.datVerify.submitChallenge(hosterID, datID)

11. A Challenge event is emitted, notifying the hoster about the challenges

const [hosterKey, datKey] = event.data
// hostedMap => see which chunks are hosted by this hoster for this key (Rust API line 317)

12. Hoster submits proof to the chain

//challengeID no longer exists -> use [datID, chunk] directly
await API.tx.datVerify.submitProof(challengeID, []) //challenge ID is parsed

13. If challenges are successful, chain emits new event: AttestPhase

const [challengeID, expectedAttestors] = event.data
const attestorIDs = JSON.parse(expectedAttestors).expected_attestors

// change proposal:
// could we just pass an array of attestors instead of an object
const [challengeID, attestorIDs] = event.data

14. A random attestor is selected to go grab the data from the swarm

  • currently we just hardcode the response, so no attestor actually goes to the swarm

15. The attestor reports back to the chain (submitAttestation):

function getAttestation () {
  const location = 0
  const latency = 0
  return [location, latency]
}
const attestation = getAttestation()
// challengeID no longer exists -> use [hosterID, datID, chunk] directly
const submit = await API.tx.datVerify.submitAttestation(challengeID, attestation)

changes to the chain to match mauve's logic (mvp)

  • register a role (=offer the service) (Encoder, Attestor, Hoster) => list for each role
  • publishData

Problem:

  • registerHoster has the logic to assign the hypercore to the newly registered Hoster -> but how do we now also assign the hypercore to the encoder??
  • we should instead emit an event (selected encoders & selected hosters)

Solution?

  • additional logic in registerHoster / registerEncoder / publishData
  • each hypercore would have a count of how many hosters/encoders are available
  • on each new registerHoster or registerEncoder we check whether count_Encoder >= min && count_Hoster >= min (min = 3) => then emit an event (newHostingRequest) - see the sketch below
  • newHostingRequest emits 3 selected hosters and 3 selected encoders

  • hoster triggers hosterReady(feedID, hoster pubKey) function
  • chain emits event hosterReady
  • publisher triggers Start a Challenge
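
A rough sketch of the proposed check (pseudo-JS; countEncoders / countHosters are illustrative names, not existing storage items):

const MIN = 3
function onRoleRegistered (feed) {
  if (feed.countEncoders >= MIN && feed.countHosters >= MIN) {
    // select 3 encoders + 3 hosters for this feed and
    // emit newHostingRequest(selectedEncoders, selectedHosters, feedID)
  }
}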

Unexpected epoch change

I noticed my chain starts to behave weirdly if I let it run for a long time (it might be that I send too many transactions triggering the same function with the same account too many times)
Screenshot_2019-12-27_23-46-53

Phase 1 - July - errors

Error 3

Update

Date: July 8

Not fixed yet: I added a fix to lib.rs (created contract_id, inserted the contract into GetContractByID and emitted an event). The event now does log, but there's a new error (maybe related to me not fixing this in the correct way)

Screenshot_2020-07-07_03-06-47


Date: July 8

Scenario:
alice([user, publisher])
bob([user, hoster, attestor])
charlie([user, hoster, attestor])
dave([user, hoster, encoder])
eve([user, encoder, attestor])

Error: NewContract event doesn't get emitted

Output:
Screenshot_2020-07-07_01-55-03

Error 2

Update

Date: July 8

Fix: Had to change the data type for NoiseKey in lib.rs to H512 and it works. Updated the types and also aggregated the new types locally. Also tried with the Public type but it didn't work, I guess because the value of the NoiseKey is <Buffer 26 6f 1c df f6 c0 e6 98 c9 36 60 8f 50 b4 8d ad a4 53 82 1f 5c 46 9c 9d b5 6a bc 91 a1 47 54 3b, secure: true>
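
A rough sketch of registering the changed type locally with the polkadot-js API (the node address is an assumption):

const { ApiPromise, WsProvider } = require('@polkadot/api')

const api = await ApiPromise.create({
  provider: new WsProvider('ws://127.0.0.1:9944'),
  types: { NoiseKey: 'H512' } // matches the lib.rs change described above
})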

Logs after the fix and a rebuild
Screenshot_2020-07-07_01-39-43


Date: July 8

Scenario:
alice([user, publisher])
bob([user, hoster, attestor])
charlie([user, hoster, attestor])
dave([user, hoster, encoder])
eve([user, encoder, attestor])

Error: Struct: failed on 'noise_key'

Output:
Screenshot_2020-07-06_23-57-35

Error 1

Update:

Date: July 8

Fix: The error was related to the nonce we were passing => we need to decide where the nonce is created (locally and passed with the tx, or on chain).

Screenshot_2020-07-06_23-53-41


Date: July 7

Scenario:
log('start scenario')
alice([user, hoster])
bob([user, hoster])

Output:
Screenshot_2020-07-06_23-20-03


Date: July 7

Scenario:
log('start scenario')
alice([user])
bob([user, publisher])

Output:
Screenshot_2020-07-06_23-24-19

CI

Set up CI via GitHub Actions or another suitable framework

  • Tests #29
  • Builds
  • Wasm Builds #28

roadmapping

@todo


Wasm Builds

Build node binaries to a wasm blob importable as a js lib

What is this?

I saw this project was part of wave 4; it seems really interesting - similar to things we have thought of building at http://github.com/joystream - but the documentation is quite lacking.

What is the most complete explanation of the goal here?

make lab environment to reliably test substrate chain scenarios

@todo

  • turn little tests and experiments which involve multiple "nodes" into proper tests inside a lab

TESTING STRATEGY

  1. pull real data from dat to use as test data
  2. submit it to substrate
  3. verify it

TESTING

  • set up testing environment to spin up 1+ tests and run them

  • GOAL put our "specification" into code

  • STRATEGY:

    • use transaction-factory in substrate
    • maybe use tape to write node tests
    • write everything as "integration tests" from the JS side (see the sketch below)
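
A minimal sketch of what such a JS-side test could look like (assuming tape and the polkadot-js API; the extrinsic comes from the Phase 1 flow, the node address and dev account are assumptions):

const test = require('tape')
const { ApiPromise, WsProvider, Keyring } = require('@polkadot/api')

test('publisher can register data', async (t) => {
  const api = await ApiPromise.create({ provider: new WsProvider('ws://127.0.0.1:9944') })
  const alice = new Keyring({ type: 'sr25519' }).addFromUri('//Alice')
  const merkleRoot = [] // placeholder - in a real test this comes from hypercore, as in Phase 1
  await api.tx.datVerify.registerData(merkleRoot).signAndSend(alice)
  // assert on emitted events / resulting storage here
  t.end()
})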

CURRENT PRACTICE:

  • you usually don't spin up an entire node to run tests,
    just the parts of the node you need (storage, wasm executor, etc.) and then call into their APIs

GOAL:

  • how to spawn 1+ datdot nodes
    • => execute transactions in a controlled way
    • => check if everything went as expected
    • => and is repeatable
  • have integration tests where we can spin up multiple substrate nodes on different platforms and see if they connect to each other and work in expected ways
  • if people install datdot on windows, linux, macosx, some may rip out the internals to run it on servers and avoid the electron app
  • ...and who knows if somebody makes it work on mobile OSs

GOAL:

  1. have a way to spin up all those different OSs in the cloud
  2. install datdot, connect, and test, to reproduce problems which might occur in practice later on

TEST SCENARIO 0

  1. use forceRegistering to check basic logic

TEST SCENARIO 1

  1. register seeders
  2. register data
  3. log usersStorage and datHosters

JOSHUA'S REMARK:
just grepping the codebase for node_testing to see how it's being used,
but there are some really good benchmarks and tests in node/executor.
I think I'll adapt them for our module and just use that.
just discovered they can give us the concrete block size info you wanted

JOSHUA'S COMMENT (November 13th):
there is some testing scaffolding and you can test runtime functions individually,
for example by writing integration tests that call the RPC via rust,
but presetting state and simulating interactions is not super easy.

  • there are no network testing tools for easily writing some scenarios and executing a quick simulation like:
    • spawn a bunch of substrate nodes
    • pre-configure state
    • and then execute a fixed set of transactions to simulate what happens
    • to then run some assertions against the state after the state transition

Updates/Bug fixes

  • fix problems with PublishData (not working atm)
    When I run PublishData, I don't get any error; the chain logs this and then freezes (Ctrl-C doesn't work, I can't stop the process)
    Screenshot_2020-06-14_00-08-53

  • add HostingStarted event after confirmHosting is triggered

  • add ability to get archive index from the feed key

const archiveIndex = await api.query.datVerify.archiveIndices(dat_pubkey)

Basic Calls Flow

(Current Runtime as of 24/12/19)
Source code: https://github.com/playproject-io/datdot-substrate/blob/master/bin/node/runtime/src/dat_verify.rs

State:

No Seeders, No registered Dats

As a Dat publisher: call register_data to register the current state of your archive on-chain.
As a seeder: call register_seeder to register your intention to seed a dat. Watch for the NewPin event, which will tell you which dat archive you should be pinning. You can also query UsersStorage(AccountId) to see all dats you should be pinning simultaneously.
Every block after there are registered seeders with pinned dats, a Challenge event will be emitted by the chain.

State:

Seeder and Dat registered successfully

As a Dat publisher: no action required.
As a seeder: watch for Challenge events addressed to your AccountId.

  • Challenges are (AccountId, BlockNumber) tuples - where AccountId is the selected seeder, and BlockNumber is the deadline. (All active challenges are enumerated in the ChallengeMap linked_map storage item - (x, y), where x refers to the index of a challenge in the SelectedChallenges map, and y refers to the challenged seeder in the SelectedUsers map.) For reference, SelectedUserIndex is a mapping from AccountId -> (index, challenge count).
    If you have an active challenge as a seeder, you clear it by calling submitProof with the proof for the requested chunk (the chunk index is retrievable via the SelectedChallenges map). A sketch follows below.
    If a challenge has not been successfully cleared by the end of the deadline block, it should emit a ChallengeFailed event and punish the failed seeder.
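
A rough sketch of the seeder side of this (storage and extrinsic names are the ones described above; the exact argument shapes are assumptions):

// look up the chunk index being challenged, then answer the challenge
const challenge = await api.query.datVerify.selectedChallenges(challengeIndex)
// build the merkle proof for that chunk from the local hypercore, then submit it:
await api.tx.datVerify.submitProof(challengeID, proof)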

[runtime] refactor datdot runtime modules into frame pallets

leaving this for milestone 2:
after we have a working runtime, I need to do the minor refactoring required to move the modules into a pallet - then we can also reorganize the repo so we aren't a substrate fork (currently this is the case because it's easier to keep up with upstream this way, but after substrate 2.0 stabilizes we probably won't need to keep it that way.)
