
filefilego / filefilego


Decentralized Data Sharing Network - a peer-to-peer, censorship-resistant, and privacy-focused data-sharing network

Home Page: https://filefilego.com/

License: GNU Lesser General Public License v3.0

Languages: Go 99.88%, Makefile 0.09%, Dockerfile 0.03%
Topics: blockchain, blockchain-technology, cryptocurrency, cryptography, p2p, peer-to-peer, censorship-resistance, file-sharing, decentralized, download-manager, privacy, search-engine, storage-engine, wallet, bitcoin, ethereum, libp2p, proof-of-transfer, proof-of-existence, proof-of-stake

filefilego's Introduction

FileFileGo Decentralized Storage & Data Sharing Network


The FileFileGo protocol is a peer-to-peer storage & data-sharing network designed for the web3 era, with an incentive mechanism, full-text search and a storage engine. Its decentralized architecture enables users to store and share data without censorship or a single point of failure. By leveraging game-theory concepts, FileFileGo incentivizes participation and ensures data availability while achieving fault-tolerance and preserving privacy.

FileFileGo is an open-source community project, with no centralized control or ownership. Its coin distribution is designed to be fair, with an emission of 40 FFG per block that decreases by half every 24 months. The protocol launched without an ICO/STO/IEO or pre-mine, relying on a Proof of Authority consensus algorithm that will eventually transition to Proof of Stake to allow more stakeholders to participate.

By supporting FileFileGo, users can help promote digital rights, privacy, freedom of information, and net neutrality. We encourage contributions and innovative ideas to ensure that the internet remains an open and decentralized platform.

The Innovation: Proof of Transfer with Zero-Knowledge Proof (ZK-Proof)

Problem

Let us suppose that node_1 needs to download some data_x, owned by node_2, and pay the fees required by node_2. What happens in the case of Byzantine fault nodes? How do we verify successful data transfer to destination nodes and prevent the following malicious cases:

  1. node_1 is a dishonest node that reports data_x as invalid, to avoid paying the fees.
  2. node_2 is a dishonest node that serves data_y to node_1 and claims that it's data_x.

Solution

The network can resist Byzantine faults if node_x can broadcast (peer-to-peer) a value x, and satisfy the following:

  1. If node_x is an honest node, then all honest nodes agree on the value x.
  2. In any case, all honest nodes agree on the same value y.

The Proof of Transfer mechanism addresses the aforementioned issues by enabling honest nodes in the network to verify and reach consensus on the successful transfer of data_x from node_2 to node_1. This is accomplished through the use of verifiers, which are responsible for challenging participating nodes. While a straightforward approach would involve sending the required data to a verifier and then forwarding it to the destination node, this method can lead to bandwidth and storage bottlenecks, thereby reducing the overall network throughput. Therefore, the Proof of Transfer solution has been designed to minimize the bandwidth and storage/memory requirements associated with this process.

              ┌───────────┐
     ┌────────►[verifiers]◄─────────┐
     │        └───────────┘         │
┌────┴───┐                     ┌────┴───┐
│        │                     │        │
│ node_1 ◄─────────────────────► node_2 │
│        │                     │        │
└────────┘                     ├────────┤
                               │ data_x │
                               └────────┘

Algorithm

Let $x$ be the input file containing content divided into $N = 1024$ segments.

  1. Divide the content of file $x$ into $N$ segments: $x = (x_1, x_2, \ldots, x_N)$

  2. Calculate the Merkle Tree hash of the segments: Let $h(x_i)$ represent the hash of segment $x_i$. Construct the Merkle Tree by hashing adjacent segments in a binary tree structure:

    $h(x_i) = \text{HashFunction}(x_i)$, $h(x_{i,j}) = \text{HashFunction}(h(x_i) \,|\, h(x_j))$, where $|$ denotes concatenation. Hashing adjacent pairs recursively up the binary tree yields the root hash $h_{\text{root}}$, which represents the overall content.

  3. Shuffle the segments: Let $\pi$ be a permutation representing the shuffling of segments, $\pi : \{1, 2, \ldots, N\} \rightarrow \{1, 2, \ldots, N\}$. The shuffled segments are: $x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(N)}$

  4. Encrypt 1 percent of the shuffled segments: Let $M = \lfloor 0.01 \times N \rfloor$ be the number of segments to be encrypted. Let $E(x_i)$ represent the encryption of segment $x_i$. The encrypted segments are: $E(x_{\pi(1)}), E(x_{\pi(2)}), \ldots, E(x_{\pi(M)})$
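
A minimal Go sketch of steps 1, 3, and 4 above (segmentation, shuffling, and picking the 1 percent of segments to encrypt). The segment size, the permutation source, and the omission of the actual encryption step are simplifications for illustration, not the protocol's real implementation:

package main

import (
	"fmt"
	"math/rand"
)

// splitIntoSegments divides the content into n roughly equal segments (step 1).
func splitIntoSegments(content []byte, n int) [][]byte {
	size := (len(content) + n - 1) / n
	segs := make([][]byte, 0, n)
	for start := 0; start < len(content); start += size {
		end := start + size
		if end > len(content) {
			end = len(content)
		}
		segs = append(segs, content[start:end])
	}
	return segs
}

func main() {
	content := make([]byte, 1<<20) // stand-in for the file x
	const n = 1024

	segments := splitIntoSegments(content, n)

	// Step 3: pi is a random permutation of the segment indices.
	pi := rand.Perm(len(segments))

	// Step 4: only the first M = floor(0.01 * N) shuffled segments get encrypted.
	m := len(segments) / 100
	toEncrypt := pi[:m]

	fmt.Println("segments:", len(segments), "marked for encryption:", len(toEncrypt))
}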

The verification process

  1. Decryption of Encrypted Segments: For each of the $M$ encrypted segments, apply the decryption function $D(E(x_{\pi(i)}))$ to obtain the decrypted version of the segment $x_{\pi(i)}$:

    $x_{\pi(i)}' = D(E(x_{\pi(i)}))$

  2. Restoring the Shuffled Order: Since the segments were shuffled during the encryption process, they need to be restored to their original order using the inverse permutation $\pi^{-1}$:

    $x' = (x_{\pi^{-1}(1)}', x_{\pi^{-1}(2)}', \ldots, x_{\pi^{-1}(M)}')$

  3. Merkle Tree Hash Calculation: Recalculate the Merkle Tree hash of the decrypted segments in the restored order. Construct the hash tree similarly to the original construction, but use the decrypted segments $x'$:

    $h'(x_i') = \text{HashFunction}(x_i')$ $h'(x_{i,j}') = \text{HashFunction}(h'(x_i') | h'(x_j'))$

  4. Finally, the derived Merkle root hash $h'_{\text{root}}$ is obtained by hashing its two children: $h'_{\text{root}} = \text{HashFunction}(h'(x_{1,2}') \,|\, h'(x_{3,4}'))$.

  5. Consensus is achieved if the derived merkle root hash matches the original merkle root hash.

Calculating Merkle Root Hash

Consider a file with the following content:

FileFileGo_Network

When a file is uploaded to a storage provider, its merkle root hash is computed by splitting the content into distinct data segments.

The illustration below shows a simplified view of the file's layout on the storage medium. Each box represents 1 byte of data.

┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
│ F │ i │ l │ e │ F │ i │ l │ e │ G │ o │ _ │ N │ e │ t │ w │ o │ r │ k │
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘

To find the merkle root hash of this file, we break it into smaller parts. For example, let's split it into nine segments of two bytes each.


    0       1       2       3       4       5      6       7        8
┌───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┐
│ F   i │ l   e │ F   i │ l   e │ G   o │ _   N │ e   t │ w   o │ r   k │
└───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┘

Now we take the hash of each segment:

segment 0: hash("Fi"), denoted by h0
segment 1: hash("le"), denoted by h1
segment 2: hash("Fi"), denoted by h2
segment 3: hash("le"), denoted by h3
segment 4: hash("Go"), denoted by h4
segment 5: hash("_N"), denoted by h5
segment 6: hash("et"), denoted by h6
segment 7: hash("wo"), denoted by h7
segment 8: hash("rk"), denoted by h8

and then we calculate the merkle root hash of the file by applying the algorithm.

Here's an example of how this algorithm operates:

            ┌───┬───┬───┬───┬───┬───┬───┬───┐
Data Blocks:│ a │ b │ c │ d │ e │ f │ g │ h │
            └───┴───┴───┴───┴───┴───┴───┴───┘
              0   1   2   3   4   5   6   7
              │   │   │   │   │   │   │   │
              └───┘   └───┘   └───┘   └───┘
               h01     h23     h45     h67
                │       │       │       │
                └───────┘       └───────┘
                h(h01+h23)     h(h45+h67)
                    │               │
                    │               │
                    └───────────────┘
         Merkle root:  h(h(h01+h23)+h(h45+h67))

Now, we possess a merkle root hash for the file, represented as mt(f), which is essentially another hash value.
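
A runnable Go sketch of this computation for the example file. SHA-256 and the carry-up rule for an odd node count are assumptions made for illustration; FileFileGo's actual hash function and padding convention may differ:

package main

import (
	"crypto/sha256"
	"fmt"
)

// merkleRoot pairs adjacent hashes and re-hashes until one hash remains.
// When a level has an odd number of nodes, the last one is carried up as-is
// (one of several common conventions, used here only for illustration).
func merkleRoot(leaves [][]byte) []byte {
	if len(leaves) == 0 {
		return nil
	}
	level := leaves
	for len(level) > 1 {
		var next [][]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 < len(level) {
				h := sha256.Sum256(append(append([]byte{}, level[i]...), level[i+1]...))
				next = append(next, h[:])
			} else {
				next = append(next, level[i])
			}
		}
		level = next
	}
	return level[0]
}

func main() {
	content := []byte("FileFileGo_Network")

	// Split into 2-byte segments and hash each one (h0..h8 in the text).
	var leaves [][]byte
	for i := 0; i < len(content); i += 2 {
		h := sha256.Sum256(content[i : i+2])
		leaves = append(leaves, h[:])
	}

	fmt.Printf("mt(f) = %x\n", merkleRoot(leaves))
}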

Data Request

When a request to retrieve data reaches a storage provider, the provider rearranges the data segments in a random order. For instance, consider the sequence of data segments:

random segments [ 1, 5, 2, 4, 7, 6, 3, 0, 8 ], which translates to the following arrangement:


   1       5        2       4      7        6       3       0       8

┌───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┐
│ l   e │ _   N │ F   i │ G   o │ w   o │ e   t │ l   e │ F   i │ r   k │
└───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┘

Subsequently, the provider generates a symmetric key and initialization vector (IV) and encrypts a portion of these segments. In this illustration we encrypt 25% of the segments, which works out to 2 of the 9. Furthermore, we'll encrypt every fourth segment, meaning the original segments 0 and 4 are the ones encrypted:

                       25% Segment Encryption = 2 segments

┌───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┐
│ l   e │ _   N │ F   i │ *   * │ w   o │ e   t │ l   e │ *   * │ r   k │
└───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┘

This data is then provided to the data requester. At the same time, the key/IV, the randomized order of segments, and the contents of segments 0 and 4 are transmitted to the data verifier. It is important to highlight that the downloader has zero knowledge of both the order of segments within the file and the encryption key/IV.
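
The following Go sketch mimics this step on the example segments, encrypting the shuffled positions that hold the original segments 0 and 4 with a freshly generated key and IV. AES-CTR is an assumption used here for illustration; the protocol only requires some symmetric scheme keyed by a key/IV pair:

package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

func main() {
	// The shuffled 2-byte segments in the order [1 5 2 4 7 6 3 0 8] from above.
	segments := [][]byte{
		[]byte("le"), []byte("_N"), []byte("Fi"), []byte("Go"), []byte("wo"),
		[]byte("et"), []byte("le"), []byte("Fi"), []byte("rk"),
	}

	// The provider generates a fresh symmetric key and IV.
	key := make([]byte, 32)
	iv := make([]byte, aes.BlockSize)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	if _, err := rand.Read(iv); err != nil {
		panic(err)
	}

	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	stream := cipher.NewCTR(block, iv)

	// Encrypt the shuffled positions 3 and 7, which hold the original
	// segments 4 and 0; everything else is served as plaintext.
	for _, pos := range []int{3, 7} {
		stream.XORKeyStream(segments[pos], segments[pos])
	}

	fmt.Printf("segments handed to the requester: %q\n", segments)
}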

You might be concerned that someone could script through different combinations of segments to determine the original order, turning this into a security vulnerability and a potential attack.

To provide further insight, consider that a file is divided into approximately 1024 segments (or slightly fewer) in a real-world scenario, and these segments are then randomized. For an attacker to reconstruct the original segment order, they would need to carry out a "permutation without repetition." The total count of ways to arrange these file segments is given by n! (factorial), which amounts to 1024! in this instance. (https://coolconversion.com/math/factorial/What-is-the-factorial-of_1024_%3F)
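
For a rough sense of scale (using Stirling's approximation, added here for illustration): $\ln(1024!) \approx 1024\ln 1024 - 1024 + \tfrac{1}{2}\ln(2\pi \cdot 1024) \approx 6078$, so $1024! \approx 10^{2640}$ possible orderings.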

The attacker's next step would be to try to acquire the key and IV used to encrypt the two segments; with no known practical attack against the encryption, this is currently considered computationally infeasible.

Following this, the file downloader must request the encryption key/IV and the randomized order of file segments from a designated data verifier within the network.

Verification

The data downloader sends a request to the data verifier, seeking the encryption key/IV and the randomized segments. This request is accompanied by the segment hashes of the downloaded file, which are presented as follows:

h1
h5
h2
h(enc(4))
h7
h6
h3
h(enc(0))
h8

The data verifier encrypts and hashes its own copies of segments 0 and 4, checks the results against the submitted h(enc(0)) and h(enc(4)), and then substitutes the plaintext segment hashes, resulting in the following values:

h1
h5
h2
h4
h7
h6
h3
h0
h8

Lastly, the data verifier uses the randomized order that the file hoster generated during the transfer to put the segment hashes back into their original sequence:

h0
h1
h2
h3
h4
h5
h6
h7
h8

Finally, by running the merkle root computation over these hashes, the data verifier derives the original merkle root hash without needing local access to the entire file content.

Once the derived merkle root hash matches the original one, we have effectively established a mathematical proof that the data downloader possesses all the requested data. The data verifier then transmits the encryption key/IV and the randomized segment order to the data downloader, which leads to the automatic release of fees to the file hoster.
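
A small Go illustration of the reordering step for the nine example hashes; the hash values are stand-in labels rather than real digests:

package main

import "fmt"

func main() {
	// The randomized order the file hoster used when serving the segments.
	order := []int{1, 5, 2, 4, 7, 6, 3, 0, 8}

	// Segment hashes as received/recomputed by the verifier, in that order.
	received := []string{"h1", "h5", "h2", "h4", "h7", "h6", "h3", "h0", "h8"}

	// Put each hash back at its original index before recomputing the root.
	original := make([]string, len(received))
	for i, idx := range order {
		original[idx] = received[i]
	}
	fmt.Println(original) // [h0 h1 h2 h3 h4 h5 h6 h7 h8]
}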

Flow

In this section, the complete life cycle of a data transfer verification is demonstrated.

  1. Data Discovery: Numerous wire protocols have been developed to facilitate communication among nodes in a network. Among these protocols, the Data Query protocol is often the first utilized by nodes. This protocol enables nodes to broadcast queries throughout a gossip channel and retrieve responses via direct communication. Essentially, a node sends a request inquiring about which node hosts a particular piece of data.
             1. Data Query Request
                    ┌───────┐
    ┌───────────────►[nodes]├───────────────┐
    │               └───────┘               │
┌───┴────┐                             ┌────▼───┐
│        │                             │        │
│ node_1 │                             │ node_2 │
│        │                             │        │
└───▲────┘                             └───┬────┘
    │        2. Data Query Response        │
    └──────────────────────────────────────┘
  2. Smart Contract: The Data Query Response payload contains all the information needed to prepare a smart contract transaction. This transaction is then broadcast to the network and picked up by a verifier.
┌──────────────────────────────────────┐
│              TRANSACTION             │
├──────────────────────────────────────┤
│  Data :                              │
│        - Data query response         │
│        - Remote node signature       │
│  Value:                              │
│        - Fees required by node       │
│                                      │
│  Fees :                              │
│        - Fees collected by verifier  │
│                                      │
│  To   :                              │
│        - Network verifier            │
└──────────────────────────────────────┘

  3. Verification: A verifier (v1) communicates with the participating nodes and generates a challenge for the node that hosts the data (node_2). The challenge consists of the following steps:
  • node_2 should create a Merkle tree that matches the original Merkle root of data_x uploaded in the first place.
  • v1 decides the order and the number of blocks/data ranges to be sent to node_1 by node_2. We don't want to reveal the order of blocks to node_1 yet.
  • v1 asks node_2 for a fixed range of data, which v1 encrypts with a random key k1 to produce data_enc and sends to node_1.

In this stage, node_1 possesses some data_z and data_enc but lacks the knowledge of how to combine them to obtain the original file. The verifier, v1, verifies the integrity of the data transmitted to node_1 and, if it matches the original Merkle tree's identity, provides the decryption key k1 to node_1. Additionally, the block order is sent to node_1, enabling the reassembly of all the parts into the original data. Once this process is complete, v1 releases the fees to node_2.

The use of this algorithm enables the simultaneous attainment of Proof of Transfer and Proof of Data Possession.
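
As a rough sketch only, the transaction shown in the Smart Contract step above could be modelled with a struct like the following; the field names and types are hypothetical and simply mirror the diagram, not FileFileGo's actual data structures:

package main

import "fmt"

// DownloadContractTransaction mirrors the TRANSACTION diagram above
// (hypothetical names, for illustration only).
type DownloadContractTransaction struct {
	DataQueryResponse   []byte // response returned by the file hoster
	RemoteNodeSignature []byte // signature of the responding node
	Value               uint64 // fees required by the file hoster
	Fees                uint64 // fees collected by the network verifier
	To                  string // address of the network verifier
}

func main() {
	tx := DownloadContractTransaction{To: "verifier_address"}
	fmt.Printf("%+v\n", tx)
}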

Installation

Follow the instructions to compile and install filefilego

https://filefilego.com/documentation/docs/installation.html#prerequisites

FileFileGo Components

Features

FileFileGo is a decentralized network that incorporates the robustness of Blockchain/Cryptocurrency, DHT, and BitTorrent's innovative technology to form an unassailable infrastructure.

  • The platform employs Blockchain technology for indexing and other network metadata and logic, ensuring a secure and efficient system.
  • Encrypted traffic protects user data from third-party traffic inspection, while a privacy-centric design relays traffic through a set of intermediate peers.
  • The peer-to-peer design replicates the network's state on each full node, enhancing data reliability.
  • The network's native cryptocurrency serves as the "fuel" and guarantees an extremely low and conditional transaction fee compared to Ethereum/Bitcoin.
  • With a dynamic block size and block-time of 10 seconds, FileFileGo ensures quick and seamless transactions.
  • FileFileGo also offers an RPC interface that allows developers to build DApps on the network.

Blockchain Consensus Algorithm

To achieve a block-time of 10 seconds, FileFileGo requires a consensus algorithm that is both efficient in processing a high volume of transactions and conserves processing power. For the initial phase, we have selected Proof of Authority (PoA) as our consensus algorithm. In the future, a Proof of Stake (PoS) mechanism will replace the current algorithm.

Using PoW-based algorithms for new blockchains poses a risk, as there are already substantial pools of computing power available that could be used for 51% attacks. Therefore, we have opted for PoA, which is safe by design and provides the necessary efficiency to support our high transaction volume requirements.

Proof of Authority / Validator+Verifier Algorithms

The identities of validators are hardcoded into the blockchain and can be verified by examining the Genesis block's coinbase transaction. Participating nodes can easily verify the authenticity of these identities by checking the block's signatures.

Proof of Stake

As we move forward, the current PoA mechanism will be replaced by proof-of-stake to enable multiple parties to participate in the block mining process. Our goal for blockchain governance is to encourage more parties and developers to become involved and increase stakeholder engagement. One of the incentives for achieving this goal is the Proof-of-Stake mechanism.

Blockchain and Metadata/Accounting

To simplify transaction and state mutation, FileFileGo adopts a different approach than UTXO-like structures. Rather than using such structures, we store accounting and metadata as regular database rows, while retaining the raw blocks in their original format within the database. This approach helps to eliminate unnecessary complexity.

Technical Details

In this section, we will provide an overview of technical terms and concepts used in FileFileGo.

Channels

Channels in FileFileGo enable users to organize and group data into distinct buckets or folders. For instance, all content on ubuntu.com could be placed in a channel named "Ubuntu Official." The user who creates a channel receives all permissions necessary for updates and other channel-related operations.

Channels are structured in a node-chain format and can be identified as a node without a ParentHash.

Sub Channel

A sub-channel categorizes data even further: for instance, documents, pictures, or music.

Entry & File/Directory

In filefilego, an Entry represents a post or a piece of data that carries information about the entry itself rather than categorization/ordering. Files and Directories can be placed into an Entry.

Data Storage Layer

The Storage Engine is the storage layer that tracks binary data, which hash pointers within the blockchain use to refer to a piece of data. The NodeItem structure has a field called FileHash which refers to the binary hash and takes the form "{HASH_ALGORITHM}:>{DATA_HASH}". We keep the hashing algorithm as metadata because it might be useful in the future.
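
A small helper sketch for splitting that FileHash format; the function name and the sample value are illustrative only:

package main

import (
	"fmt"
	"strings"
)

// splitFileHash separates the hash-algorithm prefix from the data hash in the
// "{HASH_ALGORITHM}:>{DATA_HASH}" form described above.
func splitFileHash(fileHash string) (algo, dataHash string, ok bool) {
	parts := strings.SplitN(fileHash, ":>", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	algo, hash, ok := splitFileHash("sha256:>9f2b")
	fmt.Println(algo, hash, ok)
}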

Full-text Index/Search

In FileFileGo, search accuracy and flexibility are equally important as the core blockchain functionality. We aim to enable users to construct complex queries, including binary searches, using a specific query language. For example, queries of the following types should be possible:

  1. Required or inclusive ("filefilego coin"), which means both "filefilego" and "coin" are required in the search results.
  2. Optional or exclusive ("filefilego currency"), which means one of those words may be missing from the search results.

Developing a query language that supports such complex queries is a powerful tool that can significantly enhance the search engine's accuracy.

It is also possible to enable a node's full-text indexing functionality using the --search CLI flag.

Storage Engine

The storage layer keeps track of binary files and uses hashes to represent a piece of information within the blockchain. This feature can be turned on by using the following flags:

... --storage --storage_dir="/somewhere/to/store/data" --storage_token="somelongtokenhere" --storage_fees_byte="10000" ...

--storage_dir should be an existing directory with appropriate read/write permissions. Please note that full nodes can work without this mechanism. --storage_token is a token that grants admin rights, so its holder can create other tokens using the HTTP API; this is useful when web apps or distinct users need access rights. --storage_fees_byte="10000" is the fee charged per byte of data.

Coin Distribution

The Coin

Unit                 Value
FFGOne               1
KFFG                 1,000
MFFG                 1,000,000
GFFG                 1,000,000,000
MicroFFG             1,000,000,000,000
MiliFFG              1,000,000,000,000,000
FFG (Default unit)   1,000,000,000,000,000,000
ZFFG                 1,000,000,000,000,000,000,000

Total Supply: 500 Million FFG
Validation/Stake Reward: 40 FFG per Block
Supply Decrease Rate: Divide by 2 every 24 months
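
A quick Go illustration of converting between units using the table above (the 40 FFG per-block reward expressed in FFGOne):

package main

import (
	"fmt"
	"math/big"
)

func main() {
	// 1 FFG (the default unit) equals 10^18 FFGOne, per the unit table above.
	oneFFG := new(big.Int).Exp(big.NewInt(10), big.NewInt(18), nil)

	// Express the 40 FFG block reward in FFGOne.
	reward := new(big.Int).Mul(big.NewInt(40), oneFFG)
	fmt.Println(reward.String()) // 40000000000000000000
}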

filefilego's People

Contributors

comethx, connector, dependabot[bot], filefilegoadmin, indicat0r, juneezee, koschos, marcoperp, omahs, vijayraghav-io


filefilego's Issues

Document off-chain download contracts

The protocol doesn't track the data transfers between nodes. It tracks a list of contract hashes that is exchanged between participating nodes. We should document this design properly to show the advantages and the improvements to privacy.

Add extra methods to channel rpc api

Add the following:

  1. GetNodeItem(ctx context.Context, nodeHash string): given a node hash, return the node item data
  2. ExtractFilesFromEntryFolder(ctx context.Context, nodes string): return a list of node items of type FILE

Block filenames, entries, channels, subchannels etc with sensitive names

The idea is to block sensitive keywords in transaction contents (channels, entries, filenames, sub-channels, folders, etc.) and in filenames being uploaded to the server.

In order to achieve this, an extensive list of keywords is needed. A regex would be useful to block this type of content.

Transactions by address pagination should be sorted from newest to oldest

The pagination should include the most recent transactions first. goleveldb supports seeking, so a quick solution is to use an iterator, jump to the last entry, and start reading using .Prev().

Important: using .Last() and then starting a for iter.Prev() loop will skip the last entry, so inside the loop go forward one element if the index is zero.
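
A minimal sketch of the suggested approach with goleveldb's iterator, reading the entry positioned by Last() before walking backwards with Prev(); the database path and key layout are placeholders:

package main

import (
	"fmt"

	"github.com/syndtr/goleveldb/leveldb"
)

func main() {
	db, err := leveldb.OpenFile("txdb", nil)
	if err != nil {
		panic(err)
	}
	defer db.Close()

	iter := db.NewIterator(nil, nil)
	defer iter.Release()

	// Jump past the end, then walk backwards so the newest keys come first.
	// Note the caveat above: Last() positions on the final entry, so it must
	// be read before the first Prev() call.
	if iter.Last() {
		fmt.Printf("%s\n", iter.Key())
		for iter.Prev() {
			fmt.Printf("%s\n", iter.Key())
		}
	}
	if err := iter.Error(); err != nil {
		panic(err)
	}
}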

E2E testing using SDK

Scenario:
Under /test/e2e, create a test that uses the Go SDK to spin up 2 file hoster nodes, 2 data verifiers, 1 block verifier, and 1 data requester node, and write scenarios that interact with these nodes.

Add domain type for NodeItem so we can encode bytes to hex

In GetNodeItem(r *http.Request, args *GetNodeItemArgs, response *GetNodeItemResponse) error we return the protobuf NodeItem, whose []byte fields will be base64-encoded when marshalled using the JSON marshaller.

The idea is to create a domain type and transform it from the protobuf so the byte fields can be hex-encoded.

Add excluded rpc methods to rpc namespaces

The goal is to allow node operators to have better control over which RPC methods are allowed/disallowed.
This can be achieved by a global middleware (similar to the one for CORS in main.go) which inspects the request before it's dispatched to the controller and stops the request if the method is not allowed.

File size agnostic when downloading

When downloading a file, the size is passed to the client RPC method; this is not necessary since the file size can be derived from the contract.

Fix Reflected cross-site scripting

We return JSON; however, we could sanitize any HTML there.

func writeHeaderPayload(w http.ResponseWriter, status int, payload string) {
	w.WriteHeader(status)
	// nolint:errcheck
	w.Write([]byte(payload))
}
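
One possible mitigation (a sketch, not the adopted fix) is to declare the response as JSON and disable MIME sniffing so browsers never interpret reflected input as HTML; JSON-encoding or sanitizing the payload before writing would harden this further:

package rpc // package name is illustrative only

import "net/http"

func writeHeaderPayload(w http.ResponseWriter, status int, payload string) {
	// Setting these headers before WriteHeader keeps browsers from rendering
	// the body as HTML even if it contains reflected markup.
	w.Header().Set("Content-Type", "application/json")
	w.Header().Set("X-Content-Type-Options", "nosniff")
	w.WriteHeader(status)
	// nolint:errcheck
	w.Write([]byte(payload))
}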

Fix syncing with slow nodes

Some nodes on the network are extremely slow, and syncing with them severely affects the overall syncing time.

We need to implement a way to prioritise nodes with high speed

No retry when sending data to verifier

Hi

In data_verification.go, inside handleIncomingFileTransfer(s network.Stream), the data is sent to the verifier without a retry mechanism. I think there should be a way to retry in case the network fails.

Resumable file downloads

Currently there is functionality to download a file using byte ranges and a multi-threaded approach to download the content concurrently. An additional mechanism is needed to resume the file download from where it was interrupted for each file part, since each worker downloads a list of ranges. The information regarding which part and offset are being downloaded can be found within the contractStore and the DownloadFile function in the rpc and data verifier protocols.

Extend data verification protocol to avoid transferring unencrypted file segments

Download contracts are created and signed by a verifier. The file hoster sends a list of unencrypted file segments to the verifier, which will always be in a deterministic order; therefore, if there was a recent file verification and the verifier already has these file segments, there is no need to retransfer this data to the data verifier.

Steps:

  1. Extend the protocol to allow the file hoster to query the data verifier whether the unencrypted segments are needed
  2. If not needed, don't send any data except the key, IV, and the randomized segments

Document the cli client commands

We have a list of commands that can be run by the cli (./filefilego client help):

   endpoint                         endpoint http://localhost:8090/rpc
   upload                           upload <filepath>
   get_storage_token                get_storage_token <admin_token>
   balance                          balance <address>
   send_transaction                 send_transaction <access_token> <nounce> <data> <from_address> <to_address> <tx_value> <tx_fees>
   unlock_node_identity             unlock_node_identity <passphrase>
   query                            query filehash1,filehash2,filehash3
   responses                        responses <data_query_request_hash>
   create_contracts                 create_contracts <data_query_request_hash>
   create_send_tx_with_contracts    create_send_tx_with_contracts <contract_hash1,contract_hash2> <jwt_access_token> <current_nounce> <each_tx_fee>
   download                         download <contract_hash1> <file_hash> <file_size>
   send_file_signature_to_verifier  send_file_signature_to_verifier <contract_hash1> <file_hash>
   decrypt_files                    decrypt_files <contract_hash> <file_hash1,file_hash2> <restore_full_path_file1,restore_full_path_file2>
   host_info                        host_info

Document what these commands do.

Fix redownload functionality

The redownload functionality must take into account data that has already been downloaded to the store. This must be handled properly in case it's a resume download and not a redownload (#62).

Handle resumable downloads

In func (d *Protocol) RequestFileTransfer(...) we should incorporate a resumable download mechanism.

We rely on file segments to perform merkle hashes and data verification. The current implementation only took into account downloading whole file block ranges rather than specific byte ranges.

The problem with this approach is that a client with an unstable internet connection can easily be interrupted, requiring a file block to be redownloaded. Now imagine a file block of 500 MB: it would be extremely inefficient to restart the download from the beginning, and in practice a connection might drop many times for whatever reason.

The final design MUST allow for file ranges like the HTTP protocol. We MUST allow byte range requests and continue to support the encryption scheme for PoX.

The solution to this problem is:

  1. Introduce from - to byte range request
  2. For a given byte range request, we would like to see which file segments need to be encrypted
  3. For instance, if we have a file block of 8 bytes and we need the last 4 bytes, we would have to start our encryptor and encrypt from the beginning of the block, XOR it with the key and IV, then continue with the remaining 4 bytes that we have, XOR them, and send those XORed 4 bytes back to the user.
  4. With the above design we can support byte range requests

On the client side, it is good practice for the from - to ranges to align with the start and end of a file block segment, because this decreases the amount of processing and extra XORing. If not, the algorithm still works normally, but at the cost of extra XOR operations.
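
A small sketch of the range-to-segment arithmetic described above; the segment size and the inclusive "to" bound are assumptions for illustration:

package main

import "fmt"

// segmentSpan returns which fixed-size segments a byte range [from, to]
// touches and the offset of `from` inside the first of those segments. The
// hoster would run its keystream from the start of the first segment up to
// `from`, discard that prefix, and only XOR/serve the requested bytes.
func segmentSpan(from, to, segmentSize int64) (firstSeg, lastSeg, offsetInFirst int64) {
	firstSeg = from / segmentSize
	lastSeg = to / segmentSize
	offsetInFirst = from % segmentSize
	return
}

func main() {
	// Example from the issue: an 8-byte block where only the last 4 bytes are
	// requested; the encryptor still starts at the beginning of the block.
	first, last, off := segmentSpan(4, 7, 8)
	fmt.Println(first, last, off) // 0 0 4
}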

Add CORS to rpc endpoint

We should add global CORS support to the JSON-RPC server so web applications running in the browser can communicate with the node.
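
A minimal sketch of such a global middleware using net/http; the allowed origin, headers, and port are placeholders and would need to be configurable:

package main

import "net/http"

// corsMiddleware adds permissive CORS headers and answers preflight requests
// before handing the request to the JSON-RPC handler.
func corsMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Access-Control-Allow-Origin", "*")
		w.Header().Set("Access-Control-Allow-Methods", "POST, OPTIONS")
		w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization")
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	rpcHandler := http.NewServeMux() // stand-in for the node's JSON-RPC handler
	if err := http.ListenAndServe(":8090", corsMiddleware(rpcHandler)); err != nil {
		panic(err)
	}
}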

Add data verification rpc API

Create a JSON-RPC API under /rpc which exposes the following APIs/functionalities:

  1. Send data query request ✓
  2. Send a data query response check ✓
  3. Send a call to verifiers to get data query responses (useful when requester is behind NAT and can't be called by the file hoster) ✓
  4. Create a download contract from the host responses by combining the available responses and create multiple contracts if needed and send contracts to verifier for signing ✓
  5. Send contracts to the participating nodes ✓
  6. Send a transaction with the download contract details ( this could be done from the wallet since everything can be in a transaction payload)
  7. Download file given a contract hash and filehash ✓
  8. Get download stats of a file ✓
  9. Send merkle tree nodes of a downloaded file ✓
  10. Send key, IV, and raw data to verifier (this process is started when a file hoster is contacted by the file requester; at the same time data is sent to the verifier and to the downloader) ✓
  11. File decryption by asking data encryption keys for a contract ✓

Improve data query request hash to avoid duplicated data query hashes

In a data query response we use the data query request's hash in order to link the request with the response. The hash is calculated from the requested file hashes and the requester's peer ID.

Solution: Add an extra field to the data query request so we get a unique hash when 2 different requests are made for the same set of files.

Calling fileDownloader1Client.RequestEncryptionDataFromVerifierAndDecrypt more than 1 time doesn't behave correctly

Describe the bug
The first call to fileDownloader1Client.RequestEncryptionDataFromVerifierAndDecrypt works properly; however, since the source file is already decrypted on the client side, calling it another time tries to XOR the supposedly-encrypted file segments, which are already decrypted.

To Reproduce
In the E2E test you can find fileDownloader1Client.RequestEncryptionDataFromVerifierAndDecrypt and call it twice.

Expected behavior
When multiple calls are executed, check if the data was already decrypted so we avoid XORing the file segments again.

Release fees to file hoster from verifier

We should release the download contract fees when a file requester requests the transfer of encryption data.

Steps:

  1. When fees are released, we should record that in the db/in-memory store so that, if subsequent requests arrive, we guard against releasing the fees multiple times.
  2. Check if all file hashes needed from the contract are delivered and verified
  3. Check the fees
  4. Release the funds

DataVerificationProtocol.verifyContract should check for valid fees

We need to check the fees in the transaction value against all the contracts together to see if the submitted amount matches all the required contract fees. For each contract, the host sets a fee per GB, so we need to calculate the size of the files and the required fees.

Report file transfer progress between nodes

When a file is transferred to a requester node, we would like to report the total bytes transferred so it can be queried via the RPC method by the wallet or another app.

Inside the contract store, there should be an additional field to indicate the total bytes transferred.

Windows bug with file uploading

Describe the bug
When a file is being uploaded, there is a bug on Windows:

failed to move uploaded file: rename C:\Users\ffg\Downloads\storage/2023-04-18/0x1b3577852f49be16041952fc088687c937a348fc74b869c101119bbf89abc102da63f3256960ebb6 C:\Users\ffg\Downloads\storage/2023-04-18/e9f42a242416c6f91d45b0464ff24ea8f617e26d: The process cannot access the file because it is being used by another process."

To Reproduce
Just set up a storage node and upload content

Expected behavior
Shouldn't fail

Add linter to github actions

There is a Makefile command, make lint, which calls the linter with the desired settings. We should be able to use this as part of our CI pipeline.

Create Channel NodeItems through SDK and RPC

The idea is that, given the channel NodeItem info, the RPC/SDK creates a transaction and broadcasts it. There are 2 possible approaches: if we want more control over the process, the function could return a raw transaction and let the client decide when to send it; alternatively, the function could take care of everything in one call.

Transaction pool transfer protocol

We need an additional protocol to transfer transactions from one node's pool to another. This is useful in cases where gossip fails to propagate the message to the network. The gossip pubsub doesn't guarantee delivery.

Project Still Active?

Hello,

I would like to discuss this project with you more and to also see if it is still active.

Best regards, and looking forward to hearing from you.

Add pagination to GetAddressTransactions

Adding pagination to GetAddressTransactions(address []byte) ([]transaction.Transaction, []uint64, error) is necessary.
A lightweight client could utilize the JSON-RPC interface and paginate through transactions.

Introduce super light node protocols

A super light node doesn't store a local blockchain but can query the rest of the network for blockchain state, including address state (balance and nonce of an address) and channel metadata. The specification of the super light node can be found in the documentation.

Note that a super light node is not trustless and relies on the rest of the network. When querying the network using these protocols, we should aggregate results from several nodes and decide on the state based on what the majority of the nodes return.

Enable super light mode with the --super_light_node flag.

Dynamic channel creation fees

Currently, channel creation has a fixed fee. The idea is to allow the validator nodes to adjust it.

Things to have in mind:

  1. TXs containing channel creation payload are validated if they contain the right amount
  2. This means we need to keep track of the changed fees so we can apply them properly for a node that is syncing from the beginning to avoid failed validations.

Add client SDK

Add the Golang client SDK to interact with the JSON-RPC 2.0 interface.
It should be able to interact with all the RPC methods and namespaces.

Remove block verification on startup

When a full node starts, it loads every block from genesis up to the latest one found in the database and tries to verify each of them. This process is heavy, so we would like to disable it by default and use a flag to start the node with block verification enabled.
