stellar-core's Issues

Cleanup order of include files

For all files X.cpp, the first include should be X.h
This makes it simpler to avoid weird ordering issues of includes between cpp files.

Deployment prototype

(Moved from https://github.com/stellar/puppet/issues/131)

Need to set up a deployment scenario for Hayashi.

Would like to do this via an ASG so we can add/remove instances with minimal effort.

Will need some way to designate ownership of public DNS entries and Postgres databases in order to do this.

Hayashi Roadmap doc

I also had a chat with graydon re: deployment on Jan 16th that should help fuel things a bit.

What wasn't yet clear to me was how to configure trust relationships between the nodes, especially if the set is dynamic.

  • set up basic installation
  • Run installation in ASG
  • configure to talk to rds
  • set up dynamic "claiming" of rds & keys
  • determine bucket list strategy (do we reload history or swap an EBS volume?)
  • launch clusters as "staging" to give broader initial access
  • ami/snapshot cleanup helper

Quote SQL identifiers so we are explicit about casing

According to SQL-92, identifiers that are not quoted are case-insensitive. Postgresql supports this by lower-casing all unquoted identifiers transparently. Sqlite does not have this problem.

Unfortunately some SQL toolkits (for example, both ActiveRecord and Sequel) will always quote identifiers for the SQL they produce. To integrate with the hayashi DB, a developer will have to manually convert to the lower-cased form, which could be confusing (well, it certainly was for me).

IMO, we should either:

  1. Continue to use mixed casing in our SQL, but quote all identifiers such that casing is as expected
  2. Do not rely upon casing, i.e. lowercase everything and use underscores for word separation.
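Option 1 could be supported by a small quoting helper. The following is only a sketch: quoteIdentifier and selectAllFrom are hypothetical names, not functions in the codebase. It relies on the standard SQL rule that a delimited identifier is wrapped in double quotes and any embedded double quote is doubled.

```cpp
#include <string>

// Hypothetical helper (not part of the codebase): wraps an SQL identifier in
// double quotes so the database keeps its casing. Embedded double quotes are
// doubled, as SQL requires for delimited identifiers.
std::string
quoteIdentifier(std::string const& name)
{
    std::string out;
    out.reserve(name.size() + 2);
    out.push_back('"');
    for (char c : name)
    {
        if (c == '"')
        {
            out.push_back('"');
        }
        out.push_back(c);
    }
    out.push_back('"');
    return out;
}

// Illustrative use: builds a SELECT whose table name is always quoted.
std::string
selectAllFrom(std::string const& table)
{
    return "SELECT * FROM " + quoteIdentifier(table);
}
```

With this, selectAllFrom("TxHistory") produces SELECT * FROM "TxHistory", which matches what toolkits that always quote identifiers would emit.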

Sleep OverlayManager until next connection attempt

OverlayManagerImpl knows when to run its next connection attempt, but it does not currently use this information to limit its sleep/wake cycle. Instead it schedules a tick every 2 seconds (OverlayManagerImpl::tick()). There's no need for this; it might as well sleep until the next connection attempt is due.
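A rough sketch of the idea, using only std::chrono: instead of a fixed 2-second tick, the timer delay is derived from the time of the next planned attempt. delayUntilNextAttempt is a hypothetical helper, and how the real OverlayManagerImpl tracks its next attempt is an assumption here.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: given the current time and the time of the next
// planned connection attempt, return how long the overlay could sleep.
// A due or overdue attempt yields zero, never a negative duration.
std::chrono::milliseconds
delayUntilNextAttempt(Clock::time_point now, Clock::time_point nextAttempt)
{
    if (nextAttempt <= now)
    {
        return std::chrono::milliseconds(0);
    }
    return std::chrono::duration_cast<std::chrono::milliseconds>(nextAttempt -
                                                                 now);
}
```

The returned duration would be handed to whatever timer currently drives tick(), so the process only wakes when there is a connection attempt to make.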

Virtual clock doesn't quite sync often enough

When you run in real time mode, if there's nothing to do aside from pending future timers, no "real work" gets done and so crank(false) -- nonblocking -- returns immediately and declines to propagate real time through to virtual. This is explicit in the code but (a) I'm not sure why I put the && nWorkDone != 0 criterion there in the first place and (b) it's clearly not quite right even if its heart is in the right place, since it causes the app to stall.

errors in the config file need to be more explicit

When something goes wrong while processing the config file, just dumping out a std::runtime_error is not really good enough. The error should at minimum state which config item was being processed, and ideally give more context than that if any is available.
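One way this could look, as a minimal sketch: catch the low-level failure at the point where a single item is parsed and rethrow with the item name and the offending value attached. parseIntItem is a hypothetical helper; the actual config loader may be structured quite differently.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses an integer config value and, on failure,
// rethrows with the name of the config item and the offending text.
int
parseIntItem(std::string const& key, std::string const& value)
{
    try
    {
        std::size_t used = 0;
        int parsed = std::stoi(value, &used);
        if (used != value.size())
        {
            // std::stoi accepts "123abc"; treat leftovers as an error.
            throw std::invalid_argument("trailing characters");
        }
        return parsed;
    }
    catch (std::exception const& e)
    {
        throw std::runtime_error("bad value for config item '" + key +
                                 "': \"" + value + "\" (" + e.what() + ")");
    }
}
```

A call such as parseIntItem("PEER_PORT", "abc") then fails with a message that names PEER_PORT rather than a bare exception.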

Remove use of stringstream

Most of the uses of stringstream in the database code are superfluous and actually risk SQL injection. Use prepare against string constants with placeholders instead.
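To illustrate the risk without depending on a database library, the sketch below contrasts the two patterns. The column names and the BoundQuery struct are made up for the example; in the real code the binding would be done by the database layer's prepared statements.

```cpp
#include <string>

// Unsafe pattern: the value is spliced into the SQL text, so a crafted
// input changes the structure of the statement.
std::string
buildByConcatenation(std::string const& accountID)
{
    return "SELECT balance FROM Accounts WHERE accountID = '" + accountID +
           "'";
}

// Safer pattern: the SQL text is a constant with a placeholder; the value
// travels separately and would be bound by the database layer.
struct BoundQuery
{
    char const* sql;   // constant statement text, never modified by input
    std::string param; // value to bind to :id
};

BoundQuery
buildWithPlaceholder(std::string const& accountID)
{
    return BoundQuery{"SELECT balance FROM Accounts WHERE accountID = :id",
                      accountID};
}
```

Passing x' OR '1'='1 to buildByConcatenation yields a statement whose WHERE clause matches every row, while the placeholder version keeps the statement text fixed regardless of input.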

TxHistory table is not being populated

Given a hayashi node, running a new ledger, then introducing a valid payment from the root account to another:

curl http://localhost:39132/tx\?blob\=2e3c35010749c1de3d9a5bdd6a31c12458768da5ce87cca6aad63ebbaaef7432000003e80000000100000000000003e80000000000000000000000009d7d563f1648962f08cab1b00f086bf5726cbf5413138aaa9d956b285e4b9c3500000000000000000bebc20000000000000000000bebc2000000000000000000000000015a406e28841e7f8d47cfb768755d70fb6ae3ff656ce9d4769b50b6a2dde52bd0a0388498a92e20b885c630846174206abc63da4ef6b30c71a585c2c267f3bc0a

This results in a new record in Accounts table as expected, balances all appear to be correct, but TxHistory itself is still empty.

Metrics collection

Once hayashi is instrumented we'll need to collect metrics into a backend system.

Options for this in order of mat's preference:

  • https://github.com/Netflix/atlas
    • pros: multi-dimensional, can add new metrics w/o config, capable of extremely high scale
    • cons: very early option, OSS offering still lacks UI, alerting, deployment specifications
  • http://graphite.wikidot.com/
    • pros: can add new metrics w/o config, well-supported/understood in OSS community
    • cons: single dimensional, no alerting baked in, shard-based scaling
  • http://www.zabbix.com/
    • pros: alerting bundled, already deployed in SDF stack
    • cons: single dimensional, needs extra manual configuration, ACL & UI can be limiting

Switch from sha512/256 to sha256

It seems likely that sha256 might be a better choice than sha512/256. The latter is about 1.5x faster in 64bit software, but involves different constants from the main sha512 algorithm, is not as widely supported in other programming language standard libraries, and is not going to be supported by the new skylake-generation hardware instructions for sha256.

https://en.wikipedia.org/wiki/Intel_SHA_extensions
https://en.wikipedia.org/wiki/SHA-2#Comparison_of_SHA_functions

--stress mode for tests

Currently we try to keep most of the tests quick enough that they can run casually while working / doing CI. We should also have a stress-test mode that tries to see where performance problems show up as we scale up transaction rate and database size. It should also write out performance metrics.

postgresql backend should be explicitly enabled or disabled

Currently configure enables or disables the postgresql backend based on sniffing for libpq. This is too subtle and confuses users both ways: when they want it sometimes they don't get it, when they don't want it sometimes they get it and then tests fail when it tries to connect to a local postgres. Make it explicit.

Remove use of hash-based 'index' identity for ledger entries

Ledger entries are currently identified in many places by an intrinsic tuple-based identity, for example an offer is identified as its (owner, sequence) pair, or a trustline by its (owner, issuer, currency) triple. In other places, the code identifies ledger entries by hashing these tuples into Yet Another Hash Value called the entry's index. This is potentially confusing and unnecessary -- especially since in the case of accounts it coincides with the account key -- and in any case there's a class that handles comparing by intrinsic identity directly (LedgerKey). Remove uses of hash-based indexes.
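As a sketch of what comparing by intrinsic identity means, the hypothetical OfferKey below orders and compares offers field by field on their (owner, sequence) pair, with no derived hash involved. It is not the real LedgerKey, whose layout is not shown here.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key type: identifies an offer by its (owner, sequence) pair,
// compared field by field rather than through a derived hash.
struct OfferKey
{
    std::string owner;
    std::uint32_t sequence;
};

bool
operator<(OfferKey const& a, OfferKey const& b)
{
    return std::tie(a.owner, a.sequence) < std::tie(b.owner, b.sequence);
}

bool
operator==(OfferKey const& a, OfferKey const& b)
{
    return std::tie(a.owner, a.sequence) == std::tie(b.owner, b.sequence);
}

// Illustrative store keyed on intrinsic identity; the mapped value stands in
// for whatever data the entry carries.
using OfferStore = std::map<OfferKey, int>;
```

Two keys with the same owner and sequence refer to the same slot in OfferStore, so no separate index value has to be computed or kept in sync.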

Refactor Peer database access

Right now the peer database is updated/accessed via scattered SQL statements in the source code.
We should consolidate in one place (let's say PeerMaster that happens to define the schema).

Refactor database access in database.cpp

SQL calls should be centralized per class: right now writes are properly factored per entry type (i.e. "OfferFrame") but the reads are in database.cpp.
This makes schema management more complicated than it should be.

Add tombstones to CLF

CLF currently only stores live ledger objects (added or modified); it does not support tombstones. It needs to.

QuorumSet Protocol Failure

Here are two different runs, one failed and one successful, filtered on the async call to retrieveQuorumSet from FBA in Herder.cpp and the response from overlay, recvFBAQuorumSet in Herder, for qSet 4b5e56.

In the successful run, node 918ecd does receive qSet 4b5e56 eventually as expected.

spolu@spolu-ThinkPad-T430s:~/src/stellar/hayashi$ cat worked | grep "Herder.*Quorum.*4b5e56"  
29/01/15 13:09:08 [Herder] DEBUG Herder::recvFBAQuorumSet@135ab0 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::recvFBAQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::recvFBAQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::recvFBAQuorumSet@918ecd qSet: 4b5e56

In the failed run, node 918ecd never receives qSet 4b5e56: the simulation stops once no timer is active, before any recvFBAQuorumSet is logged for 918ecd.

spolu@spolu-ThinkPad-T430s:~/src/stellar/hayashi$ cat failed | grep "Herder.*Quorum.*4b5e56"
29/01/15 13:09:12 [Herder] DEBUG Herder::recvFBAQuorumSet@135ab0 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::recvFBAQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::recvFBAQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56

Any idea @jedmccaleb ?

Ensure all subsystems encapsulated behind pure-virtual interfaces

Edited to reflect conversation below:

  • Rename FooGateway to FooManager, forward-declare as many parameters/returns as possible.
  • Rename FooMaster to FooManagerImpl that implements that interface
  • Ensure every subsystem (top-level dir in src/) that has a central class owned by Application is hidden this way.
  • Make sure that any artifacts of previous pImpl strategy are removed (in particular: two-phase construction/destruction).
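The shape described by the list above might look roughly like the following sketch. "Foo" is the placeholder name from this issue, and describe() and create() are hypothetical members added only to make the example complete.

```cpp
#include <memory>
#include <string>

// Interface seen by the rest of the application. Only this class would
// need to be visible to other subsystems.
class FooManager
{
  public:
    virtual ~FooManager() = default;
    virtual std::string describe() const = 0;

    // Factory so callers never need to see FooManagerImpl.
    static std::unique_ptr<FooManager> create(std::string const& name);
};

// Concrete implementation; in a real tree this would live in its own
// header/cpp pair so that only the interface header is widely included.
class FooManagerImpl : public FooManager
{
    std::string mName;

  public:
    explicit FooManagerImpl(std::string const& name) : mName(name)
    {
    }

    std::string
    describe() const override
    {
        return "FooManagerImpl(" + mName + ")";
    }
};

std::unique_ptr<FooManager>
FooManager::create(std::string const& name)
{
    // Single-phase construction: the object is fully usable once returned.
    return std::unique_ptr<FooManager>(new FooManagerImpl(name));
}
```

Callers hold a std::unique_ptr<FooManager> obtained from FooManager::create and never name the implementation class, which keeps implementation headers out of most translation units.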

-- used to say --

This is cleanup but will improve compile times and cut out a lot of extra concept-names. For each FooGateway/FooMaster pair:

  • Remove FooGateway class
  • Add forward-declared class FooMaster::Impl and member std::unique_ptr<Impl> mImpl;
  • Explicitly declare ~FooMaster() and define it in the FooMaster.cpp file
  • Move FooMaster::* members to class FooMaster::Impl { ... } in the FooMaster.cpp file
  • Forward FooMaster methods to Impl methods as necessary, or just prefix variable accesses with mImpl-> and make FooMaster a friend of FooMaster::Impl
  • (Optional) rename FooMaster as FooManager. Slightly more dry/boring business term.

Crash: submit a blank request to /tx

Given a running stellard instance, running curl http://127.0.0.1:39132/tx will crash it.

➜  hayashi git:(master) ✗ ./bin/stellard
09/02/15 18:29:49 [default] INFO  Starting stellard-hayashi 25b8c8e
09/02/15 18:29:49 [default] INFO  Config from stellard.cfg
09/02/15 18:29:49 [default] INFO  Application constructing (worker threads: 8)
09/02/15 18:29:49 [default] INFO  Application constructed
09/02/15 18:29:49 [default] DEBUG TmpDir created tmp
09/02/15 18:29:49 [Overlay] DEBUG PeerDoor binding to endpoint 0.0.0.0:39133
09/02/15 18:29:49 [Overlay] DEBUG PeerDoor acceptNextPeer()
09/02/15 18:29:49 [FBA] DEBUG Node::cacheQuorumSet@41a4bc qSet: eca466
09/02/15 18:29:49 [FBA] INFO  LocalNode::LocalNode@41a4bc qSet: eca466
09/02/15 18:29:49 [Herder] DEBUG Herder::recvFBAQuorumSet@41a4bc qSet: eca466
09/02/15 18:29:49 [default] INFO  Listening on 127.0.0.1:39132 for HTTP requests
09/02/15 18:29:49 [default] INFO  Connecting to: sqlite3://stellar.db
09/02/15 18:29:49 [default] INFO  Loading last known ledger
09/02/15 18:29:49 [default] DEBUG PeerMaster tick
09/02/15 18:29:51 [default] DEBUG PeerMaster tick
09/02/15 18:29:53 [default] DEBUG PeerMaster tick
09/02/15 18:29:55 [default] DEBUG PeerMaster tick
09/02/15 18:29:57 [default] DEBUG PeerMaster tick
09/02/15 18:29:59 [default] DEBUG PeerMaster tick
09/02/15 18:30:01 [default] DEBUG PeerMaster tick
09/02/15 18:30:03 [default] DEBUG PeerMaster tick
09/02/15 18:30:05 [default] DEBUG PeerMaster tick
09/02/15 18:30:07 [default] DEBUG PeerMaster tick
09/02/15 18:30:09 [default] DEBUG PeerMaster tick
09/02/15 18:30:11 [default] DEBUG PeerMaster tick
libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: basic_string
[1]    73413 abort      ./bin/stellard
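The std::out_of_range from basic_string in the log is consistent with a string being indexed without a bounds check, although the exact call site is not shown. As a defensive sketch, assuming the handler looks up a query parameter on the request target, a lookup like the hypothetical getQueryParam below reports a missing parameter through its return value instead of throwing.

```cpp
#include <string>

// Hypothetical helper: extracts the value of `name` from the query part of
// a request target such as "/tx?blob=abcd". Returns false (leaving `out`
// untouched) when there is no query string or the parameter is absent,
// instead of indexing past the end of the string.
bool
getQueryParam(std::string const& target, std::string const& name,
              std::string& out)
{
    std::string::size_type q = target.find('?');
    if (q == std::string::npos)
    {
        return false;
    }
    std::string const prefix = name + "=";
    std::string::size_type pos = q + 1;
    while (pos <= target.size())
    {
        // Each "key=value" segment ends at the next '&' or at end of string.
        std::string::size_type end = target.find('&', pos);
        if (end == std::string::npos)
        {
            end = target.size();
        }
        if (target.compare(pos, prefix.size(), prefix) == 0)
        {
            out = target.substr(pos + prefix.size(),
                                end - pos - prefix.size());
            return true;
        }
        pos = end + 1;
    }
    return false;
}
```

With a check like this, a bare request to /tx could be answered with an error response rather than letting the exception escape and abort the process.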

sanitize SQL schema

Examples of existing issues:

  • primary key missing/wrong
  • no index matching selects/update statements
  • invalid column types (sqlite does not do any type checking)
  • missing constraints (NOT NULL)

move to only use SQL bindings when communicating with SQL

We should properly use "use(foo)" and "into(bar)" constructs instead of constructing SQL strings.

Even in the context of complex queries this can be achieved. See how LoadOffers works for an example.

Known offender (there might be more):
TrustLine

Warnings emitted every time a transaction is posted: "there is already a transaction in progress, there is no transaction in progress"

Upon transaction submission, I'm seeing the following warnings emitted to stdout:

...

19/02/15 09:08:53 [FBA] INFO  Slot::attemptCommit@7eb7f3 i: 5 b: (0,87af74)
19/02/15 09:08:53 [FBA] INFO  Slot::processEnvelope@7eb7f3 i: 5 {ENV@7eb7f3|COMMIT|(0,87af74)|4f852d}
19/02/15 09:08:53 [FBA] INFO  Slot::attemptCommitted@7eb7f3 i: 5 b: (0,87af74)
19/02/15 09:08:53 [FBA] INFO  Slot::processEnvelope@7eb7f3 i: 5 {ENV@7eb7f3|COMMITTED|(0,87af74)|4f852d}
19/02/15 09:08:53 [FBA] INFO  Slot::attemptExternalize@7eb7f3 i: 5 b: (0,87af74)
19/02/15 09:08:53 [Herder] INFO  Herder::valueExternalized@7eb7f3 txSet: e2401e
WARNING:  there is already a transaction in progress
WARNING:  there is no transaction in progress

...

The transactions do apply correctly, as expected.

revision: 6e1ca86718dfa25ab8b41eba03b906689694e315

config:


PEER_PORT= 39133
RUN_STANDALONE=false
LOG_FILE_PATH="hayashi.log"

HTTP_PORT=39132
PUBLIC_HTTP_PORT=false

# what generates the peerID (used for peer connections) used by this node
PEER_SEED="s3BCUXncNvghHzKafx4gwYGaEG5rEeMUDdJPDsdjve3ojoFd5tK"
# what generates the nodeID (used in FBA) 
VALIDATION_SEED="s3BCUXncNvghHzKafx4gwYGaEG5rEeMUDdJPDsdjve3ojoFd5tK"

QUORUM_THRESHOLD=1
QUORUM_SET=["gxoicA8D962NezYaa4AmrhXKGHYbrELu8rhyKE2vt8osLHL3T5"]

DATABASE="postgresql://dbname=hayashi_development"

Crash: submitting a (presumably valid) transaction triggers crash in hexToBin

Given a running hayashi, submitting the following command:

curl http://127.0.0.1:39132/tx\?2e3c35010749c1de3d9a5bdd6a31c12458768da5ce87cca6aad63ebbaaef7432000003e80000000100000000000003e80000000000000000000000009d7d563f1648962f08cab1b00f086bf5726cbf5413138aaa9d956b285e4b9c350000000000000000000007d0000000000000000000000064000000000000000000000001af13cb78b2acc5885b47cbb8d5b6d65dcd6c2d7e11a4ef622598eab42aeec1f9f7c5ee01effb80eac4f50560eef376d47b14d61a13302c6052542e4d825b1502

Triggers a crash:

➜  hayashi git:(master) ✗ ./bin/stellard
09/02/15 18:32:24 [default] INFO  Starting stellard-hayashi 25b8c8e
09/02/15 18:32:24 [default] INFO  Config from stellard.cfg
09/02/15 18:32:24 [default] INFO  Application constructing (worker threads: 8)
09/02/15 18:32:24 [default] INFO  Application constructed
09/02/15 18:32:24 [default] DEBUG TmpDirMaster cleaning: tmp
09/02/15 18:32:24 [default] DEBUG TmpDir deleting: tmp
09/02/15 18:32:24 [default] DEBUG TmpDir created tmp
09/02/15 18:32:24 [Overlay] DEBUG PeerDoor binding to endpoint 0.0.0.0:39133
09/02/15 18:32:24 [Overlay] DEBUG PeerDoor acceptNextPeer()
09/02/15 18:32:24 [FBA] DEBUG Node::cacheQuorumSet@41a4bc qSet: eca466
09/02/15 18:32:24 [FBA] INFO  LocalNode::LocalNode@41a4bc qSet: eca466
09/02/15 18:32:24 [Herder] DEBUG Herder::recvFBAQuorumSet@41a4bc qSet: eca466
09/02/15 18:32:24 [default] INFO  Listening on 127.0.0.1:39132 for HTTP requests
09/02/15 18:32:24 [default] INFO  Connecting to: sqlite3://stellar.db
09/02/15 18:32:24 [default] INFO  Loading last known ledger
09/02/15 18:32:24 [default] DEBUG PeerMaster tick

....

09/02/15 18:33:14 [default] DEBUG PeerMaster tick
09/02/15 18:33:16 [default] DEBUG PeerMaster tick
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: error in stellar::hexToBin(std::string)
[1]    73458 abort      ./bin/stellard
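Whatever input ends up reaching the decoder, an uncaught exception from it takes the whole process down. One possible mitigation, sketched below, is a decoder that reports malformed input through its return value so the HTTP handler can reply with an error. tryHexToBin is a hypothetical function, not the existing stellar::hexToBin.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Returns the value of one hex digit, or -1 if `c` is not a hex digit.
static int
hexDigit(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

// Hypothetical non-throwing decoder: returns false on odd length or on any
// non-hex character, and only overwrites `out` when decoding succeeds.
bool
tryHexToBin(std::string const& hex, std::vector<std::uint8_t>& out)
{
    if (hex.size() % 2 != 0)
    {
        return false;
    }
    std::vector<std::uint8_t> bytes;
    bytes.reserve(hex.size() / 2);
    for (std::string::size_type i = 0; i < hex.size(); i += 2)
    {
        int hi = hexDigit(hex[i]);
        int lo = hexDigit(hex[i + 1]);
        if (hi < 0 || lo < 0)
        {
            return false;
        }
        bytes.push_back(static_cast<std::uint8_t>((hi << 4) | lo));
    }
    out.swap(bytes);
    return true;
}
```

An alternative with the same effect would be to keep the throwing decoder and catch std::exception at the top of the request handler.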
