cksac / dataloader-rs
Rust implementation of Facebook's DataLoader using async-await.
License: Apache License 2.0
This issue is to request a new release (0.7) with the dependency updated to tokio 0.2 and the matching futures version (or std futures).
Not exactly sure what is wrong here, probably something with futures having changed.
/U/d/De/dataloader-rs [master@0248d52e50a2d]
$ rustup override set nightly
info: using existing install for 'nightly-x86_64-apple-darwin'
info: override toolchain for '/Users/davidpdrsn/Desktop/dataloader-rs' set to 'nightly-x86_64-apple-darwin'
nightly-x86_64-apple-darwin unchanged - rustc 1.40.0-nightly (e413dc36a 2019-10-14)
/U/d/De/dataloader-rs [master@0248d52e50a2d]
$ cargo build
Compiling futures-timer v0.2.1
error[E0277]: the trait bound `ext::TimeoutStream<S>: futures_core::stream::Stream` is not satisfied
--> /Users/davidpdrsn/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-timer-0.2.1/src/ext.rs:170:9
|
170 | impl<S> TryStream for TimeoutStream<S>
| ^^^^^^^^^ the trait `futures_core::stream::Stream` is not implemented for `ext::TimeoutStream<S>`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0277`.
error: could not compile `futures-timer`.
To learn more, run the command again with --verbose.
Hello,
I'm wondering why the cached loader has to handle errors; I don't see any particular treatment of the results in this crate. Wouldn't it be simpler if the loader could handle whatever value users want it to handle?
Loader could be declared as follows:
pub struct Loader<K, V, F, C = HashMap<K, V>>
where
K: Eq + Hash + Clone,
V: Clone,
F: BatchFn<K, V>,
C: Cache<Key = K, Val = V>,
This would still allow anyone to put a (cloneable) Result as the output value. In particular, I'd like to use the loader to return a Vec<Result<_, _>> for each key given to the BatchFn. What do you think about this design change?
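A minimal, std-only sketch of the proposed design (the BatchFn trait below is a stand-in, not the crate's actual definition): the loader side is generic over any cloneable V, so callers can pick V = Result<…, …> themselves and keep error handling entirely in user code.

```rust
use std::collections::HashMap;

// Stand-in for the crate's BatchFn trait: V is any value the user chooses,
// including a Result, so the loader itself stays error-agnostic.
trait BatchFn<K, V> {
    fn load(&self, keys: &[K]) -> HashMap<K, V>;
}

struct UserBatcher;

// Here V = Result<String, String>: the batch function decides per key
// whether to report a value or an error.
impl BatchFn<i32, Result<String, String>> for UserBatcher {
    fn load(&self, keys: &[i32]) -> HashMap<i32, Result<String, String>> {
        keys.iter()
            .map(|&k| {
                let v = if k > 0 {
                    Ok(format!("user-{}", k))
                } else {
                    Err("invalid id".into())
                };
                (k, v)
            })
            .collect()
    }
}

fn main() {
    let batcher = UserBatcher;
    let results = batcher.load(&[1, -1]);
    assert_eq!(results[&1], Ok("user-1".to_string()));
    assert!(results[&-1].is_err());
    println!("ok");
}
```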
Hi there! Thank you for this crate!
I'd like to use it extensively, but is it still maintained? The latest change was made on Aug 4, 2018, and master depends on tokio-core, which was deprecated long ago.
How about upgrading to the latest tokio and refactoring with the std Futures coming in the upcoming Rust 1.36 release? Would you mind if I elaborated on it?
I have returned to my abandoned GraphQL project :) I'm trying to solve the N+1 problem using dataloader-rs along with juniper and actix-web. I have two simple entities: Article and Author. One article has many authors. Below I will show the most important parts of my program.
Actix handler:
use crate::graphql::context::Context;
use crate::graphql::schema::{create_schema, Schema};
use actix_web::{web, Error, HttpResponse};
use juniper::http::GraphQLRequest;
use std::sync::Arc;
async fn graphql(
ctx: web::Data<Context>,
schema: web::Data<Arc<Schema>>,
req: web::Json<GraphQLRequest>,
) -> Result<HttpResponse, Error> {
let response = req.execute(&schema, &ctx).await;
let json = serde_json::to_string(&response)?;
Ok(HttpResponse::Ok()
.content_type("application/json")
.body(json))
}
Authors dataloader:
use crate::environment::db::Database;
use crate::models::author::Author;
use dataloader::BatchFn;
use std::collections::HashMap;
use tokio_postgres::Row;
pub struct AuthorsLoader {
pub db: Database,
}
#[async_trait::async_trait]
impl BatchFn<i32, Author> for AuthorsLoader {
async fn load(&self, keys: &[i32]) -> HashMap<i32, Author> {
let client = self.db.get_client().await.unwrap();
let map: HashMap<i32, Author> = client
.query(
"SELECT * FROM articles.authors WHERE id = ANY($1)",
&[&keys.to_vec()],
)
.await
.unwrap()
.into_iter()
.map(|r: Row| (r.get("id"), Author::from(r)))
.collect();
map
}
}
Article type resolver:
use crate::{graphql::context::Context, models::{author::Author, article::Article}};
use itertools::Itertools;
#[juniper::graphql_object(Context = Context)]
impl Article {
fn id(&self) -> &i32 {
&self.id
}
fn title(&self) -> &str {
&self.title
}
async fn authors(&self, ctx: &Context) -> Vec<Author> {
let a = ctx.authors_loader
.load_many(self.author_ids.clone())
.await
.values()
.cloned()
.collect_vec();
a
}
}
Deps:
[dependencies]
log = "0.4"
chrono = "0.4"
actix-web = "2.0"
actix-rt = "1.0"
env_logger = "0.7"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
juniper = { git = "https://github.com/graphql-rust/juniper" }
dotenv = "0.15"
tokio = "0.2"
tokio-postgres = "0.5"
dataloader = "0.12"
futures = "0.3"
async-trait = "0.1"
itertools = "0.9"
The panic occurs when the server tries to load the authors through the load_many method:
thread 'actix-rt:worker:0' panicked at 'found key 0 in load result', <::std::macros::panic macros>:5:6
What am I doing wrong?
0.5.1 from crates.io or 0.6.0-dev? When will the new release land on crates.io?
I want to load only the requested fields from the database. For example, take this GraphQL type:
type Person {
id
name
birthday
}
With the following request:
person {
id
name
}
Rather than doing SELECT * FROM people, I want to only get the id and name fields; not birthday.
It seems that in the official Node-based module they recommend using separate cache and load keys. This means BatchFn might use (PersonID, Vec<String>) as its ID, but Loader would use PersonID. Multiple load keys map to one cache key, so it'd be up to the BatchFn implementation to dedup the person/fields combinations.
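A std-only sketch of that dedup step (all names here are hypothetical, not part of this crate): load keys of the form (PersonID, Vec<String>) are merged per person, so one cache entry per PersonID can back several field-specific requests.

```rust
use std::collections::{BTreeSet, HashMap};

type PersonId = i32;

// Merge (PersonID, fields) load keys into one field set per person,
// so the batch function can issue a single query per cache key.
fn dedup_load_keys(keys: &[(PersonId, Vec<String>)]) -> HashMap<PersonId, BTreeSet<String>> {
    let mut merged: HashMap<PersonId, BTreeSet<String>> = HashMap::new();
    for (id, fields) in keys {
        merged.entry(*id).or_default().extend(fields.iter().cloned());
    }
    merged
}

fn main() {
    let keys = vec![
        (1, vec!["id".to_string(), "name".to_string()]),
        (1, vec!["name".to_string(), "birthday".to_string()]),
        (2, vec!["id".to_string()]),
    ];
    let merged = dedup_load_keys(&keys);
    assert_eq!(merged.len(), 2); // two cache keys for three load keys
    assert_eq!(merged[&1].len(), 3); // id, name, birthday merged for person 1
    println!("ok");
}
```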
I am using async_graphql and dataloader and it works fine. But when I want to limit the output using the last argument, I do not know how to do this with the dataloader.
authors {
name
books(last: 30)
}
Getting books is called like this in the graphql object (without the dataloader):
pub async fn books(
&self,
ctx: &Context<'_>,
last: i32,
) -> Result<Vec<Books>, AppError> {
ctx.data_unchecked::<BooksRepository>()
.get_for_id(self.id, last)
.await
}
When using dataloader I do not know how to make it work.
pub async fn books(
&self,
ctx: &Context<'_>,
last: i32,
) -> Result<Vec<Books>, AppError> {
let loader = ctx.data_unchecked::<BooksLoader>();
loader.load(self.id).await
}
Maybe I just overlooked something or am simply not having my day, but I do not know how to solve it. Thanks for any suggestions.
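One common workaround (a sketch under assumptions, not this crate's API: BooksKey and load_books are invented names) is to make the argument part of the loader key, so batching and caching group by the (author_id, last) pair instead of the id alone.

```rust
use std::collections::HashMap;

// Hypothetical composite key: the `last` argument travels with the id,
// so every distinct (id, last) pair is its own batch/cache entry.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct BooksKey {
    author_id: i32,
    last: i32,
}

// Stand-in batch function; the body is a placeholder for something like
// `repository.get_for_id(key.author_id, key.last)`.
fn load_books(keys: &[BooksKey]) -> HashMap<BooksKey, Vec<String>> {
    keys.iter()
        .map(|&k| {
            let books: Vec<String> = (0..k.last)
                .map(|i| format!("book-{}-{}", k.author_id, i))
                .collect();
            (k, books)
        })
        .collect()
}

fn main() {
    let key = BooksKey { author_id: 7, last: 3 };
    let result = load_books(&[key]);
    assert_eq!(result[&key].len(), 3); // at most `last` books per author
    println!("ok");
}
```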
I've noticed that sometimes batching works and sometimes it doesn't... I'm using a very simple schema at the moment; sending the same request multiple times can give different results in how the loader is executed:
query foo {
orders {
id
user { id }
}
}
I'm using a loader on the orders.user field:
#[juniper::graphql_object(Context = Context)]
impl Order {
fn id(&self) -> juniper::ID {
self.id.to_string().into()
}
async fn user(&self, ctx: &Context) -> types::User {
ctx.loaders.user.load(self.user.clone()).await.unwrap()
}
}
and the UserLoader is basically a copy-paste of the example in the juniper docs.
The db contains 2 orders, both having the same user field. Here are some logs of the same request sent multiple times:
DEBUG hyper::proto::h1::io > read 643 bytes
DEBUG hyper::proto::h1::io > parsed 14 headers
DEBUG hyper::proto::h1::conn > incoming body is content-length (109 bytes)
DEBUG hyper::proto::h1::conn > incoming body completed
DEBUG syos::graphql::utils::loaders::user > load batch [ObjectId(593fdd2ba9c0edf74ff0b38c), ObjectId(593fdd2ba9c0edf74ff0b38c)]
INFO GraphQL > 127.0.0.1:45080 "POST /graphql HTTP/1.1" 200 "http://127.0.0.1:4444/graphiql" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" 9.229761ms
DEBUG hyper::proto::h1::io > flushed 372 bytes
DEBUG hyper::proto::h1::io > read 643 bytes
DEBUG hyper::proto::h1::io > parsed 14 headers
DEBUG hyper::proto::h1::conn > incoming body is content-length (109 bytes)
DEBUG hyper::proto::h1::conn > incoming body completed
DEBUG syos::graphql::utils::loaders::user > load batch [ObjectId(593fdd2ba9c0edf74ff0b38c)]
DEBUG syos::graphql::utils::loaders::user > load batch [ObjectId(593fdd2ba9c0edf74ff0b38c)]
INFO GraphQL > 127.0.0.1:45080 "POST /graphql HTTP/1.1" 200 "http://127.0.0.1:4444/graphiql" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" 12.157163ms
DEBUG hyper::proto::h1::io > flushed 372 bytes
DEBUG hyper::proto::h1::io > read 643 bytes
DEBUG hyper::proto::h1::io > parsed 14 headers
DEBUG hyper::proto::h1::conn > incoming body is content-length (109 bytes)
DEBUG hyper::proto::h1::conn > incoming body completed
DEBUG syos::graphql::utils::loaders::user > load batch [ObjectId(593fdd2ba9c0edf74ff0b38c), ObjectId(593fdd2ba9c0edf74ff0b38c)]
INFO GraphQL > 127.0.0.1:45080 "POST /graphql HTTP/1.1" 200 "http://127.0.0.1:4444/graphiql" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" 12.120952ms
DEBUG hyper::proto::h1::io > flushed 372 bytes
DEBUG hyper::proto::h1::io > read 643 bytes
DEBUG hyper::proto::h1::io > parsed 14 headers
DEBUG hyper::proto::h1::conn > incoming body is content-length (109 bytes)
DEBUG hyper::proto::h1::conn > incoming body completed
DEBUG syos::graphql::utils::loaders::user > load batch [ObjectId(593fdd2ba9c0edf74ff0b38c), ObjectId(593fdd2ba9c0edf74ff0b38c)]
INFO GraphQL > 127.0.0.1:45080 "POST /graphql HTTP/1.1" 200 "http://127.0.0.1:4444/graphiql" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" 10.009887ms
DEBUG hyper::proto::h1::io > flushed 372 bytes
We can see that sometimes load batch is called with 2 ids, and sometimes called twice with the same id.
More info: I tried both the threaded_scheduler and the basic_scheduler, with the same result. I can provide more code if necessary, or maybe even a simple repo to reproduce the issue.
This test fails by panicking when I add requesters like below:
let load_fn = LoadFnForEmptyTest;
let loader = Loader::new(load_fn.clone()).with_max_batch_size(4);
let l1 = loader.clone();
let h1 = thread::spawn(move || {
let r1 = l1.try_load(1337);
let r2 = l1.try_load(1338);
let (f1, f2) = block_on(futures::future::join(r1, r2));
assert!(f1.is_err());
assert!(f2.is_err());
});
let _ = h1.join().unwrap();
This is because this line expects the value to be present in state.complete, while early-returning here may cause values to be discarded (this is why the test passes when there is only one requester). A possible fix would be to move error handling after batch completion, along with a failed state which would hold <requestId, key> (without introducing this state, we would have no way to attach the key to the error message, as we do now).
That's pretty much what the reference implementation does: https://github.com/graphql/dataloader/tree/90353d8d34063f92c7c6300d66d0e9ce0a8d51c4#batch-function.
Indexing into a HashMap is notoriously slower than indexing into a Vec. On top of that, Rust, by default, uses a computationally expensive hashing algorithm.
But yeah, that'd be a breaking change for end users.
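To illustrate the hashing-cost point: std's HashMap defaults to SipHash (DoS-resistant but comparatively slow), and a cheaper hasher can be swapped in via BuildHasherDefault. The FNV-1a hasher below is a self-contained sketch; crates like fnv or fxhash provide production-ready versions.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Minimal FNV-1a hasher: much cheaper than the default SipHash,
// but with no protection against hash-flooding attacks.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a offset basis
    }
}

impl Hasher for Fnv1a {
    fn finish(&self) -> u64 {
        self.0
    }
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV prime
        }
    }
}

fn main() {
    // Same HashMap API, different (faster) hashing algorithm.
    let mut map: HashMap<i32, &str, BuildHasherDefault<Fnv1a>> = HashMap::default();
    map.insert(1, "one");
    assert_eq!(map.get(&1), Some(&"one"));
    println!("ok");
}
```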
Before Rust 1.39, I successfully used this library. But with the advent of async/await, I was completely confused. Below is an example of my broken code:
#[juniper::object(Context = Context)]
impl Article {
fn id(&self) -> i32 {
self.id
}
fn title(&self) -> &str {
self.title.as_str()
}
async fn authors (&self, context: &Context) -> FieldResult<Vec<Author>> {
let authors = context.authors_loader().load_many(self.author_ids.clone());
Ok(authors.await)
}
}
I'm trying to load the authors of the article with the dataloader, but I got confused about the return types and how to get the authors from the dataloader. The compiler gives me an error:
error[E0728]: `await` is only allowed inside `async` functions and blocks
--> src/schema.rs:52:12
|
40 | #[juniper::object(Context = Context)]
| ------------------------------------- this is not `async`
...
52 | Ok(authors.await)
| ^^^^^^^^^^^^^ only allowed inside `async` functions and blocks
error[E0308]: mismatched types
--> src/schema.rs:52:12
|
52 | Ok(authors.await)
| ^^^^^^^^^^^^^ expected struct `std::vec::Vec`, found enum `std::result::Result`
|
= note: expected type `std::vec::Vec<_>`
found type `std::result::Result<std::vec::Vec<_>, dataloader::LoadError<()>>`
error: aborting due to 2 previous errors
cargo check --examples
error[E0433]: failed to resolve: could not find `document_join_macro` in `futures_util`
--> /Users/takahashiatsuki/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.3.1/src/lib.rs:504:15
|
504 | futures_util::document_join_macro! {
| ^^^^^^^^^^^^^^^^^^^ could not find `document_join_macro` in `futures_util`
error[E0433]: failed to resolve: could not find `document_select_macro` in `futures_util`
--> /Users/takahashiatsuki/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.3.1/src/lib.rs:528:15
|
528 | futures_util::document_select_macro! {
| ^^^^^^^^^^^^^^^^^^^^^ could not find `document_select_macro` in `futures_util`
error: aborting due to 2 previous errors
It seems to be related to graphql-rust/juniper#659. So I changed the juniper branch from async-await to master; the above error disappeared, but another bunch of compile errors arose. I couldn't solve these errors because I'm not familiar with juniper.
I hope that someone will fix this. Thanks.
Would it be possible to include an LRU-based cache implementation in order to better manage memory consumption? Something like https://github.com/maidsafe/lru_time_cache would be very useful.
On a side note, is there any particular reason the yield_count property is hard-coded to 10? Thanks!
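For reference, a minimal std-only LRU cache sketch (the LruCache name and methods below are hypothetical; a real integration would implement this crate's cache interface instead): recently used keys move to the back of the order queue, and the front entry is evicted once capacity is exceeded.

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

// Tiny LRU cache: front of `order` = least recently used.
// O(n) touch keeps the sketch short; real crates use a linked list.
struct LruCache<K: Hash + Eq + Clone, V> {
    capacity: usize,
    map: HashMap<K, V>,
    order: VecDeque<K>,
}

impl<K: Hash + Eq + Clone, V> LruCache<K, V> {
    fn new(capacity: usize) -> Self {
        LruCache { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    fn get(&mut self, key: &K) -> Option<&V> {
        if self.map.contains_key(key) {
            self.touch(key);
        }
        self.map.get(key)
    }

    fn insert(&mut self, key: K, value: V) {
        // Evict the least recently used entry when a *new* key overflows capacity.
        if self.map.insert(key.clone(), value).is_none() && self.map.len() > self.capacity {
            if let Some(lru) = self.order.pop_front() {
                self.map.remove(&lru);
            }
        }
        self.touch(&key);
    }

    fn touch(&mut self, key: &K) {
        self.order.retain(|k| k != key);
        self.order.push_back(key.clone());
    }
}

fn main() {
    let mut cache = LruCache::new(2);
    cache.insert(1, "a");
    cache.insert(2, "b");
    cache.get(&1); // 1 is now most recently used
    cache.insert(3, "c"); // evicts 2, the least recently used
    assert!(cache.get(&2).is_none());
    assert!(cache.get(&1).is_some());
    assert!(cache.get(&3).is_some());
    println!("ok");
}
```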
E.g. when a search result returns not just the keys but the data itself, it'd be good to populate the cache from the response. That way, any new request for already-known values can be omitted.
As I see it, prime can be used to add values to the cache, but it performs a lock on each addition. A prime_many could accept an iterator of (K, V) pairs and hold the lock for the entire update.
Or is there any other option to fill the loader with known values?
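A sketch of the proposed prime_many (everything here is hypothetical stand-in code, not the crate's actual internals): with the cache behind a Mutex, the bulk variant locks once for the whole batch instead of once per pair.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-in for the loader's shared cache.
struct PrimedCache {
    inner: Mutex<HashMap<i32, String>>,
}

impl PrimedCache {
    // Existing style: one lock acquisition per (key, value) pair.
    fn prime(&self, key: i32, val: String) {
        self.inner.lock().unwrap().insert(key, val);
    }

    // Proposed bulk variant: a single lock held for the entire update.
    fn prime_many<I: IntoIterator<Item = (i32, String)>>(&self, pairs: I) {
        let mut cache = self.inner.lock().unwrap();
        for (key, val) in pairs {
            cache.insert(key, val);
        }
    }
}

fn main() {
    let cache = PrimedCache { inner: Mutex::new(HashMap::new()) };
    cache.prime(1, "one".into());
    cache.prime_many(vec![(2, "two".into()), (3, "three".into())]);
    assert_eq!(cache.inner.lock().unwrap().len(), 3);
    println!("ok");
}
```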
Shouldn't .load() return Result<HashMap<_, _>> rather than just the hashmap itself? This forces me to panic on a failed db request...
Would you mind publishing the latest version with #33 to crates.io?
Hello! Thanks for this crate!
I'm evaluating a stack for a web app (including postgres, actix, and juniper) and I'm concerned about potentially very bad performance on the backend <-> database connection. It seems like more than caching, I'd need the ability to specialize specific types of queries to be able to optimize bottlenecks (i.e. the most common / expensive queries should be done in a single SQL query).
It seems dataloaders are a possible fix for this but I don't have a ton of context on GraphQL or dataloaders so I'm still trying to figure out what exactly they do and if they'll solve the problem I have or if I need something else or have to drop GraphQL.
It's a bit difficult for me without documentation of this crate. I've looked through the code and it seems well written and pretty small overall, so I think I'd be able to add docs myself.
Would you be interested in accepting pull requests to fully document this crate (for docs.rs)?
Additionally, would you be interested in enabling a lint in CI (or directly in the Rust code) to deny missing docs?
If I have to understand everything myself anyways, I might as well document everything so that it'll be easier for future people to use this. But before I commit to spending that time, I want to check with you to see if you'd be willing to accept PRs for documentation.