uazu / qcell
Statically-checked alternatives to RefCell and RwLock
License: Apache License 2.0
Some possible additions to the API (contributed by pythonesque):
I also proved that the translations LCell<T> -> T, &LCell<[T]> -> &[LCell<T>], and &mut T <-> &mut LCell<T> are sound for all T, which I don't think your code has yet.
We have this:
/// Stores multiple DataSource capable of InstanceTitle, InstanceBaseUrl and
/// RepoListUrl
#[derive(Default)]
pub struct ConfigManager {
    // conceptually the actual objects
    bases: Vec<Arc<RwLock<dyn DataSourceBase + Send + Sync>>>,
    // conceptually just views of the above objects
    titles: Vec<Option<Arc<RwLock<dyn DataSource<InstanceTitle> + Send + Sync>>>>,
    urls: Vec<Option<Arc<RwLock<dyn DataSource<InstanceBaseUrl> + Send + Sync>>>>,
    repolists: Vec<Option<Arc<RwLock<dyn DataSource<RepoListUrl> + Send + Sync>>>>,
    durations: Vec<Duration>,
    // add_source can be called after update.
    valid: usize,
}
It is our understanding that we would be able to do something like this:
/// Stores multiple DataSource capable of InstanceTitle, InstanceBaseUrl and
/// RepoListUrl
#[derive(Default)]
pub struct ConfigManager {
    owner: QCellOwner,
    // conceptually the actual objects
    bases: Vec<Arc<QCell<dyn DataSourceBase + Send + Sync>>>,
    // conceptually just views of the above objects
    titles: Vec<Option<Arc<QCell<dyn DataSource<InstanceTitle> + Send + Sync>>>>,
    urls: Vec<Option<Arc<QCell<dyn DataSource<InstanceBaseUrl> + Send + Sync>>>>,
    repolists: Vec<Option<Arc<QCell<dyn DataSource<RepoListUrl> + Send + Sync>>>>,
    durations: Vec<Duration>,
    // add_source can be called after update.
    valid: usize,
}
And this would be much faster than both RwLock and even hypothetical Arc locking, while still giving us Send+Sync. Is that accurate?
So opt-out features such as no-thread-local are not the correct approach. If one crate depending on qcell opts out but another one doesn't, the second will be forced to opt out anyway as things are at the moment (because cargo builds just one instance of the crate, with the union of all requested features enabled). This will mean things not working correctly, i.e. it will probably panic immediately on running.
So it is probably best to split out the global and thread-local-based TCell as two different types instead of using features.
TLCell already supports a kind of transfer of ownership, but only between threads, because there is an owner in each thread.
A QCell-like cell could potentially support transferring ownership, since ownership is determined by the key value stored in the cell. However, that key value is immutable. Making it mutable means either using an atomic type, or else making the cell non-Sync and using a plain Cell to contain the key value.
The proposed address-based-key cell (issue #14) could also support transferring ownership to some other owner's address in a similar way, if there were an ID type that could be used to pass the other owner's address. (We don't need &mut on the second owner in order to transfer ownership to it, just the address.)
The other cells (TCell, TLCell and LCell) can't support transferring ownership, because ownership is fixed at compile time and checked by the compiler, so it cannot be modified at runtime.
Hello,
Could there be an API like the one in std? The signature would look like:
pub fn rwn<'a, T: ?Sized, const N: usize>(&'a mut self, qc: [&'a QCell<T>; N]) -> [&'a mut T; N];
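Most of the extra work in an N-way borrow like this is checking that no cell appears twice in the array, since handing out two &mut to the same cell would be unsound. A minimal sketch of that pairwise check follows. The helper name `all_distinct` is hypothetical and is not part of qcell.

```rust
/// Hypothetical helper (not part of qcell): returns true if no two
/// references in `items` point at the same address.
/// Performs N*(N-1)/2 pointer comparisons in the worst case.
fn all_distinct<T: ?Sized, const N: usize>(items: [&T; N]) -> bool {
    for i in 0..N {
        for j in (i + 1)..N {
            // Compare data addresses only, ignoring any pointer metadata.
            let a = items[i] as *const T as *const ();
            let b = items[j] as *const T as *const ();
            if a == b {
                return false;
            }
        }
    }
    true
}
```

An `rwn` along the lines proposed would presumably assert this (plus the ownership check for each cell) before producing the array of mutable references.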
I was just thinking about how to restructure qcell after #9, and I came up with a way to generalize QCell, TCell, TLCell, and LCell into a single generic type!
The core interface would be:
use core::cell::UnsafeCell;
pub unsafe trait ValueCellOwner: Sized {
    type Proxy;
    fn validate_proxy(&self, proxy: &Self::Proxy) -> bool;
    fn cell<T>(&self, value: T) -> ValueCell<Self, T>;
    fn owns<T: ?Sized>(&self, cell: &ValueCell<Self, T>) -> bool {
        self.validate_proxy(cell.owner_proxy())
    }
    fn ro<'a, T: ?Sized>(&'a self, cell: &'a ValueCell<Self, T>) -> &'a T {
        assert!(self.owns(cell), "You cannot borrow from a `ValueCell` using a different owner!");
        unsafe { &*cell.as_ptr() }
    }
    fn rw<'a, T: ?Sized>(&'a mut self, cell: &'a ValueCell<Self, T>) -> &'a mut T {
        assert!(self.owns(cell), "You cannot borrow from a `ValueCell` using a different owner!");
        unsafe { &mut *cell.as_ptr() }
    }
    fn rw2<'a, T: ?Sized, U: ?Sized>(
        &'a mut self,
        c1: &'a ValueCell<Self, T>,
        c2: &'a ValueCell<Self, U>,
    ) -> (&'a mut T, &'a mut U) {
        assert!(self.owns(c1), "You cannot borrow from a `ValueCell` using a different owner!");
        assert!(self.owns(c2), "You cannot borrow from a `ValueCell` using a different owner!");
        assert_ne!(c1 as *const _ as usize, c2 as *const _ as usize, "You cannot uniquely borrow the same cell multiple times");
        unsafe { (&mut *c1.as_ptr(), &mut *c2.as_ptr()) }
    }
}
pub struct ValueCell<Owner: ValueCellOwner, T: ?Sized> {
    owner: Owner::Proxy,
    value: UnsafeCell<T>
}
impl<Owner, T> ValueCell<Owner, T>
where
    Owner: ValueCellOwner
{
    pub fn from_proxy(owner: Owner::Proxy, value: T) -> Self {
        // The field is an UnsafeCell<T>, so the value must be wrapped.
        Self { owner, value: UnsafeCell::new(value) }
    }
}
impl<Owner, T: ?Sized> ValueCell<Owner, T>
where
    Owner: ValueCellOwner
{
    pub const fn as_ptr(&self) -> *mut T {
        self.value.get()
    }
    pub const fn owner_proxy(&self) -> &Owner::Proxy {
        &self.owner
    }
}
All of the current types could be modeled like so,
type QCell<T> = ValueCell<RuntimeOwner, T>;
struct RuntimeOwner {
    id: u32
}
struct RuntimeProxy(u32);
unsafe impl ValueCellOwner for RuntimeOwner {
    type Proxy = RuntimeProxy;
    fn validate_proxy(&self, proxy: &Self::Proxy) -> bool {
        self.id == proxy.0
    }
    fn cell<T>(&self, value: T) -> ValueCell<Self, T> {
        ValueCell::from_proxy(RuntimeProxy(self.id), value)
    }
}
Since Rust guarantees no dangling references or use-after-free in safe code, it should be possible to use the address of the cell-owner as the key to access the cells, storing the address as the key in the cell.
If the owner is moved, then it loses access to the cells (which would typically be a bug in the user's code). Also if the owner is dropped and another owner created in the same memory, it will gain access to all the cells previously owned by the old owner. But this doesn't cause soundness problems, because there is still just one owner at any one time. Also access to a cell requires both a pointer to the cell and also the owner's key. So it really doesn't cause any issues that some other code might get logical ownership as it can't get access unless it also has pointers to the cells.
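A minimal sketch of the address-as-key idea, to make the reasoning above concrete. The names `AddrOwner` and `AddrCell` are hypothetical and are not the qcell API; this illustrates the scheme as described and is not a vetted implementation.

```rust
use core::cell::UnsafeCell;

/// Hypothetical owner whose key is its own address (illustrative only;
/// not the qcell API). It is deliberately not zero-sized, so that two
/// owners alive at the same time always have different addresses.
pub struct AddrOwner {
    _pad: u8,
}

/// Hypothetical cell that records the address of its owner as the key.
pub struct AddrCell<T> {
    key: usize,
    value: UnsafeCell<T>,
}

impl AddrOwner {
    pub fn new() -> Self {
        AddrOwner { _pad: 0 }
    }

    /// The key is wherever this owner currently lives in memory.
    fn key(&self) -> usize {
        self as *const Self as usize
    }

    /// Create a cell keyed to the owner's current address.
    pub fn cell<T>(&self, value: T) -> AddrCell<T> {
        AddrCell { key: self.key(), value: UnsafeCell::new(value) }
    }

    /// Shared access: panics if the cell's key is not this owner's
    /// current address (for example because the owner has been moved).
    pub fn ro<'a, T>(&'a self, cell: &'a AddrCell<T>) -> &'a T {
        assert_eq!(cell.key, self.key(), "cell is not keyed to this owner");
        unsafe { &*cell.value.get() }
    }

    /// Exclusive access, guarded by the exclusive borrow of the owner.
    pub fn rw<'a, T>(&'a mut self, cell: &'a AddrCell<T>) -> &'a mut T {
        assert_eq!(cell.key, self.key(), "cell is not keyed to this owner");
        unsafe { &mut *cell.value.get() }
    }
}
```

In this sketch, keeping the owner in a Box (or otherwise at a stable address) avoids losing access through an accidental move.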
Personally, I don't think that this is required. All the cases so far where I've needed simultaneous borrows have been handled by rw2. However, if there are any concrete cases where 4+ simultaneous borrows are needed, and it would be inefficient to handle them as a sequence of rw2 or rw3 borrows, it would be good to document them and analyse them. So please add a comment if you have such a requirement. If it would really work out as more efficient to borrow 4+ items at a time (considering the roughly O(N^2) comparisons), there is a draft PR #26 which could be finished off to provide this functionality.
For one of my projects I am using a fork of LCell with the generativity crate to avoid the requirement for closures. Is this something you would be interested in having contributed back to this crate? If so I will put together a PR.
We don't suppose it'd be possible to have something along these lines?
struct IntrusiveQCellOwner<T: ?Sized> {
    inner: Arc<QCell<T>>,
}
// impl<T: ?Sized> !Clone for IntrusiveQCellOwner<T>;
impl<T> IntrusiveQCellOwner<T> {
    fn new(value: T) -> Self {
        Self {
            // new_cyclic passes a &Weak<QCell<T>>; its address serves as the owner ID.
            inner: Arc::new_cyclic(|weak| QCell {
                owner: QCellOwnerID(weak.as_ptr() as *const () as usize),
                value: UnsafeCell::new(value),
            }),
        }
    }
}
impl<T: ?Sized> IntrusiveQCellOwner<T> {
    fn id(&self) -> QCellOwnerID {
        self.inner.owner
    }
    // the usual stuff, rw etc.
    fn clone_inner(&self) -> Arc<QCell<T>> {
        self.inner.clone()
    }
    // in fact, might as well just use Deref and DerefMut for these.
    //fn mut_inner(&mut self) -> &mut T {
    //    ...
    //}
    //fn get_inner(&self) -> &T {
    //    ...
    //}
}
The main use-case would be the same as #16 except without a separate "owner".
pub struct ConfigManager {
    resources: Vec<Resource>,
}
pub struct AddConfigSource<'a, T: DataSourceBase + Send + Sync + 'static> {
    resource: &'a mut Resource,
    source: Arc<QCell<T>>,
}
struct Resource {
    // actual resource
    base: IntrusiveQCellOwner<dyn DataSourceBase + Send + Sync>,
    // views of the actual resource
    title: Option<Arc<QCell<dyn DataSource<InstanceTitle> + Send + Sync>>>,
    url: Option<Arc<QCell<dyn DataSource<InstanceBaseUrl> + Send + Sync>>>,
    repolist: Option<Arc<QCell<dyn DataSource<RepoListUrl> + Send + Sync>>>,
}
impl ConfigManager {
    pub fn add_source<T>(&mut self, source: T) -> AddConfigSource<'_, T>
    where
        T: DataSourceBase + Send + Sync + 'static,
    {
        let base = IntrusiveQCellOwner::new(source);
        let arc = base.clone_inner();
        self.resources.push(Resource::new(base));
        AddConfigSource {
            resource: self.resources.last_mut().unwrap(),
            source: arc,
        }
    }
}
impl<'a, T: DataSourceBase + Send + Sync + 'static> AddConfigSource<'a, T> {
    pub fn for_title(self) -> Self where T: DataSource<InstanceTitle> {
        let arc = &self.source;
        self.resource.title.get_or_insert_with(|| {
            arc.clone()
        });
        self
    }
    pub fn for_base_url(self) -> Self where T: DataSource<InstanceBaseUrl> {
        let arc = &self.source;
        self.resource.url.get_or_insert_with(|| {
            arc.clone()
        });
        self
    }
    pub fn for_repo_lists(self) -> Self where T: DataSource<RepoListUrl> {
        let arc = &self.source;
        self.resource.repolist.get_or_insert_with(|| {
            arc.clone()
        });
        self
    }
}
My use case is type inference in a programming language. I have type variables in Rc<TCell<>>, so that multiple places can share their inferred type automatically. The problem is that a type can contain another type variable (that might not be known yet), so I have some situations where I would effectively need to get the innermost value (mutably) out of a Rc<TCell<Rc<TCell<_>>>>. I have some workarounds for this, but they are all limiting in some way.
I think the nested access pattern should be possible for all the cells, since the owner is borrowed mutably and so no other accesses can happen simultaneously. Is my reasoning wrong? If not, it would be great if we could get a function like
fn TCellOwner::rw_nested<T, U>(cell: &TCell<T>, get_nested: impl FnOnce(&mut T) -> &TCell<U>) -> &mut U;
On a related note, we could also introduce a lower-level, UnsafeCell-like unsafe API to get pointers without having the owner.
This would allow one to implement the above function oneself, by using the fact that the owner is borrowed mutably to fulfill the safety conditions of, say,
/// Subject to similar conditions to UnsafeCell::get
fn TCell::ptr(&self) -> *mut T;
(This should then also exist for all the cells)
Currently, LCellOwner is zero-sized, but &LCellOwner and &mut LCellOwner are not. We wonder if it's possible to have a sound API which borrows the LCellOwner but is zero-sized?
This is a bit of a micro-optimization but anyway.
There are new APIs proposed in 95228 which may eventually lead to the deprecation of ptr as usize casts.
qcell uses these casts in many places, namely in functions returning QCellOwnerID, as well as to check the uniqueness of cells in rw2/rw3 calls.
Since they're only used for comparisons, I don't think qcell really needs to worry about retaining provenance information (i.e. the change would be to replace ptr as usize with ptr.addr()).
I figured I'd open this issue to keep an eye on strict_provenance and prepare to update qcell if/when those APIs become stable.
It was pointed out that maybe it's possible to get access to both a structure and a member of that structure at the same time using rw2, which would mean two &mut to the same memory region. Try to create an example which reproduces this, then see if there's any way to save the functionality.
This would be possible with TCell and LCell, but not with QCell. This would be useful when new cells need to be created whilst the owner is borrowed.
use qcell::{TCell, TCellOwner};
type T1 = fn(&());
type T2 = fn(&'static ());
// T1 subtype of T2, both 'static
// TCellOwner covariant
fn _demo(x: TCellOwner<T1>) -> TCellOwner<T2> {
    x
}
// and that's obviously bad
fn main() {
    let first_owner = TCellOwner::<T2>::new();
    let mut second_owner = TCellOwner::<T1>::new() as TCellOwner<T2>;
    let x = TCell::<T2, _>::new(vec!["Hello World!".to_owned()]);
    let reference = &first_owner.ro(&x)[0];
    second_owner.rw(&x).clear();
    println!("{}", reference); // prints garbage bytes (or similar output)
}
Using compile_fail doctests is convenient, but the code can fail to compile for reasons other than the intended one. So the only way to be sure is to strip out all the compile_fail markers, and recheck all the failures one by one. So using the compiletest_rs crate might make things easier in the long run.
Cell, RefCell, and Mutex all have an into_inner method which allows you to extract the wrapped value, but QCell (and TCell and LCell) doesn't seem to have such a method.
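A minimal sketch of why such a method needs no owner check, using a hypothetical stand-in type `KeyedCell` (this is not the qcell definition). Taking the cell by value, or by &mut, already proves that no borrows of the contents are outstanding.

```rust
use core::cell::UnsafeCell;

/// Hypothetical stand-in for a QCell-like type (not the qcell definition).
pub struct KeyedCell<T> {
    owner_id: u32,
    value: UnsafeCell<T>,
}

impl<T> KeyedCell<T> {
    pub fn new(owner_id: u32, value: T) -> Self {
        KeyedCell { owner_id, value: UnsafeCell::new(value) }
    }

    /// The ID of the owner this cell was created for.
    pub fn owner_id(&self) -> u32 {
        self.owner_id
    }

    /// Consumes the cell and returns the wrapped value. No owner is
    /// consulted: holding the cell by value rules out live borrows.
    pub fn into_inner(self) -> T {
        self.value.into_inner()
    }

    /// Exclusive access through `&mut self`, by the same reasoning.
    pub fn get_mut(&mut self) -> &mut T {
        self.value.get_mut()
    }
}
```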
We feel like LCell could just impl Default?
(Also TCell and TLCell?)
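A minimal sketch of what such an impl could look like, assuming a cell whose owner is identified purely by a marker type, as with TCell or LCell. `MarkerCell` is a hypothetical stand-in, not the qcell definition, and the sketch does not address variance or Send/Sync bounds.

```rust
use core::cell::UnsafeCell;
use core::marker::PhantomData;

/// Hypothetical stand-in for a TCell/LCell-style cell (not the qcell
/// definition): the owner is identified by the marker type `Q` alone,
/// so constructing a cell needs no runtime owner value.
pub struct MarkerCell<Q, T> {
    _owner: PhantomData<Q>,
    value: UnsafeCell<T>,
}

impl<Q, T> MarkerCell<Q, T> {
    pub fn new(value: T) -> Self {
        MarkerCell { _owner: PhantomData, value: UnsafeCell::new(value) }
    }

    pub fn into_inner(self) -> T {
        self.value.into_inner()
    }
}

/// Default just wraps `T::default()`, since no owner is needed to build a cell.
impl<Q, T: Default> Default for MarkerCell<Q, T> {
    fn default() -> Self {
        MarkerCell::new(T::default())
    }
}
```

A cell that stores a runtime owner ID, like QCell, cannot be built this way because there is no owner to take the ID from.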
I believe qcell can support TCell without either of those features enabled, via refactoring the TCellOwner into an unsafe trait with most of the behaviour plus the current struct implementing this new trait. Then you can introduce a simple macro for creating new owner types that uses a static AtomicBool for uniqueness. Something like
pub unsafe trait CanOwnTCell {
    fn ro<'a, T: ?Sized>(&'a self, tc: &'a TCell<Self, T>) -> &'a T {
        // Copy implementation over
    }
    // All the other cell-related methods
}
unsafe impl<Q> CanOwnTCell for TCellOwner<Q> {}
macro_rules! make_tcellowner_type {
    ($visibility:vis, $owner:ident, $bool_name:ident) => {
        static $bool_name: AtomicBool = AtomicBool::new(false);
        $visibility struct $owner {
            _phantom: PhantomData<()>
        }
        impl $owner {
            pub fn try_new() -> Option<Self> {
                $bool_name.compare_exchange(false, true, Relaxed, Relaxed).ok().map(|_| Self { _phantom: Default::default() })
            }
        }
        unsafe impl CanOwnTCell for $owner {}
        impl Drop for $owner {
            fn drop(&mut self) {
                $bool_name.store(false, Relaxed);
            }
        }
    }
}
This would be a breaking change, in that now TCell has to specify the full TCellOwner<Q> as the owner type, but if you keep the old TCell<Q, T> as a type alias, type TCell<Q, T> = TCellTraited<TCellOwner<Q>, T>;, I think you can downgrade it to just a major change.
std's Rc, Arc, and ThreadId have essentially the same pathological edge case as QCellOwnerSeq: incrementing the reference count (owner ID sequence) in a loop without ever decreasing it (i.e. by forgetting the cloned Rc/Arc, or always for the monotonic ThreadId and QCellOwnerSeq counts) could overflow the counter and lead to UB. std declares this as pathological unsupported behavior (when would you ever legitimately need even 2^31 different owning references or threads, let alone 2^63 [1]) and aborts if this overflow would happen [2].
At least on 64-bit targets (where exhausting the ID space is fundamentally impractical [3]), it'd be nice to have QCellOwnerSeq::new be made safe. I'd recommend doing a cmpxchg loop (fetch_update) like ThreadId, since creating new owners doesn't seem like it'd ever be a contended operation.
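A minimal sketch of such a checked allocation using fetch_update. The names `NEXT_ID` and `next_id` are hypothetical and do not reflect qcell's internals; AtomicU64 is assumed to be available, which is not the case on every 32-bit target.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical global sequence counter (not qcell's actual internals).
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Returns a fresh ID from `counter`, or None once the ID space is
/// exhausted. The counter is never allowed to wrap, so an ID is never
/// handed out twice.
fn next_id(counter: &AtomicU64) -> Option<u64> {
    counter
        // The closure returns None on overflow, which aborts the update
        // and leaves the counter unchanged.
        .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |cur| cur.checked_add(1))
        // On success fetch_update yields the previous value: our new ID.
        .ok()
}
```

A safe constructor built on this could panic or abort on None, while a fallible one could pass the None through to the caller.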
Doing this would entail one of:
- making QCellOwnerSeq::new safe (in theory not API breaking, but can cause downstream unused_unsafe warnings and makes the non-panicking path now potentially panicking); or
- changing QCellOwnerSeq::new's unsafe requirements (from "don't misuse colliding IDs" to "don't exhaust the ID space") and making a separate constructor [4] that does checked sequence-based owner IDs.
Since qcell doesn't expose a way to get the numeric value of a QCellOwnerID (and I think this is a good decision), there's no way to implement this downstream, even partially.
[1] For some context, with 128 threads all cooperatively incrementing the same shared counter at a rate of 6 GHz, it will still take 139 days at that rate to increment the counter 2^63 times. It's effectively never going to happen accidentally.
[2] Rc uses cell.set(cell.get() + 1) and aborts when overflow happens. Arc uses atom.fetch_add(1, Relaxed) and aborts at isize::MAX, relying on the fact that having isize::MAX threads cloning Arc concurrently is impossible (at least two bytes of address space are consumed per thread). ThreadId uses a compare_exchange loop, presumably because creating a new thread will never be performance critical like cloning Arc can be.
[3] Counting to 2^63 is never going to happen in a reasonable timeframe (see footnote 1), but counting to 2^32 is entirely practical. OTOH, 32-bit targets with 64-bit atomic support could still use 64-bit owner IDs, zero-extending the address-based owner IDs.
[4] In both cases QCellOwner could also switch to using checked sequence-based IDs to avoid the alloc requirement, but do note that this would result in the ID space being consumed without reuse by QCellOwner as well as QCellOwnerSeq, which it isn't currently.
If someone wants to do their own key management (e.g. u64 key, panic on exhaustion instead of free list, etc.), then provide some types that they can plug their own 'key' type into, and that do the cell-borrowing handling for them. These would typically be wrapped in type aliases to make them more ergonomic.
It might be necessary to have two sets of these, one TLCell-like and one TCell-like, i.e. with or without Sync access to the cells.
Heya, thanks for making this crate.
I'm currently porting a project over from RefCell, and I'm finding that I really miss being able to derive Debug on things that now contain QCells. Debugging anything in a QCell requires me to go in with a debugger every time, when I often just want to dump out what's in it. I typically can't use get_mut or into_inner in any of these places.
Do you have any suggestions to make this more ergonomic?
Would it be possible to have a cell type which is similar to LCell, but is based around the selfref crate? (Maybe with generativity? We're not familiar with that crate.)
This is related to #30 but more general.
Specifically, we want a thread-safe, lock-free &mut SelfRefCellEnvironment which opens a reusable LCell-like environment.
It should be possible to support large parts of the functionality in a no_std environment. So add a default "std" feature, and allow the crate user to disable it if they want no_std.
See the crate docs for the links to ghost_cell.rs. This uses lifetimes. It's not clear from the usage there how invasive the lifetime annotations would be in the code. It does offer one big advantage over TCell: no need for singletons. Apart from that the behaviour should be the same as TCell. Whether it is worth it depends on how complicated and confusing the lifetimes might be to the user.
Also, need to check licenses before copying any code from that repository.