Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose Arc::is_unique #560

Closed
SabrinaJewson opened this issue Mar 12, 2025 · 13 comments
Closed

Expose Arc::is_unique #560

SabrinaJewson opened this issue Mar 12, 2025 · 13 comments
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@SabrinaJewson
Copy link

SabrinaJewson commented Mar 12, 2025

Proposal

Problem statement

There is currently no way to check whether an Arc has unique access to its data that doesn’t involve producing an &mut to the inner data. However, this is undesirable for some use cases as it invalidates existing pointers to the Arc.

Motivating examples or use cases

Suppose one has a self-referential data structure consisting of an Arc<T> and an &T, though the &T may also for instance point to static data and not the Arc, and the goal is to write a function that produces a T from this data structure. One implementation would be to first check whether if the Arc is unique and then check if the &T indeed points to the Arc, and if so, call Arc::into_inner, otherwise clone the T from the reference. However, currently the step of checking if the Arc is unique will force the &T to be invalidated, meaning one cannot later clone should the second check fail.

In a simple case one could just reorder the two checks to solve this issue, but this is not always trivial to achieve depending on what abstractions are involved: for example, if there were multiple possible backing Arcs, one would have to check all of the pointers before attempting to Arc::into_inner the first, instead of being able to short-circuit should the first succeed. In a more complex case, if the dependent data was not &T but had a destructor, there would be no way to correctly run it: if the destructor was run before get_mut or into_inner, it would be impossible to recover the original data to clone it later, but if it was run after, it would already be invalidated.

Solution sketch

impl<T, A> Arc<T, A> {
    pub fn is_unique(&self) -> bool;
}

One argument for the inclusion of this API directly is that, even when lifetime-extended references are not involved, some people might try and, instead of calling Arc::get_mut(&mut arc).is_some(), call Arc::strong_count(&arc) == 1 – indeed, grep.app displays many results for this pattern – and not be aware of the race condition this may introduce should weak references exist, leading to code that is subtly wrong. Introducing this method makes it more likely that people will choose the correct way of implementing this function.

Alternatives

impl<T, A> Arc<T, A> {
    pub fn as_mut_ptr(&mut self) -> Option<*mut T>;
}

This doesn’t come with the discoverability advantages of is_unique mentioned above. However, it does make it more explicit that one can obtain mutable access to the internal value, which might be convenient. It is currently not guaranteed that one can mutate through the pointer returned by Arc::as_ptrthere was one attempt to guarantee this that fizzled out – but it is unlikely that we will not end up guaranteeing this.

Another alternative is having the method take &mut self instead of &self. This can reduce footguns since is_unique is rarely useful when used as an &self method, but it is possibly outweighed by the flexibility.

Links and related work

None I’m aware of.

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.
@SabrinaJewson SabrinaJewson added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Mar 12, 2025
@pitaj
Copy link

pitaj commented Mar 12, 2025

Seems like your motivating example is already covered by unwrap_or_clone.

@SabrinaJewson
Copy link
Author

unwrap_or_clone is orthogonal – it always clones the value inside the Arc, whereas I want to clone the value held by my reference, which is not necessarily in the Arc.

@pitaj
Copy link

pitaj commented Mar 12, 2025

Maybe I'm misunderstanding your example. Let me write out how I understand it in pseudocode:

if not ptr_eq(reference, as_ptr(arc)):
    return reference.clone()

if is_unique(arc):
    return into_inner(arc)

return reference.clone()

But since we've already established that the reference points into the arc, the last line could just as well be:

// return reference.clone()
return (*arc).clone()

And in that case, you can simply the is_unique and latter clone into:

// if is_unique(arc):
//     return into_inner(arc)
// 
// return (*arc).clone()
return unwrap_or_clone(arc)

By the time you've checked the refcounts to check if it's unique, you've already done all of the expensive stuff, so whether you clone the reference or from the arc with unwrap_or_clone makes no difference.

@SabrinaJewson
Copy link
Author

The entire point of my motivating example is that one can’t swap around the ptr_eq and is_unique checks like that. For example, in the multi-Arc case, it’s not ideal to have to call ptr_eq for every single Arc and store them in an array of booleans before you can attempt Arc::into_inner on the first. Though I think the limitation with Drop is more compelling:

// even if you could put this here:
if !dependent_data.depends_on(arc):
    return dependent_data.clone()

if let Some(data) = Arc::into_inner(arc):
    drop(dependent_data) // 💥: already invalidated!
    return data
else:
    return dependent_data.clone()

@pitaj
Copy link

pitaj commented Mar 12, 2025

So I misunderstood, I didn't realize that

One implementation would be to first check whether the Arc is unique and if the &T indeed points to the Arc

meant those checks in that order.

I'm not sure if putting the checks in that specific order is sufficient to avoid UB, though. Do you have some working code using Arc::get_mut(...).is_some() that you can share?

@SabrinaJewson
Copy link
Author

SabrinaJewson commented Mar 12, 2025

I'm not sure if putting the checks in that specific order is sufficient to avoid UB, though. Do you have some working code using Arc::get_mut(...).is_some() that you can share?

I’m confused what UB you mean. The current UB is precisely because of get_mut invalidating references, so no, I don’t have working code – I filed this issue so I it would be possible to implement working code. The order of the checks is just for architectural and not soundness reasons, though I later realized I can’t actually change it because of the Drop problem.


I realized I was slightly confused – my description was derived from a real case in which this function is necessary, but my attempts to simplify it have made it seem trivial. In particular, even if the get_mut check is before the ptr_eq check, one does know that should the former succeed but the latter fail, the pointer was not actually invalidated since it points to a separate memory region. But my case is the general one, in which it is possible for the pointer to be invalidated even though the ptr_eq check fails.

@pitaj
Copy link

pitaj commented Mar 12, 2025

I'm not sure if it's possible to write such a function that guarantees atomically that both the strong and weak count are 0. It would at least only be possible on platforms with double-usize atomics.

Never mind

@SabrinaJewson
Copy link
Author

To be clear, this function already exists – this ACP is just about making it public.

@marc0246
Copy link

I think this proposal is needlessly restrictive. There is also the use case which I've had where you own multiple Arcs and need to guarantee that those are the only ones. And the way to solve both of our problems would be to expose something like:

impl<T, A> Arc<T, A> {
    pub fn counts(&self) -> (usize, usize) {}
}

Ensuring atomicity.

@marc0246
Copy link

marc0246 commented Mar 13, 2025

Also on a side note, if we do go with Arc::is_unique, I believe the only reason Arc::get_mut takes a mutable reference to the Arc is so that you can't get aliasing mutable references to the data. So here, it should be a shared reference for maximum flexibility IMO.

@kennytm
Copy link
Member

kennytm commented Mar 13, 2025

impl<T, A> Arc<T, A> {
    pub fn is_unique(&mut self) -> bool;
}

as a public API, it is strange that it takes &mut self rather than &self (or actually this: &Self because of Deref), because both Arc::{strong_count, weak_count} take this: &Self.

@SabrinaJewson
Copy link
Author

SabrinaJewson commented Mar 13, 2025

And the way to solve both of our problems would be to expose something like:

I’m not against this, but I think it would be good to have is_unique still as a convenience method, and to avoid the user having to write the magic number (1, 0).


I updated the proposal to use &self.

@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting, and we agreed to accept this. Code searches suggest that the primary use of strong_count is to compare to 1.

Also, we discussed memory ordering, and felt that this probably warrants acquire ordering; we also felt that we should re-add the acquire ordering to strong_count as well, but that's a separate discussion.

@joshtriplett joshtriplett added the ACP-accepted API Change Proposal is accepted (seconded with no objections) label Mar 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

5 participants