#49325 Proof of concept: optional rust queue for sds / ns
Closed: wontfix 4 years ago Opened 4 years ago by firstyear.

Issue Description

Rust is a modern systems programming language in the style of C and C++. It has strict compile time guarantees to the correctness of applications, and promises the potential to reduce many times of security and stability issues including bounds checking, null dereferences, use-after free and more. This is achieved without a run time, instead using compile time ownership and lifetime checks.

This ticket is to add a proof of concept that we can use Rust as an FFI language with existing C components. This adds an optional configure argument to enable a rust thread safe queue which is used by nunc-stans for event dispatch.

My tests already show it is safe (it passes all the existing test) and the server when built with this option passes the basic suite.

Importantly it shows how we can integrate cargo with autotools, and how to expose and C compatible apis from rust.

To use this, at configure time add "--enable-rust".

There are no other changes to the server without this flag, and it is not a requirement to build DS, it's optional.


Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to review
- Custom field type adjusted to defect

4 years ago

A few comments -- we were already talking about these on #rust.

  • Setting the environment CARGO_TARGET_DIR will let you avoid polluting srcdir.
  • Use cargo build --manifest-path=path/to/Cargo.toml to avoid changing directory.
  • For the free_fn, I suggest storing Option<&'static fn(*const c_void)>, and create that from the incoming value using pointer.as_ref(). This puts the unsafety in just one place, and also takes advantage of smaller NonZero enums for Option<&T>.
  • In sds_tqueue_destroy, you can be more explicit using drop (already imported in the prelude) instead of a throwaway value.

Hey mate,

I've applied the changes for the build, that made things a bit nicer.

Issue with the free fn, is that C isn't giving us a ref, so when you do free_fn.as_ref, you get a &fn, but it's representing a fn. This causes a segfault because you have the wrong number of pointer indirections. That's why I used transmute here instead. The way to resolve this could be to have the caller provide a &free from the C, side, but that involves me changing some other parts (not that I'm against that)

With mem::drop, if you drop(q), you only drop the ref to q and it causes a memory leak. Finally, you can't drop (*q) because you can't move out of a raw pointer deref. This is why I used from_ptr with box to move ownership to this scope and let the implict drop take place (leak free :) )

I've also replaced vecDequeue with linked list because our original C queue is a LL, and second it's much faster (1100 ops/sec vs 2299 ops/sec). This now comfortably smashes the C queue variant, as it only achieves 1900 ops/sec.

0001-Ticket-49325-Proof-of-concept-rust-tqueue-in-sds.patch

I see what you mean about the fn indirection. Technically that is a pointer underneath already, but you can't treat it directly as such. Still, there should be a better way...

I think you can use Option<extern fn()> in FFI declarations -- bindgen does this, which is going the other direction, but I think it works both ways. So try:

pub extern fn sds_tqueue_init(retq: *mut *mut TQueue,
                              free_fn: Option<extern fn(*const c_void)>)
                              -> sds_result

Maybe even try bindgen on sds.h just to see what it declares. I'll bet it's close to this.

I meant to use mem::drop after doing Box::from_raw, rather than letting _q implicitly fall out of scope. You could also just not assign the from_raw value to anything, to let it drop right away. There's no functional difference to any of these, so it's just what you think will be clearer.

0001-Ticket-49325-Proof-of-concept-rust-tqueue-in-sds.patch

Ahhh, see what I didn't understand was the Option null optimisation here. Reading https://doc.rust-lang.org/nightly/book/first-edition/ffi.html#the-nullable-pointer-optimization made it clear. I can now see your suggestion.

Anyway, I've commented it in the patch, and I've re-build and run test.s I think it's much nicer now. Thanks for the tip @jistone !!

Metadata Update from @firstyear:
- Custom field component adjusted to None
- Custom field origin adjusted to None
- Custom field version adjusted to None

4 years ago

Just for style, in Drop I'd use if let Some(f) = self.free_fn instead of the match. Same semantic effect, but less rightward drift.

In TQueue::enqueue and dequeue, I think these should probably take &self instead of &mut self. Otherwise the Mutex is redundant since &mut already promises exclusive access. You might even be tempted to use Mutex::get_mut instead of locking normally. If the C callers might be aliasing these calls, &self properly reflects that.

Okay, changed the style. You need them to be &mut self due to the mut borrow in mutex, and you need to use lock() because get_mut() relies on the borrow checker - but does not actually perform any locking.

At this point I'm pretty happy with the patch now, so if it's okay with you @jistone I'll see if @mreynolds or @tbordaz want to review it for inclusion now :)

0002-Ticket-49325-Proof-of-concept-rust-tqueue-in-sds.patch

@firstyear to be honest I am not the right person to do a technical review of that patch. The code was looking quite generic and clean so as a proof of concept it looks good to me.
My understanding is that it substitute tqueue.c by tqueue.rs in libsds. Is libsds used in nunc-stans and so nunc-stans going to run with rust code ?

@tbordaz I thought you would be interested to have a look at it from our discussion the other week. I'm glad you think it looks clean.

Yes, the idea is that it can drop-in and replace the tqueue. By default, we still use the C version, but if we are happy to go forward with this, I would make this the default in the future. It's a simple yet stressed piece of code, so it gives us a good feel for the ability to use this, and also shows the reliability of the system :)

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.3.7.0

4 years ago

@firstyear

You need them to be &mut self due to the mut borrow in mutex

I don't understand what you mean. The mutex only needs &self for lock(), and that returns a MutexGuard which implements DerefMut to do mutable things from there. The whole point of the mutex is to get mutable access from a shared reference.

It is undefined behavior to have &mut that may be aliased, e.g. from multiple threads that want to wait on the lock. Mutex::get_mut exists because if you have &mut then there's no point in touching the lock at all, since you're already exclusive. An aliasing violation probably won't blow up if you manually avoid such &mut methods, but you're playing with fire. Since you're getting raw pointers from FFI that definitely can alias, they should only be treated as shared references.

Mate, the issue is that here get_mut relies on the complier to check that the access is exclusive. If this was pure rust, that is true, but it's not, we have a C FFI, so we have potentially references available.

So as a result, if you have get_mut on a mutex it does NOT take the lock, it only returns an empty guard. This is safe in pure rust because the compiler checks that there is only one owner of the *mut mutex, and that is where the safety comes from.

But here, because we give this as a *mut from C ffi, there are possibly many threads - we have a violation of rusts safety principle. So get_mut's properties are no longer satisfied, and get_mut causes crashes :)

Really, I should be changing the type to Shared<T> instead of raw *mut, but that still doesn't mean we can use get_mut.

Hope that helps,

I'm not saying you should use get_mut. I'm saying that the fact that you've put yourself in a position where you could use get_mut, because you have &mut self, should be a big red flag that you're not representing things correctly. It doesn't really matter whether the pointers are *mut or *const, but references need to correctly represent Rust invariants, including the fact that &mut should never be aliased.

You still haven't explained, why can't these be &self?

Indeed you are correct: These can be &self. Attached patch updates this, and changes two sections to use *const instead.

I would like to use Shared<T> but it's unstable for now.

Thanks again!

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to ack (was: review)
- Issue set to the milestone: 1.4 backlog (was: 1.3.7.0)

4 years ago

0001-Ticket-49325-Proof-of-concept-rust-tqueue-in-sds.patch

Hey @mreynolds

Sorry to be a pain, I tweaked this for inclusion into the rpm (optional still) if you provide --enable-rust to ./configure, then make rpms.

By default, it's not included.

This should make it easier for others who use the rpm builds to test,

Can you just check this over again? I've tried with rust build, without rust build, with rust rpm, and without rust rpm builds.

Thanks,

Everything looks fine, but why are the LTLIBRARIES overwritten? We don't need nunc-stans with rust? Just curious

@mreynolds It's not over-written, it's += so it appends. The quirk though is:

server_LTLIBRARIES =
if RUST_ENABLE
server_LTLIBRARIES += librsds.la
endif
server_LTLIBRARIES += libsds.la libnunc-stans.la libldaputil.la libslapd.la libns-dshttpd.la

They have to be in order, and rsds is needed for sds, which is needed for ns. Thus the odd looking flow. I'll add a comment about "why". If that's okay then, I'll wait for your ack to merge,

commit 6ec27bc
To ssh://git@pagure.io/389-ds-base.git
f02cbdd..6ec27bc master -> master

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

4 years ago

There is an issue with linking this. Reopen to fix.

Metadata Update from @firstyear:
- Issue status updated to: Open (was: Closed)

4 years ago

Metadata Update from @firstyear:
- Issue close_status updated to: fixed

4 years ago

Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to review (was: ack)
- Issue status updated to: Open (was: Closed)

4 years ago

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to ack (was: review)

4 years ago

26f78c1
To ssh://git@pagure.io/389-ds-base.git
ead3168..26f78c1 master -> master

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

4 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/2384

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: fixed)

2 years ago

Login to comment on this ticket.