RFC guide | Building safe I/O

Motivation

Recently, Rust officially merged an RFC [1]. By introducing the concept of I/O safety and a new set of types and traits, it gives users of AsRawFd and related traits guarantees about the raw resource handles they work with, closing a hole in Rust's encapsulation boundaries.

Rust's standard library comes close to providing I/O safety: a guarantee that if one part of a program holds a raw handle privately, other parts cannot access it. FromRawFd::from_raw_fd is unsafe, so safe Rust cannot do something like File::from_raw_fd(7) and then perform I/O on a file descriptor that may be held privately elsewhere in the program.

However, many APIs perform I/O by accepting raw handles:

pub fn do_some_io<FD: AsRawFd>(input: &FD) -> io::Result<()> {
    some_syscall(input.as_raw_fd())
}

AsRawFd does not restrict the return value of as_raw_fd, so do_some_io can end up performing I/O on an arbitrary RawFd value. You can even write do_some_io(&7), because RawFd itself implements AsRawFd. This can cause the program to access the wrong resource. It can even break encapsulation boundaries by creating aliases of handles that other parts of the program hold privately, causing spooky action at a distance.

"Action at a distance" is an anti-pattern [2] in programming: the behavior of one part of a program is strongly affected by instructions [3] in another part, and it is difficult or even impossible to find the instructions responsible.

In some special cases, violating I/O safety can even lead to violations of memory safety.
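The do_some_io(&7) hazard described above can be reproduced without performing any real I/O. The following is a minimal, Unix-only sketch. `raw_of` is a hypothetical stand-in for do_some_io that only reports the descriptor number it would have passed to the system call, and /dev/null is assumed to exist.

```rust
use std::fs::File;
use std::io;
use std::os::unix::io::{AsRawFd, RawFd};

/// Hypothetical stand-in for `do_some_io`: instead of making a syscall, it
/// only reports which descriptor number the syscall would have been given.
fn raw_of<FD: AsRawFd>(input: &FD) -> RawFd {
    input.as_raw_fd()
}

fn main() -> io::Result<()> {
    // A real, open file yields whatever descriptor the OS assigned to it.
    let file = File::open("/dev/null")?;
    println!("real fd: {}", raw_of(&file));

    // A bare integer is accepted too, because `RawFd` implements `AsRawFd`.
    // Nothing checks that 7 is open, or that this code is entitled to use it.
    let forged: RawFd = 7;
    println!("forged fd: {}", raw_of(&forged));
    Ok(())
}
```

Both calls type-check in safe Rust, which is exactly the gap the RFC sets out to close.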

Introducing the concept of I/O safety

The standard library has some low-level types: RawFd on Unix and RawHandle/RawSocket on Windows, which represent raw operating system resource handles. These types provide no behavior of their own; they are only identifiers that can be passed to low-level operating system APIs.

These raw handles can be thought of as raw pointers, with similar hazards. Although it is safe to obtain a raw pointer, dereferencing it may invoke undefined behavior if it is not a valid pointer, or if it outlives the memory it points to.

Similarly, it is safe to obtain a raw handle through AsRawFd::as_raw_fd and similar methods, but if the handle is not valid, or is used after its resource has been closed, using it for I/O may lead to corrupted output, lost or leaked input data, or violated encapsulation boundaries. In both cases, the effects can be non-local and affect otherwise unrelated parts of the program. Protection against the hazards of raw pointers is called memory safety, so protection against the hazards of raw handles is called I/O safety.

Rust's standard library also has high-level types, such as File and TcpStream. They wrap these raw handles and provide high-level interfaces to the operating system APIs.

These high-level types also implement the traits FromRawFd on Unix-like platforms and FromRawHandle/FromRawSocket on Windows. These traits provide functions that wrap a low-level value to produce a high-level value. The functions are unsafe because they cannot guarantee I/O safety: the type system does not constrain the handles passed in.

use std::fs::File;
use std::os::unix::io::{AsRawFd, FromRawFd};

// Create a file.
let file = File::open("data.txt")?;

// Construct a `File` from an arbitrary integer value. This type-checks,
// but 7 may not identify a live resource at run time, or it may
// accidentally alias a raw handle encapsulated elsewhere in the program.
// The `unsafe` block acknowledges that avoiding these hazards is the
// caller's responsibility.
let forged = unsafe { File::from_raw_fd(7) };

// Obtain a copy of `file`'s inner raw handle.
let raw_fd = file.as_raw_fd();

// Close `file`.
drop(file);

// Open some unrelated file.
let another = File::open("another.txt")?;

// Further uses of `raw_fd`, which was `file`'s inner raw handle, fall
// outside the lifetime the OS associated with it. It may now accidentally
// alias another encapsulated `File` instance, such as `another`.
// The `unsafe` block acknowledges that avoiding these hazards is the
// caller's responsibility.
let dangling = unsafe { File::from_raw_fd(raw_fd) };

The caller must ensure that the value passed to from_raw_fd was explicitly returned from the operating system, and that the value returned by from_raw_fd does not outlive the lifetime the operating system associates with the handle.

Although the concept of I/O safety is new, it reflects common practice. The Rust ecosystem will adopt I/O safety gradually.

Rust's solution for I/O safety

OwnedFd and BorrowedFd<'fd>

These two types are intended to replace RawFd. They give handle values ownership semantics, representing owned and borrowed handle values respectively.

OwnedFd owns a file descriptor and closes it when dropped. The lifetime parameter of BorrowedFd<'fd> indicates for how long access to the file descriptor is borrowed.

Windows has similar types, in Handle and Socket forms.
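As a rough illustration of owning versus borrowing a descriptor, here is a minimal sketch. It assumes a Unix target and the versions of these types that were later stabilized in std (Rust 1.63, under std::os::unix::io) rather than the prototype crate. `owned_and_borrowed` is a hypothetical helper name and /dev/null is assumed to exist.

```rust
use std::fs::File;
use std::io;
use std::os::unix::io::{AsFd, AsRawFd, BorrowedFd, OwnedFd, RawFd};

/// Returns the raw number seen through an owned handle and through a borrow
/// of it. Both refer to the same descriptor.
fn owned_and_borrowed() -> io::Result<(RawFd, RawFd)> {
    // `OwnedFd` owns the descriptor and will close it when dropped.
    let owned: OwnedFd = File::open("/dev/null")?.into();

    // `BorrowedFd<'_>` borrows it; the lifetime ties the borrow to `owned`.
    let borrowed: BorrowedFd<'_> = owned.as_fd();
    let pair = (owned.as_raw_fd(), borrowed.as_raw_fd());

    // Uncommenting the next line is a compile error: `owned` cannot be moved
    // (and therefore closed) while `borrowed` is still used below.
    // drop(owned);
    let _still_valid = borrowed.as_raw_fd();

    Ok(pair)
    // `owned` goes out of scope here, which closes the descriptor.
}

fn main() -> io::Result<()> {
    let (via_owned, via_borrowed) = owned_and_borrowed()?;
    assert_eq!(via_owned, via_borrowed);
    Ok(())
}
```

The borrow checker, rather than a runtime check, is what rules out the dangling-handle case shown earlier.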

Type              Analogous to
OwnedFd           Box<_>
BorrowedFd<'a>    &'a _
RawFd             *const _

Unlike other types, I/O types do not distinguish between mutable and immutable. Operating system resources can be shared in many ways outside Rust's control, so I/O can be considered to use interior mutability.

AsFd, Into<OwnedFd>, and From<OwnedFd>

These three are conceptual replacements for AsRawFd::as_raw_fd, IntoRawFd::into_raw_fd, and FromRawFd::from_raw_fd respectively, and cover most use cases. They work in terms of OwnedFd and BorrowedFd, so they automatically enforce the I/O safety invariants.
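The standard library already treats I/O this way: Read, Write, and Seek are implemented for &File, so a shared reference is enough to change the state of the underlying resource. A minimal sketch follows. `write_via_shared_ref` and the temporary file name are hypothetical and chosen only for illustration.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Writes through a shared reference to a `File`, then reads the data back.
fn write_via_shared_ref(path: &Path) -> io::Result<String> {
    // Note that `file` is never declared `mut`.
    let file = File::options()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    // `Write`, `Seek`, and `Read` are implemented for `&File`, so a shared
    // reference is enough to modify and reposition the underlying resource.
    (&file).write_all(b"hello")?;
    (&file).seek(SeekFrom::Start(0))?;

    let mut contents = String::new();
    (&file).read_to_string(&mut contents)?;
    Ok(contents)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("io_safety_shared_ref_demo.txt");
    println!("{}", write_via_shared_ref(&path)?);
    Ok(())
}
```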

pub fn do_some_io<FD: AsFd>(input: &FD) -> io::Result<()> {
    some_syscall(input.as_fd())
}

Using this trait avoids the earlier problem. Since AsFd is only implemented for types that properly own or borrow their file descriptors, this version of do_some_io does not have to worry about being passed forged or dangling file descriptors.
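To make the difference concrete, here is a runnable sketch of an AsFd-based do_some_io. It assumes a Unix target and the std versions of the types (stable since Rust 1.63). The function body is hypothetical: it duplicates the borrowed descriptor with BorrowedFd::try_clone_to_owned and writes through the duplicate, where the original simply calls some_syscall.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::os::unix::io::{AsFd, OwnedFd};

/// Hypothetical `do_some_io`: duplicates the borrowed descriptor and writes
/// through the duplicate, leaving the caller's own handle open.
fn do_some_io<FD: AsFd>(input: &FD) -> io::Result<()> {
    let dup: OwnedFd = input.as_fd().try_clone_to_owned()?;
    let mut file = File::from(dup);
    file.write_all(b"written through a borrowed fd\n")
    // `file` is dropped here, closing only the duplicate.
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("io_safety_as_fd_demo.txt");
    let file = File::create(&path)?;

    do_some_io(&file)?; // OK: `File` implements `AsFd`.
    do_some_io(&file.as_fd())?; // OK: so does `BorrowedFd<'_>`.
    // do_some_io(&7)?; // Compile error: `i32` does not implement `AsFd`.

    print!("{}", std::fs::read_to_string(&path)?);
    Ok(())
}
```

The forged-integer call that the AsRawFd version accepted is now rejected at compile time.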

Gradual adoption

I/O safety and the new types and traits do not need to be adopted all at once; they can be adopted gradually.

  • First, std adds the new types and traits and provides impls for all relevant std types. This is a backward-compatible change.
  • After that, crates can start using the new types and implementing the new traits for their own types. These changes are small and semver-compatible, and need no special coordination.
  • Once the standard library and enough popular crates implement the new traits, crates can start, at their own pace, to use the new traits as bounds when accepting generic parameters. These are semver-incompatible changes, although most users of an API that switches to the new traits will not need to change anything.
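The second and third steps might look roughly like this for a crate that wraps a File. This is a sketch under assumptions: `LogFile` and `describe` are hypothetical names, and it uses the std versions of the traits on a Unix target.

```rust
use std::fs::File;
use std::os::unix::io::{AsFd, AsRawFd, BorrowedFd, OwnedFd, RawFd};

/// Hypothetical crate type that wraps a `File`.
pub struct LogFile {
    inner: File,
}

// Existing impl, kept so that current callers continue to compile.
impl AsRawFd for LogFile {
    fn as_raw_fd(&self) -> RawFd {
        self.inner.as_raw_fd()
    }
}

// Second step: add the new impls alongside the old one (semver-compatible).
impl AsFd for LogFile {
    fn as_fd(&self) -> BorrowedFd<'_> {
        self.inner.as_fd()
    }
}

impl From<LogFile> for OwnedFd {
    fn from(log: LogFile) -> OwnedFd {
        log.inner.into()
    }
}

impl From<OwnedFd> for LogFile {
    fn from(fd: OwnedFd) -> LogFile {
        LogFile { inner: File::from(fd) }
    }
}

// Third step: a public function changes its bound from `AsRawFd` to `AsFd`.
// This is the semver-incompatible part, since `RawFd` no longer qualifies.
pub fn describe<FD: AsFd>(input: &FD) -> RawFd {
    input.as_fd().as_raw_fd()
}

fn main() -> std::io::Result<()> {
    let log = LogFile::from(OwnedFd::from(File::open("/dev/null")?));
    println!("fd = {}", describe(&log));
    Ok(())
}
```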

Prototype implementation

A prototype of the RFC's contents has been implemented. See io-lifetimes [4].

Raw API      This experimental API
Raw*         Borrowed* and Owned*
AsRaw*       As*
IntoRaw*     Into*
FromRaw*     From*

Trait implementation

AsFd borrows the file descriptor from an object, yielding a BorrowedFd<'_>

#[cfg(any(unix, target_os = "wasi"))]
pub trait AsFd {
    /// Borrows the file descriptor.
    ///
    /// # Example
    ///
    /// ```rust,no_run
    /// # #![cfg_attr(io_lifetimes_use_std, feature(io_safety))]
    /// use std::fs::File;
    /// # use std::io;
    /// use io_lifetimes::{AsFd, BorrowedFd};
    ///
    /// let mut f = File::open("foo.txt")?;
    /// let borrowed_fd: BorrowedFd<'_> = f.as_fd();
    /// # Ok::<(), io::Error>(())
    /// ```
    fn as_fd(&self) -> BorrowedFd<'_>;
}

IntoFd consumes an object and returns its file descriptor as an OwnedFd

#[cfg(any(unix, target_os = "wasi"))]
pub trait IntoFd {
    /// Consumes this object, returning the underlying file descriptor.
    ///
    /// # Example
    ///
    /// ```rust,no_run
    /// # #![cfg_attr(io_lifetimes_use_std, feature(io_safety))]
    /// use std::fs::File;
    /// # use std::io;
    /// use io_lifetimes::{IntoFd, OwnedFd};
    ///
    /// let f = File::open("foo.txt")?;
    /// let owned_fd: OwnedFd = f.into_fd();
    /// # Ok::<(), io::Error>(())
    /// ```
    fn into_fd(self) -> OwnedFd;
}

FromFd constructs a new instance of a type from an OwnedFd

#[cfg(any(unix, target_os = "wasi"))]
pub trait FromFd {
    /// Constructs a new instance of `Self` from the given file descriptor.
    ///
    /// # Example
    ///
    /// ```rust,no_run
    /// # #![cfg_attr(io_lifetimes_use_std, feature(io_safety))]
    /// use std::fs::File;
    /// # use std::io;
    /// use io_lifetimes::{FromFd, IntoFd, OwnedFd};
    ///
    /// let f = File::open("foo.txt")?;
    /// let owned_fd: OwnedFd = f.into_fd();
    /// let f = File::from_fd(owned_fd);
    /// # Ok::<(), io::Error>(())
    /// ```
    fn from_fd(owned: OwnedFd) -> Self;

    /// Constructs a new instance of `Self` from the given file descriptor
    /// converted from `into_owned`.
    ///
    /// # Example
    ///
    /// ```rust,no_run
    /// # #![cfg_attr(io_lifetimes_use_std, feature(io_safety))]
    /// use std::fs::File;
    /// # use std::io;
    /// use io_lifetimes::{FromFd, IntoFd};
    ///
    /// let f = File::open("foo.txt")?;
    /// let f = File::from_into_fd(f);
    /// # Ok::<(), io::Error>(())
    /// ```
    #[inline]
    fn from_into_fd<Owned: IntoFd>(into_owned: Owned) -> Self
    where
        Self: Sized,
    {
        Self::from_fd(into_owned.into_fd())
    }
}

The above are the traits for Unix platforms. The library also contains the corresponding traits for Windows: AsHandle/AsSocket, IntoHandle/IntoSocket, and FromHandle/FromSocket.

Related types

BorrowedFd<'fd> and OwnedFd

#[cfg(any(unix, target_os = "wasi"))]
#[derive(Copy, Clone)]
#[repr(transparent)]
#[cfg_attr(rustc_attrs, rustc_layout_scalar_valid_range_start(0))]
// libstd/os/raw/mod.rs assures me that every libstd-supported platform has a
// 32-bit c_int. Below is -2, in two's complement, but that only works out
// because c_int is 32 bits.
#[cfg_attr(rustc_attrs, rustc_layout_scalar_valid_range_end(0xFF_FF_FF_FE))]
pub struct BorrowedFd<'fd> {
    fd: RawFd,
    _phantom: PhantomData<&'fd OwnedFd>,
}

#[cfg(any(unix, target_os = "wasi"))]
#[repr(transparent)]
#[cfg_attr(rustc_attrs, rustc_layout_scalar_valid_range_start(0))]
// libstd/os/raw/mod.rs assures me that every libstd-supported platform has a
// 32-bit c_int. Below is -2, in two's complement, but that only works out
// because c_int is 32 bits.
#[cfg_attr(rustc_attrs, rustc_layout_scalar_valid_range_end(0xFF_FF_FF_FE))]
pub struct OwnedFd {
    fd: RawFd,
}

#[cfg(any(unix, target_os = "wasi"))]
impl BorrowedFd<'_> {
    /// Return a `BorrowedFd` holding the given raw file descriptor.
    ///
    /// # Safety
    ///
    /// The resource pointed to by `raw` must remain open for the duration of
    /// the returned `BorrowedFd`, and it must not have the value `-1`.
    #[inline]
    pub unsafe fn borrow_raw_fd(fd: RawFd) -> Self {
        debug_assert_ne!(fd, -1_i32 as RawFd);
        Self {
            fd,
            _phantom: PhantomData,
        }
    }
}

#[cfg(any(unix, target_os = "wasi"))]
impl AsRawFd for BorrowedFd<'_> {
    #[inline]
    fn as_raw_fd(&self) -> RawFd {
        self.fd
    }
}

#[cfg(any(unix, target_os = "wasi"))]
impl AsRawFd for OwnedFd {
    #[inline]
    fn as_raw_fd(&self) -> RawFd {
        self.fd
    }
}

#[cfg(any(unix, target_os = "wasi"))]
impl IntoRawFd for OwnedFd {
    #[inline]
    fn into_raw_fd(self) -> RawFd {
        let fd = self.fd;
        forget(self);
        fd
    }
}

#[cfg(any(unix, target_os = "wasi"))]
impl Drop for OwnedFd {
    #[inline]
    fn drop(&mut self) {
        #[cfg(feature = "close")]
        unsafe {
            let _ = libc::close(self.fd as std::os::raw::c_int);
        }

        // If the `close` feature is disabled, we expect users to avoid letting
        // `OwnedFd` instances drop, so that we don't have to call `close`.
        #[cfg(not(feature = "close"))]
        {
            unreachable!("drop called without the \"close\" feature in io-lifetimes");
        }
    }
}
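Two properties of the listing above can be checked directly: into_raw_fd releases ownership without closing the descriptor, and excluding -1 from the valid range leaves room for Option to be stored without extra space. The sketch below uses the std versions of the types on a Unix target and assumes they keep the same layout attributes as the prototype. `round_trip` is a hypothetical helper.

```rust
use std::fs::File;
use std::io;
use std::mem::size_of;
use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd, OwnedFd, RawFd};

/// Releases an `OwnedFd` to a raw number and re-wraps it, returning the
/// descriptor number seen before and after the round trip.
fn round_trip() -> io::Result<(RawFd, RawFd)> {
    let owned: OwnedFd = File::open("/dev/null")?.into();
    let before = owned.as_raw_fd();

    // `into_raw_fd` gives up ownership without closing the descriptor,
    // like the `forget(self)` in the prototype's impl.
    let raw: RawFd = owned.into_raw_fd();

    // SAFETY: `raw` was just released by `into_raw_fd`, so it is still open
    // and nothing else owns it.
    let rewrapped = unsafe { OwnedFd::from_raw_fd(raw) };
    Ok((before, rewrapped.as_raw_fd()))
    // `rewrapped` is dropped here and closes the descriptor exactly once.
}

fn main() -> io::Result<()> {
    let (before, after) = round_trip()?;
    assert_eq!(before, after);

    // Because -1 is never a valid value, `None` can use that bit pattern.
    assert_eq!(size_of::<Option<OwnedFd>>(), size_of::<RawFd>());
    Ok(())
}
```

Re-wrapping a raw number still requires unsafe, because the compiler cannot verify where the integer came from.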


Supporting safe I/O in std and the wider ecosystem

After building some cross-platform abstract types, the library adds support for the safe I/O abstractions to ffi, async_std, fs_err, mio, os_pipe, socket2, tokio, and std.

Use case

// From: https://github.com/sunfishcode/io-lifetimes/blob/main/examples/hello.rs

#[cfg(all(rustc_attrs, unix, feature = "close"))]
fn main() -> io::Result<()> {
    // `open` and `write` are C APIs, so `unsafe` is required.
    let fd = unsafe {
        // Open a file, which returns an `Option<OwnedFd>`, which we can
        // maybe convert into an `OwnedFile`.
        let fd: OwnedFd = open("/dev/stdout\0".as_ptr() as *const _, O_WRONLY | O_CLOEXEC)
            .ok_or_else(io::Error::last_os_error)?;

        // Borrow the fd to write to it.
        let result = write(fd.as_fd(), "hello, world\n".as_ptr() as *const _, 13);
        match result {
            -1 => return Err(io::Error::last_os_error()),
            13 => (),
            _ => return Err(io::Error::new(io::ErrorKind::Other, "short write")),
        }

        fd
    };

    // Convert into a `File`. No `unsafe` here!
    let mut file = File::from_fd(fd);
    writeln!(&mut file, "greetings, y'all")?;

    // We can borrow a `BorrowedFd` from a `File`.
    unsafe {
        let result = write(file.as_fd(), "sup?\n".as_ptr() as *const _, 5);
        match result {
            -1 => return Err(io::Error::last_os_error()),
            5 => (),
            _ => return Err(io::Error::new(io::ErrorKind::Other, "short write")),
        }
    }

    // Now back to `OwnedFd`.
    let fd = file.into_fd();

    unsafe {
        // This isn't needed, since `fd` is owned and would close itself on
        // drop automatically, but it makes a nice demo of passing an `OwnedFd`
        // into an FFI call.
        close(fd);
    }

    Ok(())
}

Rationale and alternatives

On the claim that "unsafe is for memory safety"

Rust has historically drawn a line here, stating that unsafe is only for memory safety. A well-known example is std::mem::forget, which was once unsafe and was later changed to safe.

The conclusion that unsafe is only for memory safety means that unsafe should not be used for purposes unrelated to memory safety, such as marking an API as one that should be avoided.

Memory safety is given priority over other kinds of defects because it is not just about avoiding unintended behavior, but about avoiding situations where it is impossible to bound what a piece of code might do.

I/O safety also falls into this category, for two reasons:

  1. I/O safety errors can lead to memory safety errors in the presence of safe wrappers around mmap (on platforms with operating-system-specific APIs that would otherwise allow such wrappers to be safe).
  2. I/O safety errors also mean that a piece of code can read, write, or delete data used by other parts of the program without naming them or being given a reference to them. It becomes impossible to bound the set of things a crate can do without knowing the implementation details of every other crate linked into the program.

Raw handles are much like raw pointers into a separate address space; they can dangle or be computed in bogus ways. I/O safety is similar to memory safety; both aim to prevent spooky action at a distance, and in both, ownership is the main foundation for robust abstractions, so it is natural to use similar safety concepts.

Related

  • https://github.com/smiller123/bento [5]
  • https://github.com/bytecodealliance/rsix [6]
  • RFC #3128 IO Safety[7]
  • RFC index list for nrc [8]

References

[1] RFC: https://github.com/rust-lang/rfcs/blob/master/text/3128-io-safety.md
[2] Anti-pattern: https://zh.wikipedia.org/wiki/ Anti pattern
[3] Instruction: https://zh.wikipedia.org/wiki/ instructions
[4] io-lifetimes: https://github.com/sunfishcode/io-lifetimes
[5] bento: https://github.com/smiller123/bento
[6] rsix: https://github.com/bytecodealliance/rsix
[7] RFC #3128 IO Safety: https://github.com/rust-lang/rfcs/blob/master/text/3128-io-safety.md
[8] RFC index list for nrc: https://www.ncameron.org/rfcs/

Posted by volant on Wed, 10 Nov 2021 04:46:27 -0800