Skip to content

std::io: vectored reads with uninitialized memory (Read::read_buf_vec) #104

@nrc

Description

@nrc

Proposal

This is an implementation of part of RFC 2930 (read_buf) which was not deeply specified in the RFC

Problem statement

Implement Read::read_buf_vec.

Motivation, use-cases

Vectored IO without having to initialise all the buffers first.

Solution sketches

This mostly follows BorrowedBuf/BorrowedCursor, extended to slices for vectored IO. We need a new version of IoSliceMut which does not assume that all data in it is initialised. Note that this is an improvement over the read_vec API because the underlying buffers can be reused.

An alternative would be to use the old single buffer design, but this has problems with a complex API and a soundness footgun. We could also pass a slice of BorrowedCursors, but this is pretty horrible ergonomically.

/// A buffer type used with `Read::read_buf_vectored`. Unlike `IoSliceMut`, there is no guarantee
/// that its memory has been initialised.
///
/// It is semantically a wrapper around an &mut [MaybeUninit<u8>], but is guaranteed to be ABI
/// compatible with the `iovec` type on Unix platforms and WSABUF on Windows.
pub struct IoSliceMaybeUninit<'a> {}

impl Debug for IoSliceMaybeUninit<'_> {}

/// Create a new `IoSliceMaybeUninit` from an uninitialized buffer.
impl<'a> From<&'a mut [MaybeUninit<u8>]> for IoSliceMaybeUninit<'a> {
    fn from(buf: &'a mut [MaybeUninit<u8>]) -> IoSliceMaybeUninit<'a> {}
}

impl<'a> IoSliceMaybeUninit<'a> {
    /// Create a new IoSliceMaybeUninit from an existing mutable slice of `u8`s.
    ///
    /// SAFETY: all bytes in the slice must be initialized by the time the IoSliceMaybeUninit is
    /// destroyed and thus access to the data is restored for `slice` or other views of the data.
    pub unsafe fn from_slice(slice: &'a mut [u8]) -> IoSliceMaybeUninit<'a> {}

    /// Create a new IoSliceMaybeUninit with its internal cursor advanced.
    pub fn advance(&self, n: usize) -> Self {}

    /// View the slice as a slice of `u8`s.
    ///
    /// # Safety
    ///
    /// The caller must ensure that all elements of the slice have been initialized.
    pub unsafe fn as_slice(&self) -> &[u8] {}

    /// View the slice as a mutable slice of `u8`s.
    ///
    /// # Safety
    ///
    /// The caller must ensure that all elements of the slice have been initialized.
    pub unsafe fn as_mut_slice(&mut self) -> &mut [u8] {}

    /// View the slice as a mutable slice of `MaybeUninit<u8>`.
    pub fn as_maybe_init_slice(&mut self) -> &mut [MaybeUninit<u8>] {}

    /// Returns the number of elements in the slice.
    pub fn len(&self) -> usize {}
}

/// A borrowed byte buffer, consisting of multiple underlying buffers, which is incrementally filled
/// and initialized. Primarily designed for vectored IO.
///
/// This type is a sort of "double cursor". It tracks three regions in the buffer: a region at the beginning of the
/// buffer that has been logically filled with data, a region that has been initialized at some point but not yet
/// logically filled, and a region at the end that is fully uninitialized. The filled region is guaranteed to be a
/// subset of the initialized region.
///
/// In summary, the contents of the buffer can be visualized as:
/// ```not_rust
/// [   |   |      |    |   |       |   ] Underlying buffers
/// [             capacity              ]
/// [ filled |         unfilled         ]
/// [    initialized    | uninitialized ]
/// ```
///
/// A `BorrowedSliceBuf` is created around some existing data (or capacity for data) via a unique reference
/// (`&mut`). The `BorrowedSliceBuf` can be configured (e.g., using `clear` or `set_init`), but otherwise
/// is read-only. To write into the buffer, use `unfilled` to create a `BorrowedSliceCursor`. The cursor
/// has write-only access to the unfilled portion of the buffer (you can think of it as a
/// write-only iterator).
///
/// The lifetime `'a` is a bound on the lifetime of the underlying data.
pub struct BorrowedSliceBuf<'a> {}

impl<'a> BorrowedSliceBuf<'a> {
    /// Create a new `BorrowedSliceBuf` from a slice of possibly initialized io slices.
    pub fn new<'b: 'a>(bufs: &'a mut [IoSliceMaybeUninit<'b>]) -> BorrowedSliceBuf<'a> {}

    /// Returns the length of the filled part of the buffer.
    pub fn len(&self) -> usize {}

    /// Returns the number of completely filled slices in the buffer.
    pub fn len_filled_slices(&self) -> usize {}

    /// Returns the number of filled elements in any partially filled slice.
    ///
    /// If there are no partially filled slices, then this method returns `0`.
    pub fn len_partial_filled_slice(&self) -> usize {}

    /// Iterate over the filled portion of the buffer.
    pub fn iter_filled_slices(&self) -> FilledSliceIterator<'_, 'a> {}

    /// Returns a cursor over the unfilled part of the buffer.
    pub fn unfilled<'this>(&'this mut self) -> BorrowedSliceCursor<'this> {}

    /// Clears the buffer, resetting the filled region to empty.
    ///
    /// The number of initialized bytes is not changed, and the contents of the buffer are not modified.
    pub fn clear(&mut self) -> &mut Self {}

    /// Asserts that a prefix of the underlying buffers are initialized. The initialized prefix is
    /// all of the first `b - 1` buffers and the first `n` bytes of the `b`th buffer. In other words,
    /// `(b, n)` is the coordinates of the first uninitialized byte in the buffers.
    ///
    /// `BorrowedSliceBuf` assumes that bytes are never de-initialized, so this method does nothing when called with fewer
    /// bytes than are already known to be initialized.
    ///
    /// # Safety
    ///
    /// The caller must ensure that all of the `(b, n)` prefix has already been initialized.
    pub unsafe fn set_init(&mut self, b: usize, n: usize) -> &mut Self {}
}

/// A writeable view of the unfilled portion of a [`BorrowedSliceBuf`](BorrowedSliceBuf).
///
/// Provides access to the initialized and uninitialized parts of the underlying `BorrowedSliceBuf`.
/// Data can be written directly to the cursor by using [`append`](BorrowedSliceCursor::append) or
/// indirectly by writing into a view of the cursor (obtained by calling `as_mut`, `next_init_mut`,
/// etc.) and then calling `advance`.
///
/// Once data is written to the cursor, it becomes part of the filled portion of the underlying
/// `BorrowedSliceBuf` and can no longer be accessed or re-written by the cursor. I.e., the cursor tracks
/// the unfilled part of the underlying `BorrowedSliceBuf`.
///
/// The lifetime `'a` is a bound on the lifetime of the underlying data.
pub struct BorrowedSliceCursor<'a> {}

impl<'a> BorrowedSliceCursor<'a> {
    /// Clone this cursor.
    ///
    /// Since a cursor maintains unique access to its underlying buffer, the cloned cursor is not
    /// accessible while the clone is alive.
    pub fn reborrow<'this>(&'this mut self) -> BorrowedSliceCursor<'this> {}

    /// Returns the available space in the cursor.
    pub fn capacity(&self) -> usize {}

    /// Returns the number of bytes written to this cursor since it was created from a `BorrowBuf`.
    ///
    /// Note that if this cursor is a reborrow of another, then the count returned is the count written
    /// via either cursor, not the count since the cursor was reborrowed.
    pub fn written(&self) -> usize {}

    /// Returns a mutable reference to the whole cursor.
    ///
    /// Returns a guard type which dereferences to a `&mut [IoSliceMaybeUninit<'a>]`
    ///
    /// # Safety
    ///
    /// The caller must not uninitialize any bytes in the initialized portion of the cursor.
    pub unsafe fn as_mut<'this>(&'this mut self) -> BorrowedSliceGuard<'this, 'a> {}

    /// Returns a shared reference to the initialized portion of the first (at least partially)
    /// initialised buffer in the cursor.
    ///
    /// Returns a reference to a slice of a single underlying buffer. That buffer will be the first
    /// unfilled buffer which is at least partially initialized. The returned slice is the part of
    /// that buffer which is initialized. If there is no part of any buffer which is both unfilled
    /// and initialised, then this method returns `None`.
    ///
    /// Does not iterate over buffers in any way. Calling this method multiple times will return
    /// the same slice unless data is either filled or initialized (e.g., by calling `advance`).
    pub fn next_init_ref(&self) -> Option<&[u8]> {}

    /// Returns a mutable reference to the initialized portion of the first (at least partially)
    /// initialised buffer in the cursor.
    ///
    /// Returns a reference to a slice of a single underlying buffer. That buffer will be the first
    /// unfilled buffer which is at least partially initialized. The returned slice is the part of
    /// that buffer which is initialized. If there is no part of any buffer which is both unfilled
    /// and initialised, then this method returns `None`.
    ///
    /// Does not iterate over buffers in any way. Calling this method multiple times will return
    /// the same slice unless data is either filled or initialized (e.g., by calling `advance`).
    pub fn next_init_mut(&mut self) -> Option<&mut [u8]> {}

    /// Returns a mutable reference to the uninitialized portion of the first (at least partially)
    /// uninitialised buffer in the cursor.
    ///
    /// It is safe to uninitialize any of these bytes.
    ///
    /// Returns `None` if `self` is entirely initialized.
    ///
    /// Does not iterate over buffers in any way. Calling this method multiple times will return
    /// the same slice unless data is either filled or initialized (e.g., by calling `advance`).
    pub fn next_uninit_mut(&mut self) -> Option<&mut [MaybeUninit<u8>]> {}

    /// Returns a mutable reference to the first buffer in the cursor.
    ///
    /// Returns `None` if `self` is empty.
    ///
    /// Does not iterate over buffers in any way. Calling this method multiple times will return
    /// the same slice unless data is filled (e.g., by calling `advance`).
    ///
    /// # Safety
    ///
    /// The caller must not uninitialize any bytes in the initialized portion of the cursor.
    pub unsafe fn next_mut(&mut self) -> Option<&mut [MaybeUninit<u8>]> {}

    /// Initializes all bytes in the cursor.
    pub fn ensure_init(&mut self) -> &mut Self {}

    /// Initializes all bytes in the first (at least partially unfilled) buffer in the cursor.
    pub fn ensure_next_init(&mut self) -> &mut Self {}

    /// Advance the cursor by asserting that `n` bytes have been filled.
    ///
    /// After advancing, the `n` bytes are no longer accessible via the cursor and can only be
    /// accessed via the underlying `BorrowedSliceBuf`. I.e., the `BorrowedSliceBuf`'s filled portion
    /// grows by `n` elements and its unfilled portion (and the capacity of this cursor) shrinks by
    /// `n` elements.
    ///
    /// # Safety
    ///
    /// The caller must ensure that the first `n` bytes of the cursor have been properly
    /// initialised.
    pub unsafe fn advance(&mut self, mut n: usize) -> &mut Self {}

    /// Appends data to the cursor, advancing position within its buffer.
    ///
    /// # Panics
    ///
    /// Panics if `self.capacity()` is less than `buf.len()`.
    pub fn append(&mut self, mut buf: &[u8]) {}

    /// Sets the initialized region to the minimum of the currently initialized region
    /// and the filled region.
    ///
    /// The caller must ensure all invariants for `self.bufs.filled` are satisfied (see the field definition)
    ///
    /// # Safety
    ///
    /// The caller must ensure that the filled region is entirely initialized.
    unsafe fn update_init_to_filled(&mut self) {}
}

impl<'a> Write for BorrowedSliceCursor<'a> {
    fn write(&mut self, buf: &[u8]) -> Result<usize> {}
    fn flush(&mut self) -> Result<()> {}
}

/// An iterator over the filled slices of a `BorrowedSliceBuf`.
///
/// See `BorrowedSliceBuf::iter_filled`.
pub struct FilledSliceIterator<'buf, 'data> {}

impl<'buf, 'data> Iterator for FilledSliceIterator<'buf, 'data> {
    type Item = &'buf [u8];
    fn next(&mut self) -> Option<&'buf [u8]>
}

/// Guard type used by `BorrowedSliceCursor::as_mut`.
///
/// Presents a view of the cursor containing only the filled data (via the `Deref` impls). Also
/// resets the state of the underlying BorrowedSliceBuf to a view of the complete
/// buffer when dropped.
pub struct BorrowedSliceGuard<'buf, 'data> {}

impl<'buf, 'data> Drop for BorrowedSliceGuard<'buf, 'data> {}
impl<'buf, 'data> Deref for BorrowedSliceGuard<'buf, 'data> {
    type Target = [IoSliceMaybeUninit<'data>];
    fn deref(&self) -> &[IoSliceMaybeUninit<'data>] {}
}
impl<'buf, 'data> DerefMut for BorrowedSliceGuard<'buf, 'data> {}

Links and related work

Draft PR with incomplete implementation: rust-lang/rust#101842

Metadata

Metadata

Assignees

No one assigned

    Labels

    ACP-acceptedAPI Change Proposal is accepted (seconded with no objections)T-libs-apiapi-change-proposalA proposal to add or alter unstable APIs in the standard libraries

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions