Is string slice mutable?

No, a string slice of type &str is immutable. A &str is a shared borrow of a portion of a string, and references in Rust are read-only unless they are explicitly declared mutable (&mut).

Here are some key points about string slices and mutability:

Why String Slices Are Immutable

  1. A string slice only provides a read-only view of the underlying string data.
  2. If you want to modify the string data, you must use the owning string type (String) and modify it directly.

Example: Immutable String Slice

let s = String::from("hello");
let slice: &str = &s[0..2]; // Immutable string slice
// slice.push_str("!"); // ERROR: &str has no push_str method and cannot be modified

Mutable References and Strings

If you want to modify the content of a string, you need the owning string (String) itself, declared with mut, or a mutable reference to it (&mut String), not a string slice. For example:

let mut s = String::from("hello");
s.push_str(", world!"); // Modifies the String
println!("{}", s); // Outputs: hello, world!
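
The same modification can also go through a mutable reference, which is how a function usually changes a caller's String. The following is a minimal sketch; the helper name append_greeting is hypothetical and chosen only for illustration.

```rust
// Hypothetical helper: appends text through a mutable reference to the owning String.
fn append_greeting(s: &mut String) {
    // push_str may grow the buffer, which is why the owning type is required.
    s.push_str(", world!");
}

fn main() {
    let mut s = String::from("hello");
    append_greeting(&mut s); // Lend the String mutably for the duration of the call
    println!("{}", s); // Outputs: hello, world!
}
```

The binding must be declared with mut, otherwise the compiler refuses to create the &mut s borrow.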

What About &mut str?

A mutable string slice (&mut str) does exist in Rust but is rarely used directly. It allows modifying the contents of a string slice, but only within the bounds of the slice and without changing its length. A few safe methods, such as make_ascii_uppercase(), work on &mut str; arbitrary byte-level edits require unsafe code, so this type mostly shows up in low-level Rust.

Example of &mut str (Very Rare)

fn modify(slice: &mut str) {
    // SAFETY: the caller passes a slice whose first byte is ASCII, and we
    // overwrite it with another ASCII byte, so the slice stays valid UTF-8.
    let bytes = unsafe { slice.as_bytes_mut() }; // View the slice as mutable bytes
    bytes[0] = b'Y'; // Change 'h' to 'Y' (panics if the slice is empty)
}

let mut s = String::from("hello");
let slice = &mut s[..]; // Borrow a mutable string slice
modify(slice);
println!("{}", s); // Outputs: Yello
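
For simple ASCII case changes, a &mut str can also be modified without any unsafe code. The sketch below uses the standard make_ascii_uppercase() method; the helper name shout and the sample text are assumptions made for illustration.

```rust
// Hypothetical helper: upper-cases ASCII letters in place through a &mut str.
fn shout(slice: &mut str) {
    // Only ASCII bytes are changed, so the length and UTF-8 validity are preserved.
    slice.make_ascii_uppercase();
}

fn main() {
    let mut s = String::from("hello, wörld");
    shout(&mut s[..5]); // Mutably borrow only the first five bytes ("hello")
    println!("{}", s); // Outputs: HELLO, wörld
}
```

Because the method touches only ASCII letters, non-ASCII characters such as 'ö' are left unchanged even when they fall inside the slice.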

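Summary

  1. A string slice (&str) is a read-only view and cannot be modified.
  2. To change string data, modify the owning String.
  3. A mutable string slice (&mut str) allows in-place changes only, without changing the length of the slice.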

Is as_bytes_mut() dangerous?

Yes, when it is misused. If you use an unsafe method such as as_bytes_mut() to make a String contain invalid UTF-8, you break an invariant that the standard library relies on, and any later use of that data as a String is undefined behavior (UB).

Why This is Dangerous

  1. String Guarantees Valid UTF-8:
    • Rust’s String type is designed to always contain valid UTF-8. The compiler and standard library assume this invariant holds at all times.
  2. Unsafe Methods Like as_bytes_mut():
    • as_bytes_mut() is defined on str and is callable on a String through deref. It is an unsafe method because it lets you mutate the underlying bytes directly, bypassing the UTF-8 checks that safe methods perform.
    • If you use it to introduce invalid UTF-8, subsequent operations on the String (e.g., indexing, slicing, or calling .chars()) can result in UB.
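
To see the invariant from the safe side, the sketch below shows that the checked standard-library constructors reject invalid bytes instead of producing a broken String. The helper name is_valid_utf8 is hypothetical.

```rust
// Hypothetical helper: reports whether a byte sequence is valid UTF-8.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

fn main() {
    // The byte 0xFF never appears in well-formed UTF-8.
    let bad = vec![0xFF, b'e', b'l', b'l', b'o'];
    assert!(!is_valid_utf8(&bad));

    // The safe constructor returns an error rather than building a String.
    assert!(String::from_utf8(bad).is_err());

    // Ordinary text passes the same check.
    assert!(is_valid_utf8("hello".as_bytes()));
    println!("checks passed");
}
```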

Example of Creating Invalid UTF-8 with as_bytes_mut()

fn main() {
    let mut s = String::from("hello");

    unsafe {
        let bytes = s.as_bytes_mut(); // Get mutable access to the underlying bytes
        bytes[0] = 0xFF; // 0xFF never appears in valid UTF-8
    }

    // Undefined behavior: s no longer holds valid UTF-8
    println!("{}", s); // This may crash, panic, or produce garbage output
}

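Consequences

  1. The program may crash, panic, or produce garbage output.
  2. Operations that assume valid UTF-8, such as indexing, slicing, or calling .chars(), can trigger UB.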

How to Avoid This

  1. Use Vec<u8> for Raw Byte Manipulation:
    • If you need to manipulate raw bytes without UTF-8 validation, work with Vec<u8> instead of String.
    • Example:
      let mut bytes = vec![0xFF, 0xFE, 0xFD]; // Arbitrary bytes
      bytes[0] = 0x41; // Modify bytes safely
      
  2. Revalidate After Unsafe Modifications:
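    • If you must use unsafe methods like as_bytes_mut(), make sure the bytes are valid UTF-8 again before the mutable borrow ends and the data is used as a String.
    • Example:
      use std::str;

      let mut s = String::from("hello");

      unsafe {
          let bytes = s.as_bytes_mut();
          bytes[0] = b'H'; // ASCII replaces ASCII, so still valid UTF-8
          debug_assert!(str::from_utf8(bytes).is_ok()); // Revalidate in debug builds
      }

      println!("{}", s); // Prints "Hello"; safe because the data is still valid UTF-8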
      
  3. Avoid Unsafe Code When Possible:
    • Most scenarios that require unsafe operations have safer alternatives in Rust.
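
As one safe alternative, a String can be turned into its bytes, edited, and converted back with validation. This is a minimal sketch, not the only approach; the helper name replace_first_byte is hypothetical.

```rust
use std::string::FromUtf8Error;

// Hypothetical helper: overwrites the first byte, then revalidates the result.
fn replace_first_byte(s: String, new_byte: u8) -> Result<String, FromUtf8Error> {
    let mut bytes = s.into_bytes(); // Take ownership of the raw bytes
    if let Some(first) = bytes.first_mut() {
        *first = new_byte;
    }
    String::from_utf8(bytes) // Returns Err if the bytes are no longer valid UTF-8
}

fn main() {
    let ok = replace_first_byte(String::from("hello"), b'Y');
    println!("{:?}", ok); // Outputs: Ok("Yello")

    let bad = replace_first_byte(String::from("hello"), 0xFF);
    println!("{}", bad.is_err()); // Outputs: true
}
```

The invalid case surfaces as an ordinary Err value that the caller can handle, with no undefined behavior involved.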

Key Takeaway

Rust’s String is built with strong UTF-8 guarantees. Using unsafe methods to violate these guarantees can lead to undefined behavior, and it should be avoided unless absolutely necessary. If you must manipulate raw string bytes, use tools like Vec<u8> or validate the data rigorously after modification.