Is string slice mutable?
No, a string slice (&str) in Rust is always immutable. It is a shared, read-only borrow of a portion of a string, and a reference in Rust only permits mutation when it is explicitly declared mutable (&mut).
Here are some key points about string slices and mutability:
Why String Slices Are Immutable
- A string slice only provides a read-only view of the underlying string data.
- If you want to modify the string data, you must use the owning string type (String) and modify it directly.
Example: Immutable String Slice
let s = String::from("hello");
let slice: &str = &s[0..2]; // Immutable string slice
// slice.push_str("!"); // ERROR: Cannot modify a string slice
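Since a &str cannot be changed in place, the usual way to get a modified version is to build a new String from it. A minimal sketch of that idea follows; the helper name shout is hypothetical and only there for illustration:

```rust
/// Hypothetical helper: returns an upper-cased copy of the slice with "!" appended.
/// The input is only read; the result is a newly allocated String.
fn shout(slice: &str) -> String {
    let mut owned = slice.to_uppercase(); // allocate a new, owned String
    owned.push('!');                      // the owned copy can be modified freely
    owned
}

fn main() {
    let s = String::from("hello");
    let slice: &str = &s[0..2]; // read-only view of "he"
    let loud = shout(slice);
    println!("{} -> {}", slice, loud); // prints: he -> HE!
    println!("{}", s);                 // the original is untouched: hello
}
```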
Mutable References and Strings
If you want to modify the content of a string, you need a mutable reference to the owning string (String), not a string slice. For example:
let mut s = String::from("hello");
s.push_str(", world!"); // Modifies the String
println!("{}", s); // Outputs: hello, world!
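The same modification can also happen through a mutable reference (&mut String), for instance when the string is passed to a function. A small sketch, assuming a hypothetical helper named add_world:

```rust
/// Hypothetical helper: appends a suffix through a mutable reference
/// to the owning String.
fn add_world(s: &mut String) {
    s.push_str(", world!");
}

fn main() {
    let mut s = String::from("hello");
    add_world(&mut s); // lend the String mutably for the duration of the call
    println!("{}", s); // prints: hello, world!
}
```

The function takes &mut String rather than &mut str because appending may need to grow the buffer, which only the owning type can do.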
What About &mut str?
A mutable string slice (&mut str) does exist in Rust but is rarely used directly. It allows modifying the contents of a string slice, but only within the bounds of the slice and without changing its length in bytes. A few safe methods accept it (for example make_ascii_uppercase() and make_ascii_lowercase()); arbitrary byte edits require unsafe code, because the result must remain valid UTF-8.
Example of &mut str (Very Rare)
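fn modify(slice: &mut str) {
    // Overwrite one byte in place; the length of the slice cannot change.
    // SAFETY: this is only sound if the first character is ASCII, so that
    // replacing its single byte with another ASCII byte keeps the data valid UTF-8.
    let bytes = unsafe { slice.as_bytes_mut() }; // View the slice as mutable bytes
    bytes[0] = b'Y'; // Change 'h' to 'Y'
}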
let mut s = String::from("hello");
let slice = &mut s[..]; // Borrow a mutable string slice
modify(slice);
println!("{}", s); // Outputs: Yello
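For ASCII-only changes, a &mut str can also be edited without any unsafe code. The sketch below uses the standard str::make_ascii_uppercase method; the wrapper function shout_in_place is a hypothetical name chosen for this example:

```rust
/// Hypothetical helper: upper-cases the ASCII letters of a slice in place.
/// Non-ASCII characters are left unchanged, so the data stays valid UTF-8.
fn shout_in_place(slice: &mut str) {
    slice.make_ascii_uppercase();
}

fn main() {
    let mut s = String::from("hello world");
    shout_in_place(&mut s[..5]); // mutable slice over the first five bytes only
    println!("{}", s);           // prints: HELLO world
}
```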
Summary
- A string slice (&str) is always immutable.
- To modify a string, you need a mutable String.
- Mutable string slices (&mut str) exist but are uncommon and require careful handling.
Is as_bytes_mut() dangerous?
If you modify a String’s internal value to contain invalid UTF-8, such as by using unsafe methods like as_bytes_mut(), you violate Rust’s safety guarantees, leading to undefined behavior (UB) if the invalid data is accessed later as a String.
Why This is Dangerous
- String Guarantees Valid UTF-8:
  - Rust’s String type is designed to always contain valid UTF-8. The standard library, and code built on top of it, assumes this invariant holds at all times.
- Unsafe Methods Like as_bytes_mut():
  - as_bytes_mut() is an unsafe method (it is defined on str and is reachable from a String through dereferencing) because it allows you to mutate the underlying bytes directly, bypassing Rust’s UTF-8 checks.
  - If you use it to introduce invalid UTF-8, subsequent operations on the String (e.g., slicing or calling .chars()) can result in UB.
Example of Creating Invalid UTF-8 with as_bytes_mut()
fn main() {
    let mut s = String::from("hello");
    unsafe {
        let bytes = s.as_bytes_mut(); // Get mutable access to the underlying bytes
        bytes[0] = 0xFF; // 0xFF never appears in valid UTF-8
    }
    // Undefined behavior when accessed as a String
    println!("{}", s); // This may crash, panic, or produce garbage output
}
Consequences
- Crashes or Panics: Some operations on the String may panic if invalid UTF-8 is detected during a check.
- Garbage Data: You may see unexpected or garbage data when printing or processing the String.
- Undefined Behavior: Certain optimizations or assumptions about valid UTF-8 may cause memory corruption, crashes, or security vulnerabilities.
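Safe code does not have to reach this state, because raw bytes can be checked before they are treated as text. The sketch below uses the standard std::str::from_utf8 function and Utf8Error::valid_up_to; the helper name first_invalid_offset is hypothetical:

```rust
/// Hypothetical helper: returns the byte offset at which the input stops
/// being valid UTF-8, or None if the whole input is valid.
fn first_invalid_offset(bytes: &[u8]) -> Option<usize> {
    std::str::from_utf8(bytes).err().map(|e| e.valid_up_to())
}

fn main() {
    let good = "hello".as_bytes();
    let bad = [b'h', b'i', 0xFF]; // 0xFF can never occur in valid UTF-8

    println!("{:?}", first_invalid_offset(good)); // prints: None
    println!("{:?}", first_invalid_offset(&bad)); // prints: Some(2)
}
```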
How to Avoid This
- Use Vec<u8> for Raw Byte Manipulation:
  - If you need to manipulate raw bytes without UTF-8 validation, work with Vec<u8> instead of String.
  - Example:
    let mut bytes = vec![0xFF, 0xFE, 0xFD]; // Arbitrary bytes
    bytes[0] = 0x41; // Modify bytes safely
- Revalidate After Unsafe Modifications:
  - If you must use unsafe methods like as_bytes_mut(), ensure that the data remains valid UTF-8 before using it as a String.
  - Example:
    let mut s = String::from("hello");
    unsafe {
        let bytes = s.as_bytes_mut();
        bytes[0] = b'H'; // Still valid UTF-8: one ASCII byte replaces another
    }
    println!("{}", s); // Safe because data is still valid UTF-8
- Avoid Unsafe Code When Possible:
  - Most scenarios that require unsafe operations have safer alternatives in Rust.
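One safe alternative is to take the bytes out of the String, edit them as a Vec<u8>, and let String::from_utf8 validate the result on the way back. This is only a sketch under that assumption; the helper name set_byte is hypothetical:

```rust
use std::string::FromUtf8Error;

/// Hypothetical helper: overwrites one byte and re-validates the result.
/// Panics if `index` is out of bounds.
fn set_byte(s: String, index: usize, value: u8) -> Result<String, FromUtf8Error> {
    let mut bytes = s.into_bytes(); // take ownership of the raw bytes
    bytes[index] = value;           // plain Vec<u8> edit, no unsafe needed
    String::from_utf8(bytes)        // checks UTF-8 before returning a String
}

fn main() {
    let ok = set_byte(String::from("hello"), 0, b'H');
    println!("{:?}", ok); // prints: Ok("Hello")

    let bad = set_byte(String::from("hello"), 0, 0xFF);
    println!("{}", bad.is_err()); // prints: true
}
```

The validation pass costs a scan over the bytes, but an invalid edit becomes an ordinary error value instead of undefined behavior.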
Key Takeaway
Rust’s String is built with strong UTF-8 guarantees. Using unsafe methods to violate these guarantees can lead to undefined behavior, and it should be avoided unless absolutely necessary. If you must manipulate raw string bytes, use tools like Vec<u8> or validate the data rigorously after modification.