Adding upstream version 1.64.0+dfsg1.
upstream/1.64.0+dfsg1

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-17 12:02:58 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-17 12:02:58 +0000
commit: 698f8c2f01ea549d77d7dc3338a12e04c11057b9 (patch)
tree: 173a775858bd501c378080a10dca74132f05bc50 /vendor/bstr/README.md
parent: Initial commit. (diff)
download: rustc-698f8c2f01ea549d77d7dc3338a12e04c11057b9.tar.xz
rustc-698f8c2f01ea549d77d7dc3338a12e04c11057b9.zip
1 files changed, 251 insertions, 0 deletions
diff --git a/vendor/bstr/README.md b/vendor/bstr/README.md
new file mode 100644
index 000000000..13bf0fc71
--- /dev/null
+++ b/vendor/bstr/README.md
@@ -0,0 +1,251 @@
+bstr
+====
+This crate provides extension traits for `&[u8]` and `Vec<u8>` that enable
+their use as byte strings, where byte strings are _conventionally_ UTF-8. This
+differs from the standard library's `String` and `str` types in that they are
+not required to be valid UTF-8, but may be fully or partially valid UTF-8.
+
+[![Build status](https://github.com/BurntSushi/bstr/workflows/ci/badge.svg)](https://github.com/BurntSushi/bstr/actions)
+[![](https://meritbadge.herokuapp.com/bstr)](https://crates.io/crates/bstr)
+
+
+### Documentation
+
+https://docs.rs/bstr
+
+
+### When should I use byte strings?
+
+See this part of the documentation for more details:
+https://docs.rs/bstr/0.2.*/bstr/#when-should-i-use-byte-strings.
+
+The short story is that byte strings are useful when it is inconvenient or
+incorrect to require valid UTF-8.
+
+
+### Usage
+
+Add this to your `Cargo.toml`:
+
+```toml
+[dependencies]
+bstr = "0.2"
+```
+
+
+### Examples
+
+The following two examples exhibit both the API features of byte strings and
+the I/O convenience functions provided for reading line-by-line quickly.
+
+This first example simply shows how to efficiently iterate over lines in
+stdin, and print out lines containing a particular substring:
+
+```rust
+use std::error::Error;
+use std::io::{self, Write};
+
+use bstr::{ByteSlice, io::BufReadExt};
+
+fn main() -> Result<(), Box<dyn Error>> {
+    let stdin = io::stdin();
+    let mut stdout = io::BufWriter::new(io::stdout());
+
+    stdin.lock().for_byte_line_with_terminator(|line| {
+        if line.contains_str("Dimension") {
+            stdout.write_all(line)?;
+        }
+        Ok(true)
+    })?;
+    Ok(())
+}
+```
+
+This example shows how to count all of the words (Unicode-aware) in stdin,
+line-by-line:
+
+```rust
+use std::error::Error;
+use std::io;
+
+use bstr::{ByteSlice, io::BufReadExt};
+
+fn main() -> Result<(), Box<dyn Error>> {
+    let stdin = io::stdin();
+    let mut words = 0;
+    stdin.lock().for_byte_line_with_terminator(|line| {
+        words += line.words().count();
+        Ok(true)
+    })?;
+    println!("{}", words);
+    Ok(())
+}
+```
+
+This example shows how to convert a stream on stdin to uppercase without
+performing UTF-8 validation _and_ amortizing allocation. On standard ASCII
+text, this is quite a bit faster than what you can (easily) do with standard
+library APIs. (N.B. Any invalid UTF-8 bytes are passed through unchanged.)
+
+```rust
+use std::error::Error;
+use std::io::{self, Write};
+
+use bstr::{ByteSlice, io::BufReadExt};
+
+fn main() -> Result<(), Box<dyn Error>> {
+    let stdin = io::stdin();
+    let mut stdout = io::BufWriter::new(io::stdout());
+
+    let mut upper = vec![];
+    stdin.lock().for_byte_line_with_terminator(|line| {
+        upper.clear();
+        line.to_uppercase_into(&mut upper);
+        stdout.write_all(&upper)?;
+        Ok(true)
+    })?;
+    Ok(())
+}
+```
+
+This example shows how to extract the first 10 visual characters (as grapheme
+clusters) from each line, where invalid UTF-8 sequences are generally treated
+as a single character and are passed through correctly:
+
+```rust
+use std::error::Error;
+use std::io::{self, Write};
+
+use bstr::{ByteSlice, io::BufReadExt};
+
+fn main() -> Result<(), Box<dyn Error>> {
+    let stdin = io::stdin();
+    let mut stdout = io::BufWriter::new(io::stdout());
+
+    stdin.lock().for_byte_line_with_terminator(|line| {
+        let end = line
+            .grapheme_indices()
+            .map(|(_, end, _)| end)
+            .take(10)
+            .last()
+            .unwrap_or(line.len());
+        stdout.write_all(line[..end].trim_end())?;
+        stdout.write_all(b"\n")?;
+        Ok(true)
+    })?;
+    Ok(())
+}
+```
+
+
+### Cargo features
+
+This crate comes with a few features that control standard library, serde
+and Unicode support.
+
+* `std` - **Enabled** by default. This provides APIs that require the standard
+  library, such as `Vec<u8>`.
+* `unicode` - **Enabled** by default. This provides APIs that require sizable
+  Unicode data compiled into the binary. This includes, but is not limited to,
+  grapheme/word/sentence segmenters. When this is disabled, basic support such
+  as UTF-8 decoding is still included.
+* `serde1` - **Disabled** by default. Enables implementations of serde traits
+  for the `BStr` and `BString` types.
+* `serde1-nostd` - **Disabled** by default. Enables implementations of serde
+  traits for the `BStr` type only, intended for use without the standard
+  library. Generally, you either want `serde1` or `serde1-nostd`, not both.
+
+
+### Minimum Rust version policy
+
+This crate's minimum supported `rustc` version (MSRV) is `1.41.1`.
+
+In general, this crate will be conservative with respect to the minimum
+supported version of Rust. MSRV may be bumped in minor version releases.
+
+
+### Future work
+
+Since this is meant to be a core crate, getting a `1.0` release is a priority.
+My hope is to move to `1.0` within the next year and commit to its API so that
+`bstr` can be used as a public dependency.
+
+A large part of the API surface area was taken from the standard library, so
+from an API design perspective, a good portion of this crate should be on solid
+ground already. The main differences from the standard library are in how the
+various substring search routines work. The standard library provides generic
+infrastructure for supporting different types of searches with a single method,
+whereas this library prefers to define new methods for each type of search and
+drop the generic infrastructure.
+
+Some _probable_ future considerations for APIs include, but are not limited to:
+
+* A convenience layer on top of the `aho-corasick` crate.
+* Unicode normalization.
+* More sophisticated support for dealing with Unicode case, perhaps by
+  combining the use cases supported by [`caseless`](https://docs.rs/caseless)
+  and [`unicase`](https://docs.rs/unicase).
+* Add facilities for dealing with OS strings and file paths, probably via
+  simple conversion routines.
+
+Here are some examples that are _probably_ out of scope for this crate:
+
+* Regular expressions.
+* Unicode collation.
+
+The exact scope isn't quite clear, but I expect we can iterate on it.
+
+In general, as stated below, this crate brings lots of related APIs together
+into a single crate while simultaneously attempting to keep the total number of
+dependencies low. Indeed, every dependency of `bstr`, except for `memchr`, is
+optional.
+
+
+### High level motivation
+
+Strictly speaking, the `bstr` crate provides very little that can't already be
+achieved with the standard library `Vec<u8>`/`&[u8]` APIs and the ecosystem of
+library crates. For example:
+
+* The standard library's
+  [`Utf8Error`](https://doc.rust-lang.org/std/str/struct.Utf8Error.html)
+  can be used for incremental lossy decoding of `&[u8]`.
+* The
+  [`unicode-segmentation`](https://unicode-rs.github.io/unicode-segmentation/unicode_segmentation/index.html)
+  crate can be used for iterating over graphemes (or words), but is only
+  implemented for `&str` types. One could use `Utf8Error` above to implement
+  grapheme iteration with the same semantics as what `bstr` provides (automatic
+  Unicode replacement codepoint substitution).
+* The [`twoway`](https://docs.rs/twoway) crate can be used for
+  fast substring searching on `&[u8]`.
+
+So why create `bstr`? Part of the point of the `bstr` crate is to provide a
+uniform API of coupled components instead of relying on users to piece together
+loosely coupled components from the crate ecosystem. For example, if you wanted
+to perform a search and replace in a `Vec<u8>`, then writing the code to do
+that with the `twoway` crate is not that difficult, but it's still additional
+glue code you have to write. This work adds up depending on what you're doing.
+Consider, for example, trimming and splitting, along with their different
+variants.
+
+In other words, `bstr` is partially a way of pushing back against the
+micro-crate ecosystem that appears to be evolving. Namely, it is a goal of
+`bstr` to keep its dependency list lightweight. For example, `serde` is an
+optional dependency because there is no feasible alternative. In service of
+this philosophy, currently, the only required dependency of `bstr` is `memchr`.
+
+
+### License
+
+This project is licensed under either of
+
+ * Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or
+   https://www.apache.org/licenses/LICENSE-2.0)
+ * MIT license ([LICENSE-MIT](LICENSE-MIT) or
+   https://opensource.org/licenses/MIT)
+
+at your option.
+
+The data in `src/unicode/data/` is licensed under the Unicode License Agreement
+([LICENSE-UNICODE](https://www.unicode.org/copyright.html#License)), although
+this data is only used in tests.