Diffstat (limited to 'vendor/bstr/README.md')
-rw-r--r-- vendor/bstr/README.md 251
1 files changed, 0 insertions, 251 deletions
diff --git a/vendor/bstr/README.md b/vendor/bstr/README.md
deleted file mode 100644
index 13bf0fc71..000000000
--- a/vendor/bstr/README.md
+++ /dev/null
@@ -1,251 +0,0 @@
-bstr
-====
-This crate provides extension traits for `&[u8]` and `Vec<u8>` that enable
-their use as byte strings, where byte strings are _conventionally_ UTF-8.
-Unlike the standard library's `String` and `str` types, byte strings are not
-required to be valid UTF-8, but may be fully or partially valid UTF-8.
-
-[![Build status](https://github.com/BurntSushi/bstr/workflows/ci/badge.svg)](https://github.com/BurntSushi/bstr/actions)
-[![](https://meritbadge.herokuapp.com/bstr)](https://crates.io/crates/bstr)
-
-
-### Documentation
-
-https://docs.rs/bstr
-
-
-### When should I use byte strings?
-
-See this part of the documentation for more details:
-https://docs.rs/bstr/0.2.*/bstr/#when-should-i-use-byte-strings.
-
-The short story is that byte strings are useful when it is inconvenient or
-incorrect to require valid UTF-8.
-
-
-### Usage
-
-Add this to your `Cargo.toml`:
-
-```toml
-[dependencies]
-bstr = "0.2"
-```
-
-
-### Examples
-
-The following examples exhibit both the API features of byte strings and
-the I/O convenience functions provided for reading line-by-line quickly.
-
-This first example simply shows how to efficiently iterate over lines in
-stdin, and print out lines containing a particular substring:
-
-```rust
-use std::error::Error;
-use std::io::{self, Write};
-
-use bstr::{ByteSlice, io::BufReadExt};
-
-fn main() -> Result<(), Box<dyn Error>> {
-    let stdin = io::stdin();
-    let mut stdout = io::BufWriter::new(io::stdout());
-
-    stdin.lock().for_byte_line_with_terminator(|line| {
-        if line.contains_str("Dimension") {
-            stdout.write_all(line)?;
-        }
-        Ok(true)
-    })?;
-    Ok(())
-}
-```
-
-This example shows how to count all of the words (Unicode-aware) in stdin,
-line-by-line:
-
-```rust
-use std::error::Error;
-use std::io;
-
-use bstr::{ByteSlice, io::BufReadExt};
-
-fn main() -> Result<(), Box<dyn Error>> {
-    let stdin = io::stdin();
-    let mut words = 0;
-    stdin.lock().for_byte_line_with_terminator(|line| {
-        words += line.words().count();
-        Ok(true)
-    })?;
-    println!("{}", words);
-    Ok(())
-}
-```
-
-This example shows how to convert a stream on stdin to uppercase without
-performing UTF-8 validation, while also amortizing allocation. On standard ASCII
-text, this is quite a bit faster than what you can (easily) do with standard
-library APIs. (N.B. Any invalid UTF-8 bytes are passed through unchanged.)
-
-```rust
-use std::error::Error;
-use std::io::{self, Write};
-
-use bstr::{ByteSlice, io::BufReadExt};
-
-fn main() -> Result<(), Box<dyn Error>> {
-    let stdin = io::stdin();
-    let mut stdout = io::BufWriter::new(io::stdout());
-
-    let mut upper = vec![];
-    stdin.lock().for_byte_line_with_terminator(|line| {
-        upper.clear();
-        line.to_uppercase_into(&mut upper);
-        stdout.write_all(&upper)?;
-        Ok(true)
-    })?;
-    Ok(())
-}
-```
-
-This example shows how to extract the first 10 visual characters (as grapheme
-clusters) from each line, where invalid UTF-8 sequences are generally treated
-as a single character and are passed through correctly:
-
-```rust
-use std::error::Error;
-use std::io::{self, Write};
-
-use bstr::{ByteSlice, io::BufReadExt};
-
-fn main() -> Result<(), Box<dyn Error>> {
-    let stdin = io::stdin();
-    let mut stdout = io::BufWriter::new(io::stdout());
-
-    stdin.lock().for_byte_line_with_terminator(|line| {
-        let end = line
-            .grapheme_indices()
-            .map(|(_, end, _)| end)
-            .take(10)
-            .last()
-            .unwrap_or(line.len());
-        stdout.write_all(line[..end].trim_end())?;
-        stdout.write_all(b"\n")?;
-        Ok(true)
-    })?;
-    Ok(())
-}
-```
-
-
-### Cargo features
-
-This crate comes with a few features that control standard library, serde
-and Unicode support.
-
-* `std` - **Enabled** by default. This provides APIs that require the standard
- library, such as `Vec<u8>`.
-* `unicode` - **Enabled** by default. This provides APIs that require sizable
- Unicode data compiled into the binary. This includes, but is not limited to,
- grapheme/word/sentence segmenters. When this is disabled, basic support such
- as UTF-8 decoding is still included.
-* `serde1` - **Disabled** by default. Enables implementations of serde traits
- for the `BStr` and `BString` types.
-* `serde1-nostd` - **Disabled** by default. Enables implementations of serde
- traits for the `BStr` type only, intended for use without the standard
- library. Generally, you either want `serde1` or `serde1-nostd`, not both.
-
-
-### Minimum Rust version policy
-
-This crate's minimum supported `rustc` version (MSRV) is `1.41.1`.
-
-In general, this crate will be conservative with respect to the minimum
-supported version of Rust. MSRV may be bumped in minor version releases.
-
-
-### Future work
-
-Since this is meant to be a core crate, getting a `1.0` release is a priority.
-My hope is to move to `1.0` within the next year and commit to its API so that
-`bstr` can be used as a public dependency.
-
-A large part of the API surface area was taken from the standard library, so
-from an API design perspective, a good portion of this crate should be on solid
-ground already. The main differences from the standard library are in how the
-various substring search routines work. The standard library provides generic
-infrastructure for supporting different types of searches with a single method,
-whereas this library prefers to define new methods for each type of search and
-drop the generic infrastructure.
-
-Some _probable_ future considerations for APIs include, but are not limited to:
-
-* A convenience layer on top of the `aho-corasick` crate.
-* Unicode normalization.
-* More sophisticated support for dealing with Unicode case, perhaps by
- combining the use cases supported by [`caseless`](https://docs.rs/caseless)
- and [`unicase`](https://docs.rs/unicase).
-* Add facilities for dealing with OS strings and file paths, probably via
- simple conversion routines.
-
-Here are some examples that are _probably_ out of scope for this crate:
-
-* Regular expressions.
-* Unicode collation.
-
-The exact scope isn't quite clear, but I expect we can iterate on it.
-
-In general, as stated below, this crate brings lots of related APIs together
-into a single crate while simultaneously attempting to keep the total number of
-dependencies low. Indeed, every dependency of `bstr`, except for `memchr`, is
-optional.
-
-
-### High level motivation
-
-Strictly speaking, the `bstr` crate provides very little that can't already be
-achieved with the standard library `Vec<u8>`/`&[u8]` APIs and the ecosystem of
-library crates. For example:
-
-* The standard library's
- [`Utf8Error`](https://doc.rust-lang.org/std/str/struct.Utf8Error.html)
- can be used for incremental lossy decoding of `&[u8]`.
-* The
- [`unicode-segmentation`](https://unicode-rs.github.io/unicode-segmentation/unicode_segmentation/index.html)
- crate can be used for iterating over graphemes (or words), but is only
- implemented for `&str` types. One could use `Utf8Error` above to implement
- grapheme iteration with the same semantics as what `bstr` provides (automatic
- Unicode replacement codepoint substitution).
-* The [`twoway`](https://docs.rs/twoway) crate can be used for
- fast substring searching on `&[u8]`.
-
-So why create `bstr`? Part of the point of the `bstr` crate is to provide a
-uniform API of coupled components instead of relying on users to piece together
-loosely coupled components from the crate ecosystem. For example, if you wanted
-to perform a search and replace in a `Vec<u8>`, then writing the code to do
-that with the `twoway` crate is not that difficult, but it's still additional
-glue code you have to write. This work adds up depending on what you're doing.
-Consider, for example, trimming and splitting, along with their different
-variants.
-
-In other words, `bstr` is partially a way of pushing back against the
-micro-crate ecosystem that appears to be evolving. Namely, it is a goal of
-`bstr` to keep its dependency list lightweight. For example, `serde` is an
-optional dependency because there is no feasible alternative. In service of
-this philosophy, currently, the only required dependency of `bstr` is `memchr`.
-
-
-### License
-
-This project is licensed under either of
-
- * Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or
-   https://www.apache.org/licenses/LICENSE-2.0)
- * MIT license ([LICENSE-MIT](LICENSE-MIT) or
- https://opensource.org/licenses/MIT)
-
-at your option.
-
-The data in `src/unicode/data/` is licensed under the Unicode License Agreement
-([LICENSE-UNICODE](https://www.unicode.org/copyright.html#License)), although
-this data is only used in tests.