# unicode-bom
[![Build status](https://gitlab.com/philbooth/unicode-bom/badges/master/pipeline.svg)](https://gitlab.com/philbooth/unicode-bom/pipelines)
[![Crate status](https://img.shields.io/crates/v/unicode-bom.svg)](https://crates.io/crates/unicode_bom)
[![Downloads](https://img.shields.io/crates/d/unicode-bom.svg)](https://crates.io/crates/unicode_bom)
[![License](https://img.shields.io/crates/l/unicode-bom.svg)](https://www.apache.org/licenses/LICENSE-2.0)

[Unicode byte-order mark](https://en.wikipedia.org/wiki/Byte_order_mark) detection
for Rust projects.

* [What does it do?](#what-does-it-do)
* [What doesn't it do?](#what-doesnt-it-do)
* [How do I install it?](#how-do-i-install-it)
* [How do I use it?](#how-do-i-use-it)
* [How do I set up the build environment?](#how-do-i-set-up-the-build-environment)
* [Is there API documentation?](#is-there-api-documentation)
* [Is there a change log?](#is-there-a-change-log)
* [What license is it published under?](#what-license-is-it-published-under)
## What does it do?
`unicode-bom` will read
the first few bytes from
an array or a file on disk,
then determine whether
a byte-order mark is present.
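
To make the idea concrete, here is a minimal standard-library sketch of prefix-based BOM detection for the UTF-8, UTF-16 and UTF-32 marks. It is an illustration only, not the crate's implementation: `sniff_bom` and its return type are hypothetical names invented for this example, and the crate itself recognises more encodings than the five shown here.

```rust
/// Hypothetical helper, not part of `unicode-bom`: returns a label and the
/// BOM length in bytes for a few common encodings, or `None` if no mark matches.
fn sniff_bom(bytes: &[u8]) -> Option<(&'static str, usize)> {
    // Check the 4-byte marks first: the UTF-32 (little-endian) mark
    // begins with the same two bytes as the UTF-16 (little-endian) mark.
    if bytes.starts_with(&[0x00, 0x00, 0xFE, 0xFF]) {
        Some(("UTF-32 (big-endian)", 4))
    } else if bytes.starts_with(&[0xFF, 0xFE, 0x00, 0x00]) {
        Some(("UTF-32 (little-endian)", 4))
    } else if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some(("UTF-8", 3))
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some(("UTF-16 (big-endian)", 2))
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some(("UTF-16 (little-endian)", 2))
    } else {
        None
    }
}

fn main() {
    // A UTF-8 BOM followed by ASCII text.
    assert_eq!(sniff_bom(b"\xEF\xBB\xBFhello"), Some(("UTF-8", 3)));
    // No BOM at all.
    assert_eq!(sniff_bom(b"hello"), None);
}
```

The order of the checks matters: testing the 2-byte marks first would misreport UTF-32 (little-endian) data as UTF-16.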
## What doesn't it do?
It won't check the rest of the data
to determine whether it's actually valid
according to the indicated encoding.
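
If you need that validation, you have to do it yourself once the BOM has been identified. As a rough sketch for the UTF-8 case only, using nothing but the standard library: `decode_utf8_after_bom` is a hypothetical helper written for this example, not something the crate provides.

```rust
use std::str;

/// Hypothetical helper, not part of `unicode-bom`: skips a leading UTF-8 BOM
/// if one is present, then checks that the remaining bytes are valid UTF-8.
fn decode_utf8_after_bom(bytes: &[u8]) -> Result<&str, str::Utf8Error> {
    const UTF8_BOM: &[u8] = &[0xEF, 0xBB, 0xBF];
    // `strip_prefix` returns `None` when the BOM is absent,
    // in which case the whole input is validated.
    let body = bytes.strip_prefix(UTF8_BOM).unwrap_or(bytes);
    str::from_utf8(body)
}

fn main() {
    // A BOM followed by valid UTF-8 decodes to the text without the BOM.
    assert_eq!(decode_utf8_after_bom(b"\xEF\xBB\xBFhi"), Ok("hi"));
    // A BOM says nothing about the bytes after it: 0xFF is never valid UTF-8.
    assert!(decode_utf8_after_bom(b"\xEF\xBB\xBF\xFF").is_err());
}
```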
## How do I install it?
Add it to your dependencies
in `Cargo.toml`:
```toml
[dependencies]
unicode-bom = "1"
```
## How do I use it?
For more detailed information,
see the [API docs](https://philbooth.gitlab.io/unicode-bom/unicode_bom/),
but the gist
is as follows:
```rust
use unicode_bom::Bom;
// The BOM can be parsed from a file on disk via the `FromStr` trait...
let bom: Bom = "foo.txt".parse().unwrap();
match bom {
    Bom::Null => {
        // No BOM was detected
    }
    Bom::Bocu1 => {
        // BOCU-1 BOM was detected
    }
    Bom::Gb18030 => {
        // GB 18030 BOM was detected
    }
    Bom::Scsu => {
        // SCSU BOM was detected
    }
    Bom::UtfEbcdic => {
        // UTF-EBCDIC BOM was detected
    }
    Bom::Utf1 => {
        // UTF-1 BOM was detected
    }
    Bom::Utf7 => {
        // UTF-7 BOM was detected
    }
    Bom::Utf8 => {
        // UTF-8 BOM was detected
    }
    Bom::Utf16Be => {
        // UTF-16 (big-endian) BOM was detected
    }
    Bom::Utf16Le => {
        // UTF-16 (little-endian) BOM was detected
    }
    Bom::Utf32Be => {
        // UTF-32 (big-endian) BOM was detected
    }
    Bom::Utf32Le => {
        // UTF-32 (little-endian) BOM was detected
    }
}
// ...or you can detect the BOM in a byte array
let bytes = [0u8, 0u8, 0xfeu8, 0xffu8];
let bom = Bom::from(&bytes[0..]);
assert_eq!(bom, Bom::Utf32Be);
assert_eq!(bom.len(), 4);
```
## How do I set up the build environment?
If you don't already have Rust installed,
get that first using [`rustup`](https://rustup.rs/):
```
curl https://sh.rustup.rs -sSf | sh
```
Then you can build the project:
```
cargo b
```
And run the tests:
```
cargo t
```
## Is there API documentation?
[Yes](https://philbooth.gitlab.io/unicode-bom/unicode_bom/).
## Is there a change log?
[Yes](HISTORY.md).
## What license is it published under?
[Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0).