# utf8proc release history #

## Version 2.2 ##

2018-07-24

- Unicode 11 support ([#132] and [#140]).

- `utf8proc_NFKC_Casefold` convenience function for `NFKC_Casefold`
  normalization ([#133]).

- `UTF8PROC_STRIPNA` option to strip unassigned codepoints ([#133]).

- Support building static libraries on Windows (callers need to
  `#define UTF8PROC_STATIC`) ([#123]).

- `cmake` fix to avoid defining `UTF8PROC_EXPORTS` globally ([#121]).

- `toupper` of ß (U+00DF) now yields ẞ (U+1E9E) ([#134]), similar to musl;
  case-folding still yields the standard "ss" mapping.

- `utf8proc_charwidth` now returns `1` for U+00AD (soft hyphen) and
  for unassigned/PUA codepoints ([#135]).

## Version 2.1.1 ##

2018-04-27

- Fixed composition bug ([#128]).

- Minor build fixes ([#94], [#99], [#113], [#125]).

## Version 2.1 ##

2016-12-26:

- New functions `utf8proc_map_custom` and `utf8proc_decompose_custom`
  to allow user-supplied transformations of codepoints, in conjunction
  with other transformations ([#89]).

- New function `utf8proc_normalize_utf32` to apply normalizations
  directly to UTF-32 data (not just UTF-8) ([#88]).

- Fixed stack overflow that could occur due to incorrect definition
  of `UINT16_MAX` with some compilers ([#84]).

- Fixed conflict with `stdbool.h` in Visual Studio ([#90]).

- Updated font metrics to use Unifont 9.0.04.

## Version 2.0.2 ##

2016-07-27:

- Move `-Wmissing-prototypes` warning flag from `Makefile` to `.travis.yml`
  since MSVC does not understand this flag and it is occasionally useful to
  build using MSVC through the `Makefile` ([#79]).

- Use a different variable name for a nested loop in `bench/bench.c`, and
  declare it in a C89 way rather than inside the `for` to avoid "error:
  'for' loop initial declarations are only allowed in C99 mode" ([#80]).
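
The C89-compatible loop style described in the last item can be sketched as follows. The function `sum_matrix` is a hypothetical illustration and is not taken from `bench/bench.c`; it only shows loop counters declared at the top of the block, with a distinct name for the nested loop.

```c
#include <stddef.h>

/* Hypothetical helper (not from bench/bench.c): sums a rows x cols
   matrix stored in row-major order. */
long sum_matrix(const int *data, size_t rows, size_t cols)
{
    size_t i, j;    /* declared before any statement, as C89 requires */
    long total = 0;

    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {    /* distinct counter for the nested loop */
            total += data[i * cols + j];
        }
    }
    return total;
}
```

Writing `for (size_t i = 0; ...)` instead is what triggers the quoted error when a compiler runs in C89 mode.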

## Version 2.0.1 ##

2016-07-13:

- Bug fix in `utf8proc_grapheme_break_stateful` ([#77]).

- Tests now use versioned Unicode files, so they will no longer
  break when a new version of Unicode is released ([#78]).

## Version 2.0 ##

2016-07-13:

- Updated for Unicode 9.0 ([#70]).

- New `utf8proc_grapheme_break_stateful` to handle the complicated
  grapheme-breaking rules in Unicode 9.  The old `utf8proc_grapheme_break`
  is still provided, but may incorrectly identify grapheme breaks
  in some Unicode-9 sequences.

- Smaller Unicode tables ([#62], [#68]).  This required changes
  in the `utf8proc_property_t` structure, which breaks backward
  compatibility if you access this `struct` directly.  The
  functions in the API remain backward-compatible, however.

- Buffer overrun fix ([#66]).

## Version 1.3.1 ##

2015-11-02:

- Do not export symbol for internal function `unsafe_encode_char()` ([#55]).

- Install relative symbolic links for shared libraries ([#58]).

- Enable and fix compiler warnings ([#55], [#58]).

- Add missing files to `make clean` ([#58]).

## Version 1.3 ##

2015-07-06:

- Updated for Unicode 8.0 ([#45]).

- New `utf8proc_tolower` and `utf8proc_toupper` functions, portable
  replacements for `towlower` and `towupper` in the C library ([#40]).

- Don't treat Unicode "non-characters" as invalid, and improved
  validity checking in general ([#35]).

- Prefix all typedefs with `utf8proc_`, e.g. `utf8proc_int32_t`,
  to avoid collisions with other libraries ([#32]).

- Rename `DLLEXPORT` to `UTF8PROC_DLLEXPORT` to prevent collisions.

- Fix build breakage in the benchmark routines.

- More fine-grained Makefile variables (`PICFLAG` etcetera), so that
  compilation flags can be selectively overridden, and in particular
  so that `CFLAGS` can be changed without accidentally eliminating
  necessary flags like `-fPIC` and `-std=c99` ([#43]).

- Updated character-width tables based on Unifont 8.0.01 ([#51]) and
  the Unicode 8 character categories ([#47]).
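
As a rough illustration of the validity rule mentioned above, the sketch below accepts any code point in the Unicode range that is not a surrogate, so non-characters such as U+FFFE pass. The function name `codepoint_is_valid` is hypothetical, and the code is an assumption about the intent of the change, not utf8proc's own implementation.

```c
#include <stdint.h>

/* Hypothetical check (not utf8proc's code): returns 1 if c is a Unicode
   scalar value, 0 otherwise. Non-characters are deliberately accepted. */
int codepoint_is_valid(int32_t c)
{
    if (c < 0 || c > 0x10FFFF) {
        return 0;   /* outside the Unicode code space */
    }
    if (c >= 0xD800 && c <= 0xDFFF) {
        return 0;   /* UTF-16 surrogates are not scalar values */
    }
    return 1;       /* includes non-characters such as U+FFFE */
}
```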

## Version 1.2 ##

2015-03-28:

- Updated for Unicode 7.0 ([#6]).

- New function `utf8proc_grapheme_break(c1,c2)` that returns whether
  there is a grapheme break between `c1` and `c2` ([#20]).

- New function `utf8proc_charwidth(c)` that returns the number of
  column-positions that should be required for `c`; essentially a
  portable replacement for `wcwidth(c)` ([#27]).

- New function `utf8proc_category(c)` that returns the Unicode
  category of `c` (as one of the constants `UTF8PROC_CATEGORY_xx`).
  Also, a function `utf8proc_category_string(c)` that returns the Unicode
  category of `c` as a two-character string.

- `cmake` script `CMakeLists.txt`, in addition to `Makefile`, for
  easier compilation on Windows ([#28]).

- Various `Makefile` improvements: a `make check` target to perform
  tests ([#13]), `make install`, a rule to automate updating the Unicode
  tables, etcetera.

- The shared library is now versioned (e.g. has a soname on GNU/Linux) ([#24]).

- C++/MSVC compatibility ([#17]).

- Most `#defined` constants are now `enums` ([#29]).

- New preprocessor constants `UTF8PROC_VERSION_MAJOR`,
  `UTF8PROC_VERSION_MINOR`, and `UTF8PROC_VERSION_PATCH` for compile-time
  detection of the API version.

- Doxygen-formatted documentation ([#29]).

- The Ruby and PostgreSQL plugins have been removed due to lack of testing ([#22]).

## Version 1.1.6 ##

2013-11-27:

- PostgreSQL 9.2 and 9.3 compatibility (lowercase `c` language name)

## Version 1.1.5 ##

2009-08-20:

- Use `RSTRING_PTR()` and `RSTRING_LEN()` instead of `RSTRING()->ptr` and
  `RSTRING()->len` for ruby1.9 compatibility (and `#define` them, if not
  existent)

2009-10-02:

- Patches for compatibility with Microsoft Visual Studio

2009-10-08:

- Fixes to make utf8proc usable in C++ programs

2009-10-16:

## Version 1.1.4 ##

2009-06-14:

- replaced C++ style comments for compatibility reasons
- added typecasts to suppress compiler warnings
- removed redundant source files for ruby-gemfile generation

2009-08-19:

- Changed copyright notice for Public Software Group e. V.
- Minor changes in the `README` file

## Version 1.1.3 ##

2008-10-04:

- Added a function `utf8proc_version` returning a string containing the version
  number of the library.
- Included a target `libutf8proc.dylib` for MacOSX.

2009-05-01:

- PostgreSQL 8.3 compatibility (use of `SET_VARSIZE` macro)

## Version 1.1.2 ##

2007-07-25:

- Fixed a serious bug in the data file generator, which caused characters
  to be treated incorrectly when stripping default ignorable characters or
  calculating grapheme cluster boundaries.

## Version 1.1.1 ##

2007-06-25:

- Added a new PostgreSQL function `unistrip`, which behaves like `unifold`,
  but also removes all character marks (e.g. accents).

2007-07-22:

- Changed license from BSD to MIT style.
- Added a new function `utf8proc_codepoint_valid` to the C library.
- Changed compiler flags in `Makefile` from `-g -O0` to `-O2`
- The ruby script, which was used to build the `utf8proc_data.c` file, is now
  included in the distribution.

## Version 1.0.3 ##

2007-03-16:

- Fixed a bug in the ruby library, which caused an error when splitting an
  empty string at grapheme cluster boundaries (method `String#utf8chars`).

## Version 1.0.2 ##

2006-09-21:

- included a check in `Integer#utf8`, which raises an exception if the given
  code point is invalid because it is too high (this check was previously missing)

2006-12-26:

- added support for PostgreSQL version 8.2

## Version 1.0.1 ##

2006-09-20:

- included a gem file for the ruby version of the library

Release of version 1.0.1

## Version 1.0 ##

2006-09-17:

- added the `LUMP` option, which lumps certain characters together (see `lump.md`) (also used for the PostgreSQL `unifold` function)
- added the `STRIPMARK` option, which strips marking characters (or marks of composed characters)
- deprecated ruby method `String#char_ary` in favour of `String#utf8chars`

## Version 0.3 ##

2006-07-18:

- changed normalization from NFC to NFKC for postgresql unifold function

2006-08-04:

- added support to mark the beginning of a grapheme cluster with 0xFF (option: `CHARBOUND`)
- added the ruby method `String#chars`, which returns an array of UTF-8 encoded grapheme clusters
- added `NLF2LF` transformation in postgresql `unifold` function
- added the `DECOMPOSE` option; if you use neither `COMPOSE` nor `DECOMPOSE`, no normalization will be performed (different from previous versions)
- using integer constants rather than C-strings for character properties
- fixed (hopefully) a problem with the ruby library on Mac OS X, which occurred when compiler optimization was switched on
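
Because the byte 0xFF never occurs in well-formed UTF-8, it can serve as an unambiguous marker in `CHARBOUND` output. The sketch below assumes a buffer that has already been produced with that option, and counts the grapheme clusters by counting marker bytes; `count_clusters` is a hypothetical helper, not part of the library.

```c
#include <stddef.h>

/* Hypothetical helper (not part of utf8proc): counts grapheme clusters in
   a buffer where each cluster is assumed to start with a 0xFF marker. */
size_t count_clusters(const unsigned char *buf, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        if (buf[i] == 0xFF) {   /* marker byte: a new cluster begins here */
            n++;
        }
    }
    return n;
}
```

For example, the bytes `FF 61 FF 65 CC 81` ("a" followed by "e" with a combining acute accent) contain two markers and therefore two clusters.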

## Version 0.2 ##

2006-06-05:

- changed behaviour of PostgreSQL function to return NULL in case of invalid input, rather than raising an exceptional condition
- improved efficiency of PostgreSQL function (no transformation to C string is done)

2006-06-20:

- added -fpic compiler flag in Makefile
- fixed bug in the C code for the ruby library (usage of non-existent function)

## Version 0.1 ##

2006-06-02: initial release of version 0.1

[#6]: https://github.com/JuliaLang/utf8proc/issues/6
[#13]: https://github.com/JuliaLang/utf8proc/issues/13
[#17]: https://github.com/JuliaLang/utf8proc/issues/17
[#20]: https://github.com/JuliaLang/utf8proc/issues/20
[#22]: https://github.com/JuliaLang/utf8proc/issues/22
[#24]: https://github.com/JuliaLang/utf8proc/issues/24
[#27]: https://github.com/JuliaLang/utf8proc/issues/27
[#28]: https://github.com/JuliaLang/utf8proc/issues/28
[#29]: https://github.com/JuliaLang/utf8proc/issues/29
[#32]: https://github.com/JuliaLang/utf8proc/issues/32
[#35]: https://github.com/JuliaLang/utf8proc/issues/35
[#40]: https://github.com/JuliaLang/utf8proc/issues/40
[#43]: https://github.com/JuliaLang/utf8proc/issues/43
[#45]: https://github.com/JuliaLang/utf8proc/issues/45
[#47]: https://github.com/JuliaLang/utf8proc/issues/47
[#51]: https://github.com/JuliaLang/utf8proc/issues/51
[#55]: https://github.com/JuliaLang/utf8proc/issues/55
[#58]: https://github.com/JuliaLang/utf8proc/issues/58
[#62]: https://github.com/JuliaLang/utf8proc/issues/62
[#66]: https://github.com/JuliaLang/utf8proc/issues/66
[#68]: https://github.com/JuliaLang/utf8proc/issues/68
[#70]: https://github.com/JuliaLang/utf8proc/issues/70
[#77]: https://github.com/JuliaLang/utf8proc/issues/77
[#78]: https://github.com/JuliaLang/utf8proc/issues/78
[#79]: https://github.com/JuliaLang/utf8proc/issues/79
[#80]: https://github.com/JuliaLang/utf8proc/issues/80
[#84]: https://github.com/JuliaLang/utf8proc/issues/84
[#88]: https://github.com/JuliaLang/utf8proc/issues/88
[#89]: https://github.com/JuliaLang/utf8proc/issues/89
[#90]: https://github.com/JuliaLang/utf8proc/issues/90
[#94]: https://github.com/JuliaLang/utf8proc/issues/94
[#99]: https://github.com/JuliaLang/utf8proc/issues/99
[#113]: https://github.com/JuliaLang/utf8proc/issues/113
[#121]: https://github.com/JuliaLang/utf8proc/issues/121
[#123]: https://github.com/JuliaLang/utf8proc/issues/123
[#125]: https://github.com/JuliaLang/utf8proc/issues/125
[#128]: https://github.com/JuliaLang/utf8proc/issues/128
[#132]: https://github.com/JuliaLang/utf8proc/issues/132
[#133]: https://github.com/JuliaLang/utf8proc/issues/133
[#134]: https://github.com/JuliaLang/utf8proc/issues/134
[#135]: https://github.com/JuliaLang/utf8proc/issues/135
[#140]: https://github.com/JuliaLang/utf8proc/issues/140