Generating microbench data
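The data files in this directory (TestNames_*.txt, TestRandomWordsUDHR_*.txt, and wotw.txt) are reduced versions of larger files. The full versions are located in another part of the repository.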
Sanitizing the file
sed -i '/^#/d' "${filename}"
sed -i '/^$/d' "${filename}"
Shuffling the file
shuf -n 20 "${filename}" -o "${filename}"
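The sanitizing and shuffling steps can also be combined into one function. This is only a sketch: `make_microbench_sample` is a hypothetical helper name that does not exist in the repository, and it assumes GNU `sed` and `shuf` are available. The default of 20 lines mirrors the `shuf` command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): strip comment and blank
# lines from a data file, then keep a random sample of the remaining lines.
make_microbench_sample() {
    local filename="$1"
    local count="${2:-20}"   # 20 matches the shuf command in this README
    local tmp
    tmp="$(mktemp)" || return 1

    # Delete lines starting with '#' and empty lines, then let shuf emit
    # at most $count of the remaining lines in random order.
    if sed -e '/^#/d' -e '/^$/d' "$filename" | shuf -n "$count" > "$tmp"; then
        # Write back through redirection so the file keeps its permissions.
        cat "$tmp" > "$filename"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

After sourcing the snippet, it could be called as `make_microbench_sample TestNames_Latin.txt`. Writing to a temporary file first means the original is only overwritten once sampling has succeeded.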
Adding back the header (if you plan on submitting the files)
# This file is part of ICU4X. For terms of use, please see the file
# called LICENSE at the top level of the ICU4X source tree
# (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).
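One possible way to put the header back is to prepend it with a here-document. This is a sketch, not a script from the repository: `add_license_header` is a hypothetical name, and it assumes the header should sit directly above the data with no blank line in between.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): prepend the ICU4X
# license header shown above to a data file.
add_license_header() {
    local filename="$1"
    local tmp
    tmp="$(mktemp)" || return 1

    # Write the header followed by the existing contents to a temporary
    # file, then copy the result back over the original.
    {
        cat <<'EOF'
# This file is part of ICU4X. For terms of use, please see the file
# called LICENSE at the top level of the ICU4X source tree
# (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).
EOF
        cat "$filename"
    } > "$tmp" && cat "$tmp" > "$filename"
    local status=$?
    rm -f "$tmp"
    return "$status"
}
```

Because the sanitizing step deletes every line that starts with `#`, the header has to be added after sanitizing and shuffling, not before.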