[sanitise-file-name] / tests / misc.url_safe.sanitised
1 These_are_miscellaneous_tests_of_my_own_division.
2 _
3 hello_world
4 The_quick_brown_fox_jumps_over_the_lazy_doggerel.txt
5 well_____then.how__about_____this
6 Once_upon_a_time_there_was_a_file_name_sanitiser;_it_was_a_good_file_name_sanitiser,_and_never_exposed_security_vulnerabilities_to_the_World._“It’s_a_dangerous_place,”_its_grandf.“If_a_wolf_should_come_out_of_the_forest,_then_what_would_you_do_”
7 (Some_Peter_and_the_Wolf_snuck_in_there.)
8 .hidden
9 C__WINDOWS_system32_driver_etc_hosts
10 WINDIR__system32_driver_etc_hosts
11 Kinda_funny_how_Windows_has_a__etc_hosts.
12 _
13 _
14 _
15 _
16 _
17 _
18 I’m_basically_just_typing_random_stuff_here.
19 OK,_time_for_some_more_serious_stuff.
20 _
21 For_Unicode_paths,_some_file_systems_limit_paths_to_roughly_255_UTF-8_code_units,_others_to_roughly_255_UTF-16_code_units._UTF-8_is_the_tighter_of_these_restrictions_in_all_circumstances__UTF-16_uses_one_code_unit_until_U+FFF.Now_then__one-byte_characters
22 One-byte_characters
23 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345
24 12345678901234567890.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz
25 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678.abcdefghijklmnopqrstuvwxyz
26 Two-byte_characters
27 áɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠ
28 áɓç.°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³
29 áɓçđéƒɠ.°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³
30 áɓçđéƒɠɦïķá.°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³°¹²³
31 áɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓçđéƒɠɦïķáɓç.°¹²³
32 Three-byte_characters
33 ‐‑‒–—―‖‗‘’‚‛“”„‟†‡•‣․‥…‧‐‑‒–—―‖‗‘’‚‛“”„‟†‡•‣․‥…‧‐‑‒–—―‖‗‘’‚‛“”„‟†‡•‣․‥…‧‐‑‒–—―‖‗‘’‚‛“
34 ‐‑‒–—.₁₂₃₄₅₆₇₈₉₀₁₂₃₄₅₆₇₈₉₀₁₂₃₄₅₆₇₈₉₀₁₂₃₄₅₆₇₈₉₀₁₂₃₄₅₆₇₈₉₀₁₂₃₄₅₆₇₈₉₀₁₂₃₄₅₆₇₈₉₀₁₂₃₄₅₆₇₈₉
35 ‐‑‒–—―‖‗‘’‚‛“”„‟†‡•‣․‥…‧‐‑‒–—―‖‗‘’‚‛“”„‟†‡•‣․‥…‧‐‑‒–—―‖‗‘’‚‛“”„‟†‡•‣․‥…‧‐‑.₁₂₃₄₅₆₇₈₉₀
36 Four-byte_characters
37 𐀀𐀁𐀂𐀃𐀄𐀅𐀆𐀇𐀈𐀉𐀊𐀋𐀍𐀎𐀏𐀐𐀑𐀒𐀓𐀔𐀕𐀖𐀗𐀘𐀙𐀚𐀛𐀜𐀝𐀞𐀟𐀠𐀡𐀢𐀣𐀤𐀥𐀦𐀨𐀩𐀪𐀫𐀬𐀭𐀮𐀯𐀰𐀱𐀲𐀳𐀴𐀵𐀶𐀷𐀸𐀹𐀺𐀼𐀽𐀿𐁀𐁁𐁂
38 𐀀𐀁𐀂𐀃𐀄𐀅𐀆.𐂀𐂁𐂂𐂃𐂄𐂅𐂆𐂇𐂈𐂉𐂊𐂋𐂌𐂍𐂀𐂁𐂂𐂃𐂄𐂅𐂆𐂇𐂈𐂉𐂊𐂋𐂌𐂍𐂀𐂁𐂂𐂃𐂄𐂅𐂆𐂇𐂈𐂉𐂊𐂋𐂌𐂍𐂀𐂁𐂂𐂃𐂄𐂅𐂆𐂇𐂈𐂉𐂊𐂋𐂌𐂍
39 𐀀𐀁𐀂𐀃𐀄𐀅𐀆𐀇𐀈𐀉𐀊𐀋𐀍𐀎𐀏𐀐𐀑𐀒𐀓𐀔𐀕𐀖𐀗𐀘𐀙𐀚𐀛𐀜𐀝𐀞𐀟𐀠𐀡𐀢𐀣𐀤𐀥𐀦𐀨𐀩𐀪𐀫𐀬𐀭𐀮𐀯𐀰𐀱𐀲.𐂀𐂁𐂂𐂃𐂄𐂅𐂆𐂇𐂈𐂉𐂊𐂋𐂌𐂍
40 _
41 abcdef.ghij
42 abcde.fghij
43 AUX_.abcdef
44 lpT7_.abcdef
45 cOm6_.abcdef
46 CON_
47 aux_.h
48 Lpt1_.exe
49 xyz
50 nül
51 COM1.jpg.png
52 _
53 Some_sanitisers_try_stripping_out_ZWSP_(​),_which_can_be_used_as_a_fingerprinting_vector_and_has_no_particularly_legitimate_purpose_in_a_file_name;_I’m_not,_because_removing_it_doesn’t_solve_the_fingerprinting_risk,_as_you_can_use_ZWNJ_and_ZWJ_(.)