X-Git-Url: https://git.chrismorgan.info/crev-proofs/blobdiff_plain/7d768836b524bc5aab790a895329826c59c45c4e..07af1706aaa5c46133a6b7b252f524c0c33979df:/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev?ds=sidebyside diff --git a/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev b/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev index 651a5e5..af932fd 100644 --- a/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev +++ b/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev @@ -28,3 +28,230 @@ comment: |- LJ2SYWIPHSNC49lmO1Bd4hGDAvT8lm-I4heOGrtdZPIRcDoOw2e9P4YxtKT_72r1Qwm3UfnpkCJqtM1UBkfkCw ----- END CREV PROOF ----- +----- BEGIN CREV PROOF ----- +kind: package review +version: -1 +date: "2022-01-08T02:06:31.277120223+11:00" +from: + id-type: crev + id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU + url: "https://git.chrismorgan.info/crev-proofs" +package: + source: "https://crates.io" + name: sanitise-file-name + version: 1.0.0 + revision: 2f75e28d1f5d097ae5931fd05dd1b4fe39ec8c08 + digest: egqrezU_0kaH9V4t1GOLIZe9W2aKoGc-isplQ-9l3d0 +review: + thoroughness: high + understanding: high + rating: strong +comment: |- + I wrote this crate, with great care. (Sometimes it needed it!) + + It has unfortunately ended up with rather a lot of code, but a fair bit of + that is related to Doing Things Right. The calculating of *exactly* how many + bytes may be needed is painfully involved (again a part of Doing Things + Right, especially as regards extension cleverness), but I’ve reviewed it + multiple times on different days and each time convinced myself that it is in + fact correct. + + I believe that there are only two ways of causing this to panic, both easily + avoided by following instructions, and otherwise harmless. Firstly, giving + `sanitise_to` a `tinyvec_string::ArrayString` that doesn’t have enough space + remaining. That’s definite careless user error: `max_alloc_size_const` was + right there and clearly identified in the docs with explanation about the + whys and wherefores. Secondly, a slight variant of that: implementing + `Stringy` on some other type in a panicky fashion. + + (Writing this review led me to add a mention of the possible panic to + sanitise_to’s docs, but I don’t think it warrants a 1.0.1 release.) +----- SIGN CREV PROOF ----- +ymZLNHdzGamyaPo5JyBuSNUY398pyzpjsFMsup1knXtscSZumHP3ZgzNC8KPO7VCQ-23XnQxapenIX27x_DdDw +----- END CREV PROOF ----- + +----- BEGIN CREV PROOF ----- +kind: package review +version: -1 +date: "2022-01-08T02:07:39.869637389+11:00" +from: + id-type: crev + id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU + url: "https://git.chrismorgan.info/crev-proofs" +package: + source: "https://crates.io" + name: human-string-filler + version: 1.0.0 + revision: a02907505ed1b65bb08d6a77c3b9676ab6c07d02 + digest: ZfyKwrp12UPlfOiQPJ9qhbKYeoXy5Il9oJgL_ePwUlA +review: + thoroughness: high + understanding: high + rating: strong +comment: |- + I wrote this crate, carefully. It’s robust, well documented, well tested, and + entirely devoid of panics. +----- SIGN CREV PROOF ----- +uGwLNFZ_p6EV3AXkJYjJle9N4joGx7LkwEMkdONuNWlolx0UWuz9GVM9IqQrBbmdMorpDYkYSPW1wDxJJBuaAA +----- END CREV PROOF ----- + +----- BEGIN CREV PROOF ----- +kind: package review +version: -1 +date: "2022-01-11T20:07:22.213361263+11:00" +from: + id-type: crev + id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU + url: "https://git.chrismorgan.info/crev-proofs" +package: + source: "https://crates.io" + name: sanitize-filename + version: 0.3.0 + revision: ece3260eb1f7b959125e5ca8f25781b31e875934 + digest: WpQbem5BHnp6FnMIwNny-oY5xqc_7W0x4anVu2N-gnw +review: + thoroughness: high + understanding: high + rating: neutral +alternatives: + - source: "https://crates.io" + name: sanitise-file-name +comment: |- + This crate is straightforward and simple in a fairly good way, but it’s not + as efficient as it could be, has a couple of small but significant bugs, + has some defaults and behaviours that I reckon aren’t ideal. + + The good: + + • The code is simple. The actual sanitisation code is 27 lines + (not including the regular expression definitions). + + • If nothing needs to be changed about the name (which is probably the most + common case), exactly one precisely-sized allocation is made. (This is + definitely a positive side-effect of using Regex::replace_all rather than + the obvious standard library alternative, str::replace, which would have + led to a minimum of three or five allocations.) + + The not-so-good (that is, things that aren’t ideal but aren’t real problems): + + • Unnecessary dependencies: regex and lazy_static are used, but I reckon + they’re not warranted. A character iterator and some push(char) or + push_str(replacement), plus some trim_end_matches() at the end in lieu of + WINDOWS_TRAILING_RE, would do the same job probably faster—though I will + admit that such a naive implementation would be prone to allocate slightly + more than this does when no replacements are made (though starting with + `String::with_capacity(name.len())` would nigh completely counter this), + and a smarter implementation that deferred allocation would be a bit longer + and not quite so “obviously correct”. + + Mitigating this claim of unnecessary dependencies and heaviness, regex’s + default features have been turned off, which is good. + + • It strips more than is necessary: U+0080–U+009F are not reserved by any + file system or operating system I know of, but this crate is stripping + them; and the names COM0 and LPT0 are not reserved by Windows, but this + crate is stripping them. + + • Poor style on Regex and RegexBuilder: ILLEGAL_RE includes colon twice; + and WINDOWS_RESERVED_RE sets case insensitivity in two ways, `(?i)` and + `RegexBuilder::case_insensitive(true)` (so it could have just used + `Regex::new`). + + • It unnecessarily enables the unicode-case feature (which is heavy) on regex: + In its favour it did disable the default features, but unicode-case isn’t + needed, because WINDOWS_RESERVED_RE only matches ASCII. It would be better + to change (?i) to (?i-u) and disable the feature again. + + • The defaults depend on your platform: the default value for + Options::windows is true on Windows and false elsewhere; varying things + like this is risky, and I think a poor choice in this particular case, + because you could be writing to a device subject to Windows’ limitations + (e.g. a local NTFS partition, a FAT32 or exFAT USB disk, a SMB network + share), and the difference in what Windows requires is so slight (assuming + the excessive stripping mentioned in the next point). + + • With Options::windows false, it strips much more than is necessary: + ILLEGAL_RE is still applied, and it corresponds to a mix of NTFS/FAT32 and + Windows limitations; other platforms pretty much only need to worry about ␀ + (0x00) and / (0x2F). + + • Truncation operates on the entire name, not the base name: this is a + flaw found in almost all libraries like this; although it’s not + intrinsically a security vulnerability, it contributes to the critical + security vulnerability where you thought you checked the extension of a + file was OK, but then you sanitised it and thte extension changed. But + quite apart from the security aspect, file extensions changing unexpectedly + is a usability hazard. + + • Documentation-wise, it’s kind of a black box: the README is fair, but + doesn’t explain in detail any of what’s done, and there are no doc comments + on anything. When it turns "aux.h" into "", you’ll say, “wait, *what*!?” if + you don’t know the deal. + + The bad (that is, it’ll cause actual problems): + + • It misses a character that Microsoft says isn’t allowed: U+007F is supposed + to be stripped, but isn’t. + + • WINDOWS_TRAILING_RE is wrong: it should be stripping trailing spaces and + dots (`[. ]+$`), but is instead a duplicate of RESERVED_RE (`^\.+$`) and so + does nothing. + + My opinions: + + • Options::truncate shouldn’t be a bool with the number 255 hard-coded, + because it’s common to want to append an extension afterwards (especially + if you know about the truncation hazard!). + + • I think "" is a mildly poor default for Options::replacement; I think "_" + would be a better choice, so that “1 < 0? A treatise” + becomes “1 _ 0_ A treatise” rather than “1 0 A treatise”. + + • The treatment of reserved names is poor: they’re entirely replaced with the + replacement string, so by default "aux.h" will become "" (eh, I suppose + that’s not too awful), and with an underscore "aux.h" will become "_" (uh + oh, it lost its extension). I think a better choice is to alter the base + name; I’ve seen a double underscore prefix and an underscore prefix, and I + chose an underscore suffix in my crate. + + • There should also be an option to remove leading dots, which hide files on + most platforms, and are fairly problematic under Windows. + + • “Sanitise” looks so much better than “sanitize”. 😛 + + My recommendation: use my sanitise-file-name crate instead of this. + My crate’s code is certainly longer-winded because of its performance focus, + but it’s more thorough in its processing, with better filtering and better + defaults, quite apart from the (unbenchmarked) performance thing. + + sanitise-file-name doesn’t cover *precisely* the same functionality as this + crate, but the differences are generally minor and fairly clearly described. + Here are the option translations: + + sanitize_filename::Options.truncate == true (the default) + sanitise_file_name::Options.length_limit == 255 (the default) + + sanitize_filename::Options.truncate == false + sanitise_file_name::Options.length_limit == usize::MAX + + sanitize_filename::Options.windows (default cfg!(windows)) + sanitise_file_name::Options.windows_safe (default true) + + sanitize_filename::Options.replacement == "" (the default) + sanitise_file_name::Options.replace_with == None (NOT the default) + + sanitize_filename::Options.replacement == "_" (any one character) + sanitise_file_name::Options.replace_with == Some('_') (the default) + + sanitize_filename::Options.replacement == "string" (longer than one character) + (not supported in sanitise-file-name) + + I haven’t factored the sanitize-filename binary into my reckoning here. + Some may find it useful, but I don’t think it will often be useful as a + standalone binary. And the binary’s usage string is found in the README, but + is not emitted by the binary itself, which is odd. Kinda fits in with not + documenting anything inside the code files! +----- SIGN CREV PROOF ----- +16uMNqMzp_nrGOi5JyJro5meslEjUV7coVZ5qxyobQ6So-H8DrkbjBSOntYC-1fP2VthjzzdfvvB__i6KakhCw +----- END CREV PROOF ----- +