X-Git-Url: https://git.chrismorgan.info/crev-proofs/blobdiff_plain/792e69a0e48e99c08b83c491da1425cea99e0a99..HEAD:/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev diff --git a/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev b/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev index 15b5620..9a0bca2 100644 --- a/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev +++ b/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev @@ -70,3 +70,468 @@ comment: |- ymZLNHdzGamyaPo5JyBuSNUY398pyzpjsFMsup1knXtscSZumHP3ZgzNC8KPO7VCQ-23XnQxapenIX27x_DdDw ----- END CREV PROOF ----- +----- BEGIN CREV PROOF ----- +kind: package review +version: -1 +date: "2022-01-08T02:07:39.869637389+11:00" +from: + id-type: crev + id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU + url: "https://git.chrismorgan.info/crev-proofs" +package: + source: "https://crates.io" + name: human-string-filler + version: 1.0.0 + revision: a02907505ed1b65bb08d6a77c3b9676ab6c07d02 + digest: ZfyKwrp12UPlfOiQPJ9qhbKYeoXy5Il9oJgL_ePwUlA +review: + thoroughness: high + understanding: high + rating: strong +comment: |- + I wrote this crate, carefully. It’s robust, well documented, well tested, and + entirely devoid of panics. +----- SIGN CREV PROOF ----- +uGwLNFZ_p6EV3AXkJYjJle9N4joGx7LkwEMkdONuNWlolx0UWuz9GVM9IqQrBbmdMorpDYkYSPW1wDxJJBuaAA +----- END CREV PROOF ----- + +----- BEGIN CREV PROOF ----- +kind: package review +version: -1 +date: "2022-01-11T20:07:22.213361263+11:00" +from: + id-type: crev + id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU + url: "https://git.chrismorgan.info/crev-proofs" +package: + source: "https://crates.io" + name: sanitize-filename + version: 0.3.0 + revision: ece3260eb1f7b959125e5ca8f25781b31e875934 + digest: WpQbem5BHnp6FnMIwNny-oY5xqc_7W0x4anVu2N-gnw +review: + thoroughness: high + understanding: high + rating: neutral +alternatives: + - source: "https://crates.io" + name: sanitise-file-name +comment: |- + This crate is straightforward and simple in a fairly good way, but it’s not + as efficient as it could be, has a couple of small but significant bugs, + has some defaults and behaviours that I reckon aren’t ideal. + + The good: + + • The code is simple. The actual sanitisation code is 27 lines + (not including the regular expression definitions). + + • If nothing needs to be changed about the name (which is probably the most + common case), exactly one precisely-sized allocation is made. (This is + definitely a positive side-effect of using Regex::replace_all rather than + the obvious standard library alternative, str::replace, which would have + led to a minimum of three or five allocations.) + + The not-so-good (that is, things that aren’t ideal but aren’t real problems): + + • Unnecessary dependencies: regex and lazy_static are used, but I reckon + they’re not warranted. A character iterator and some push(char) or + push_str(replacement), plus some trim_end_matches() at the end in lieu of + WINDOWS_TRAILING_RE, would do the same job probably faster—though I will + admit that such a naive implementation would be prone to allocate slightly + more than this does when no replacements are made (though starting with + `String::with_capacity(name.len())` would nigh completely counter this), + and a smarter implementation that deferred allocation would be a bit longer + and not quite so “obviously correct”. + + Mitigating this claim of unnecessary dependencies and heaviness, regex’s + default features have been turned off, which is good. + + • It strips more than is necessary: U+0080–U+009F are not reserved by any + file system or operating system I know of, but this crate is stripping + them; and the names COM0 and LPT0 are not reserved by Windows, but this + crate is stripping them. + + • Poor style on Regex and RegexBuilder: ILLEGAL_RE includes colon twice; + and WINDOWS_RESERVED_RE sets case insensitivity in two ways, `(?i)` and + `RegexBuilder::case_insensitive(true)` (so it could have just used + `Regex::new`). + + • It unnecessarily enables the unicode-case feature (which is heavy) on regex: + In its favour it did disable the default features, but unicode-case isn’t + needed, because WINDOWS_RESERVED_RE only matches ASCII. It would be better + to change (?i) to (?i-u) and disable the feature again. + + • The defaults depend on your platform: the default value for + Options::windows is true on Windows and false elsewhere; varying things + like this is risky, and I think a poor choice in this particular case, + because you could be writing to a device subject to Windows’ limitations + (e.g. a local NTFS partition, a FAT32 or exFAT USB disk, a SMB network + share), and the difference in what Windows requires is so slight (assuming + the excessive stripping mentioned in the next point). + + • With Options::windows false, it strips much more than is necessary: + ILLEGAL_RE is still applied, and it corresponds to a mix of NTFS/FAT32 and + Windows limitations; other platforms pretty much only need to worry about ␀ + (0x00) and / (0x2F). + + • Truncation operates on the entire name, not the base name: this is a + flaw found in almost all libraries like this; although it’s not + intrinsically a security vulnerability, it contributes to the critical + security vulnerability where you thought you checked the extension of a + file was OK, but then you sanitised it and thte extension changed. But + quite apart from the security aspect, file extensions changing unexpectedly + is a usability hazard. + + • Documentation-wise, it’s kind of a black box: the README is fair, but + doesn’t explain in detail any of what’s done, and there are no doc comments + on anything. When it turns "aux.h" into "", you’ll say, “wait, *what*!?” if + you don’t know the deal. + + The bad (that is, it’ll cause actual problems): + + • It misses a character that Microsoft says isn’t allowed: U+007F is supposed + to be stripped, but isn’t. + + • WINDOWS_TRAILING_RE is wrong: it should be stripping trailing spaces and + dots (`[. ]+$`), but is instead a duplicate of RESERVED_RE (`^\.+$`) and so + does nothing. + + My opinions: + + • Options::truncate shouldn’t be a bool with the number 255 hard-coded, + because it’s common to want to append an extension afterwards (especially + if you know about the truncation hazard!). + + • I think "" is a mildly poor default for Options::replacement; I think "_" + would be a better choice, so that “1 < 0? A treatise” + becomes “1 _ 0_ A treatise” rather than “1 0 A treatise”. + + • The treatment of reserved names is poor: they’re entirely replaced with the + replacement string, so by default "aux.h" will become "" (eh, I suppose + that’s not too awful), and with an underscore "aux.h" will become "_" (uh + oh, it lost its extension). I think a better choice is to alter the base + name; I’ve seen a double underscore prefix and an underscore prefix, and I + chose an underscore suffix in my crate. + + • There should also be an option to remove leading dots, which hide files on + most platforms, and are fairly problematic under Windows. + + • “Sanitise” looks so much better than “sanitize”. 😛 + + My recommendation: use my sanitise-file-name crate instead of this. + My crate’s code is certainly longer-winded because of its performance focus, + but it’s more thorough in its processing, with better filtering and better + defaults, quite apart from the (unbenchmarked) performance thing. + + sanitise-file-name doesn’t cover *precisely* the same functionality as this + crate, but the differences are generally minor and fairly clearly described. + Here are the option translations: + + sanitize_filename::Options.truncate == true (the default) + sanitise_file_name::Options.length_limit == 255 (the default) + + sanitize_filename::Options.truncate == false + sanitise_file_name::Options.length_limit == usize::MAX + + sanitize_filename::Options.windows (default cfg!(windows)) + sanitise_file_name::Options.windows_safe (default true) + + sanitize_filename::Options.replacement == "" (the default) + sanitise_file_name::Options.replace_with == None (NOT the default) + + sanitize_filename::Options.replacement == "_" (any one character) + sanitise_file_name::Options.replace_with == Some('_') (the default) + + sanitize_filename::Options.replacement == "string" (longer than one character) + (not supported in sanitise-file-name) + + I haven’t factored the sanitize-filename binary into my reckoning here. + Some may find it useful, but I don’t think it will often be useful as a + standalone binary. And the binary’s usage string is found in the README, but + is not emitted by the binary itself, which is odd. Kinda fits in with not + documenting anything inside the code files! +----- SIGN CREV PROOF ----- +16uMNqMzp_nrGOi5JyJro5meslEjUV7coVZ5qxyobQ6So-H8DrkbjBSOntYC-1fP2VthjzzdfvvB__i6KakhCw +----- END CREV PROOF ----- + +----- BEGIN CREV PROOF ----- +kind: package review +version: -1 +date: "2022-01-12T05:37:47.367354687+11:00" +from: + id-type: crev + id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU + url: "https://git.chrismorgan.info/crev-proofs" +package: + source: "https://crates.io" + name: sanitize-filename-reader-friendly + version: 2.2.1 + revision: 245eca36c588cb70f907e2167ecee93b0fd54d33 + digest: f6Pot3fE2Kp4y5hrg6gx34D4YFt6oJGexGqMNwT2yko +review: + thoroughness: high + understanding: high + rating: negative +alternatives: + - source: "https://crates.io" + name: sanitise-file-name +comment: |- + This crate does approximately what it’s supposed to, but it’s overengineered + in a bad way (complicated for no good reason), confusingly and insufficiently + documented, contains some perplexing functionality that I can’t think of any + good reason for the existence of, is not as efficient as it could be, and + misses a couple of fairly significant things in a file name sanitiser that + mean that what it returns might still not work. + + I recommend my own crate sanitise-file-name instead, which is unfortunately + at least as complex, but for the good reasons of runtime efficiency and it + actually caring about extensions and correctness, and far better documented + and explained. + + The good: + + • It does what it says on the tin. The sanitisation it performs *is* fairly + reader-friendly (though its rules are more convoluted than I think is + warranted). + + The not-so-good: + + • There’s no name length limitation: this is actually a little subjective, + because implemented and used carelessly it leads to file extension change + attacks (which in the right environment could allow remote code execution), + but it’s generally expected. + + • It strips or replaces some characters it probably shouldn’t: + + • For example, U+200B is branded a non-printing character and removed. For + starters, there’s no such concept in Unicode; the closest is control + characters, and U+200B ZERO-WIDTH SPACE isn’t one of those. Removing it + serves no security purpose (unlike most of the Bidi_Control characters), + and will mangle some Unicode text where ZWSP is significant. You could + argue over it being a fingerprinting vector, but in that case why aren’t + ZWNJ and ZWJ filtered out, and after that how about confusables, &c. + So I say there’s just no good reason to remove U+200B. + + • It replaces various characters that no file system objects to, such as ~ + (replaced with _) and #, %, {, }, ^ and ` (replaced with space). + Most of those can be genuinely useful characters in file names. + + • Some of the rules are fairly arbitrary, and some excessive: the decisions + about what gets turned into a space, what gets turned into an underscore, + and what gets removed are idiosyncratic with no explanation, and a couple + of them seem a little odd. Not harmful in any way, just… why those + particular rules? FILTER_ORIG_AFTER_LAST_PROCESSED_WAS_WHITESPACE is + perhaps the most bizarre and troubling: drop a few characters, including + `.`, the extension separator, if they come after a space. Remember my + earlier point about length limitation potentially opening a security + vulnerability? Well, this is a similar sort of issue, though unlikely to be + exploitable, that you can change the extension significantly: + `foo.exe .jpg` becomes `foo.exe jpg`. + + I think it’s telling, philosophically, that the library completely ignores + extensions. It’s not worrying about security. (In the tests, only two or + three test cases have extensions, and those are ones copied from elsewhere.) + + The bad: + + • It’s inefficient: an absolute minimum of five allocations, and in practice + it’ll tend to be a few more. (It wouldn’t be *terribly* difficult to make + it perform exactly one allocation in all cases, like I did. A couple of the + allocations are almost trivially removable.) Also it uses str.find(char) to + check if characters can be removed, on strings as long as 10 chars/30 code + units, which is much less efficient than a match block and probably less + efficient even than using [char], while also writing boolean conditions in + a way that doesn’t take advantage of short-circuiting + (`STRING.find(c_orig).is_some() && last_processed_chr == '&'`, which would + be better flipped to do the cheap char comparison first). + + • It’s confusing with some surprising scope: it’s not sanitising simple + names, but rather some kind of thing where you can put parts on multiple + lines and it’ll sanitise them individually and join them up and I can’t + think of any serious reason why you’d actually want this—the potential + benefits are negligible, and it makes it much more confusing. + + • Its code style is inconsistent: useful comments occasionally, superfluous + comments sometimes but not other times, erroneous comments sometimes, a doc + comment with a spurious asterisk in the middle. + + • It overuses constants and its documentation is poorly prepared: it’s + written in an idiosyncratic private const-heavy style that makes reading + the code hard and doesn’t translate to rustdoc at all, so you have to read + the code to understand what’s actually going on, and the code’s overly + complicated, so it takes quite some time to figure out what’s going on, + constantly having to jump between the code and the consts as well. + Actually, it looks like the author *tried* to make it rustdoc-friendly by + pulling in const_format::concatcp! (clever compile-time magic) and exposing + a couple of public consts, but quite apart from them being incomplete (as + documented), it doesn’t even work because rustdoc doesn’t include the + values of the consts in its output. (In theory perhaps the constants could + be used by dependers as well, but I can’t imagine any genuine use case.) + + • It doesn’t do all it needs to: Windows reserved names are a TODO in the + code. I contributed the essence of a patch, but no action has been taken + after some weeks. (Actually, I started patching it further to make it more + efficient, then decided I just wanted to start from scratch, and so I made + sanitise-file-name.) + + • I have a problem with its entire philosphical approach, given the domain: + as noted in the previous section, length limitations, not even + acknowledging extensions are a thing and doing dangerous things in + consequence, it’s just not the right approach for sanitising file names, an + area where you should be caring about security. Ultimately, this is why I’m + giving a rating of negative rather than neutral as I originally intended. + + My recommendation: avoid this crate. It’s not awful by any means, but it’s + not well-written either, and my crate sanitise-file-name provides much the + same functionality, but better. Its default options are quite similar to this + crate’s, but more thorough. It doesn’t offer the strangne multi-line thing + this crate does. +----- SIGN CREV PROOF ----- +C30SLGF6ibx4QaLXo6uazHuIio1iP_rSmab7Vo6nLEeFZfuoaFTqiaMhPYflhvcGHEu6jImRBBIvmworq13bBg +----- END CREV PROOF ----- + +----- BEGIN CREV PROOF ----- +kind: package review +version: -1 +date: "2022-01-13T00:42:12.174512813+11:00" +from: + id-type: crev + id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU + url: "https://git.chrismorgan.info/crev-proofs" +package: + source: "https://crates.io" + name: cfg-if + version: 1.0.0 + revision: e60fa1efeab0ec6e90c50d93ec526e1410459c23 + digest: YmRVto2mw7U7GJhrOe13k_5Sgg9TcQWK1NuLN2Qiz5E +review: + thoroughness: high + understanding: high + rating: positive +comment: |- + This is a robust crate, clear, simple and well-documented within and without, + powered by a well-designed tt gobbler, and written by Alex Crichton of + exemplar reputation. + + I’m quite fond of the ``$(if … { … })else* else { … }`` trick, using the + keyword ``else`` as the repetition delimiter. I knew you could put a variety + of things there, not just characters like comma or semicolon, but I hadn’t + connected the dots that that would extended to keywords. Hey, you can even + use *string literals*! Crazy stuff. Anyway, that’s a neat way of keeping the + public API to two rules rather than three as I would probably have done. + + I found one surprise in this: ``cfg_if! { else { true } }`` works, yielding + true, because that repetition I just showed used ``*`` instead of ``+``. + It’s not covered by the test suite, but it’s harmless. + confirms it’s unintended. + + Since the 1.0.0 release, there have been [a few commits + ]. Most are minor + trivialities (yeah, I’ll even call Dependabot that, though I hate it and it’s + not even useful here because this crate has no dependencies), but + + is significant and very mildly concerning: as well as twiddling some + formatting (for the good), it changes that ``*`` to ``+`` to deny bare else, + which is a breaking change, for unlikely but reasonable and plausible code. + I’ve commented on #23 and #39 to this effect; we’ll see what comes of it. + (It could be reverted, it could be ignored, it could be released as 2.0.0.) + + There are a couple of extraneous files in the package (.gitignore, + .github/workflows/main.yml), but they’re small enough that I won’t complain. + + I think this crate is overused and most places it’s used would be better + served by writing the negations themselves, which are often no longer and + often clearer. +----- SIGN CREV PROOF ----- +ex_TABmVr_CTEY6B5C-NDKk7ciKdCK1bLf6ltjpXyoVknvyAC2AnRjA5rlPX_5ORQMeHmzJ2UxWUR0RdgBjYAQ +----- END CREV PROOF ----- + +----- BEGIN CREV PROOF ----- +kind: package review +version: -1 +date: "2022-01-14T02:22:52.039384447+11:00" +from: + id-type: crev + id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU + url: "https://git.chrismorgan.info/crev-proofs" +package: + source: "https://crates.io" + name: same-file + version: 1.0.6 + revision: 5799cd323b8eefd17a089c950dac113f66c89c9e + digest: Yf7qyYw3W3WoHK4t74epRoJkZehfhWh-Hh-R0LfB8UQ +review: + thoroughness: high + understanding: high + rating: neutral +comment: |- + The good: + + ✓ Decently tested, including CI coverage of the declared MSRV (1.34.0). (As + noted in src/lib.rs lines 485–491, edge case tests are difficult to + synthesise; Andrew doesn’t know how to create those environmental + conditions and I don’t really either. I suppose it’d entail reading file + system specs and creating suitable file systems from scratch, and mounting + them—not terribly difficult, but tedious work for limited gain.) + + ✓ The crate only supports its *functionality* on cfg(any(unix, windows)), but + still provides a sane fallback on cfg(not(any(unix, windows))) so that all + methods returns an explanatory I/O error. + + ✓ It’s unfortunate that false positives can arise, but given that it’s + unavoidable, it is pleasingly well-document. + + ✓ Cargo.toml excludes CI files from the package (keep things small!) + + The neutral: + + • The I/O error returned on unknown platforms is of kind Other. I look + forward to an MSRV bump to 1.53.0 so that Unsupported can be used. + + • The code has some cfg(target_os = "redox") that can be removed when the + MSRV reaches Rust 1.38.0 (that’s when redox was moved to the unix family, + in ). + + • Andrew uses his preferred Unlicense/MIT combo; [I have problems with + Unlicense ]. + + • src/win.rs lines 17–32 acknowledge a straightforwardly-remediable gap that, + if resolved, would by the sound of it eliminate false positives on at least + the Windows Server + ReFS combo. This is something that someone that cared + could contribute fairly easily, with access to Windows Server 2012 or + later. (Microsoft’s documentation of FILE_ID_INFO *says* it’s Server-only, + but given that ReFS is supported from Windows 8.1 onwards as well, I wonder + if it might be *de facto* available elsewhere too.) + + • Handle::{stdin, stdout, stderr} bypass std::io::{Stdin, Stdout, Stderr} + locking. I don’t much like this, but given that avoiding it would be very + painful to the API, and that locking isn’t a memory-safety matter, already + not being inviolable due to foreign code (though the std::io docs don’t + seem to mention that), so I think this was the pragmatic choice. + + The bad: + + ✗ The file size comparison that is documented in two places as being there on + Windows as a false positive avoidance heuristic is accidentally missing. + This was reported and acknowledged as an error on 2021-05-19 in + , but it still hasn’t + been reinstated or the comments removed. + + ✗ Handle::{dev, ino} are Unix-specific methods; it would be better not to + expose them on the type itself, but to move them to a platform-specific + extension trait to make it more explicit, e.g. HandleUnixExt. + + ✗ std::os::unix::io::IntoRawFd implementation panics on Handle::{stdin, + stdout, stderr}, due to careless Option handling (and the unwraps are + accompanied by comments assuring you that it won’t panic 🙁). + ManuallyDrop would be a better choice than Option here. Demonstration: + ``std::os::unix::io::IntoRawFd::into_raw_fd(same_file::Handle::stdin().unwrap())``. + Reported at https://github.com/BurntSushi/same-file/issues/55. + + I was going to rate it positive until I found this panic (the first two bads + don’t warrant a downgrade). Fix just that, and I’ll update it to positive. +----- SIGN CREV PROOF ----- +GlyWB-6rwNvY0qOppjx4AcRh1ta-btcpZ6OBaqPUg_Nr5WGCYWiNsfNWYENFHDLZpnPViTt7YaW78oPepAHhCA +----- END CREV PROOF ----- +