X-Git-Url: https://git.chrismorgan.info/crev-proofs/blobdiff_plain/792e69a0e48e99c08b83c491da1425cea99e0a99..HEAD:/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev

diff --git a/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev b/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev
index 15b5620..9a0bca2 100644
--- a/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev
+++ b/QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU/reviews/2022-01-package-jqCnIw.proof.crev
@@ -70,3 +70,468 @@ comment: |-
 ymZLNHdzGamyaPo5JyBuSNUY398pyzpjsFMsup1knXtscSZumHP3ZgzNC8KPO7VCQ-23XnQxapenIX27x_DdDw
 ----- END CREV PROOF -----
 
+----- BEGIN CREV PROOF -----
+kind: package review
+version: -1
+date: "2022-01-08T02:07:39.869637389+11:00"
+from:
+  id-type: crev
+  id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU
+  url: "https://git.chrismorgan.info/crev-proofs"
+package:
+  source: "https://crates.io"
+  name: human-string-filler
+  version: 1.0.0
+  revision: a02907505ed1b65bb08d6a77c3b9676ab6c07d02
+  digest: ZfyKwrp12UPlfOiQPJ9qhbKYeoXy5Il9oJgL_ePwUlA
+review:
+  thoroughness: high
+  understanding: high
+  rating: strong
+comment: |-
+  I wrote this crate, carefully. Itâs robust, well documented, well tested, and
+  entirely devoid of panics.
+----- SIGN CREV PROOF -----
+uGwLNFZ_p6EV3AXkJYjJle9N4joGx7LkwEMkdONuNWlolx0UWuz9GVM9IqQrBbmdMorpDYkYSPW1wDxJJBuaAA
+----- END CREV PROOF -----
+
+----- BEGIN CREV PROOF -----
+kind: package review
+version: -1
+date: "2022-01-11T20:07:22.213361263+11:00"
+from:
+  id-type: crev
+  id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU
+  url: "https://git.chrismorgan.info/crev-proofs"
+package:
+  source: "https://crates.io"
+  name: sanitize-filename
+  version: 0.3.0
+  revision: ece3260eb1f7b959125e5ca8f25781b31e875934
+  digest: WpQbem5BHnp6FnMIwNny-oY5xqc_7W0x4anVu2N-gnw
+review:
+  thoroughness: high
+  understanding: high
+  rating: neutral
+alternatives:
+  - source: "https://crates.io"
+    name: sanitise-file-name
+comment: |-
+  This crate is straightforward and simple in a fairly good way, but itâs not
+  as efficient as it could be, has a couple of small but significant bugs,
+  has some defaults and behaviours that I reckon arenât ideal.
+  
+  The good:
+  
+  â¢ The code is simple. The actual sanitisation code is 27 lines
+    (not including the regular expression definitions).
+  
+  â¢ If nothing needs to be changed about the name (which is probably the most
+    common case), exactly one precisely-sized allocation is made. (This is
+    definitely a positive side-effect of using Regex::replace_all rather than
+    the obvious standard library alternative, str::replace, which would have
+    led to a minimum of three or five allocations.)
+  
+  The not-so-good (that is, things that arenât ideal but arenât real problems):
+  
+  â¢ Unnecessary dependencies: regex and lazy_static are used, but I reckon
+    theyâre not warranted. A character iterator and some push(char) or
+    push_str(replacement), plus some trim_end_matches() at the end in lieu of
+    WINDOWS_TRAILING_RE, would do the same job probably fasterâthough I will
+    admit that such a naive implementation would be prone to allocate slightly
+    more than this does when no replacements are made (though starting with
+    `String::with_capacity(name.len())` would nigh completely counter this),
+    and a smarter implementation that deferred allocation would be a bit longer
+    and not quite so âobviously correctâ.
+  
+    Mitigating this claim of unnecessary dependencies and heaviness, regexâs
+    default features have been turned off, which is good.
+  
+  â¢ It strips more than is necessary: U+0080âU+009F are not reserved by any
+    file system or operating system I know of, but this crate is stripping
+    them; and the names COM0 and LPT0 are not reserved by Windows, but this
+    crate is stripping them.
+  
+  â¢ Poor style on Regex and RegexBuilder: ILLEGAL_RE includes colon twice;
+    and WINDOWS_RESERVED_RE sets case insensitivity in two ways, `(?i)` and
+    `RegexBuilder::case_insensitive(true)` (so it could have just used
+    `Regex::new`).
+  
+  â¢ It unnecessarily enables the unicode-case feature (which is heavy) on regex:
+    In its favour it did disable the default features, but unicode-case isnât
+    needed, because WINDOWS_RESERVED_RE only matches ASCII. It would be better
+    to change (?i) to (?i-u) and disable the feature again.
+  
+  â¢ The defaults depend on your platform: the default value for
+    Options::windows is true on Windows and false elsewhere; varying things
+    like this is risky, and I think a poor choice in this particular case,
+    because you could be writing to a device subject to Windowsâ limitations
+    (e.g. a local NTFS partition, a FAT32 or exFAT USB disk, a SMB network
+    share), and the difference in what Windows requires is so slight (assuming
+    the excessive stripping mentioned in the next point).
+  
+  â¢ With Options::windows false, it strips much more than is necessary:
+    ILLEGAL_RE is still applied, and it corresponds to a mix of NTFS/FAT32 and
+    Windows limitations; other platforms pretty much only need to worry about â
+    (0x00) and / (0x2F).
+  
+  â¢ Truncation operates on the entire name, not the base name: this is a
+    flaw found in almost all libraries like this; although itâs not
+    intrinsically a security vulnerability, it contributes to the critical
+    security vulnerability where you thought you checked the extension of a
+    file was OK, but then you sanitised it and thte extension changed. But
+    quite apart from the security aspect, file extensions changing unexpectedly
+    is a usability hazard.
+  
+  â¢ Documentation-wise, itâs kind of a black box: the README is fair, but
+    doesnât explain in detail any of whatâs done, and there are no doc comments
+    on anything. When it turns "aux.h" into "", youâll say, âwait, *what*!?â if
+    you donât know the deal.
+  
+  The bad (that is, itâll cause actual problems):
+  
+  â¢ It misses a character that Microsoft says isnât allowed: U+007F is supposed
+    to be stripped, but isnât.
+  
+  â¢ WINDOWS_TRAILING_RE is wrong: it should be stripping trailing spaces and
+    dots (`[. ]+$`), but is instead a duplicate of RESERVED_RE (`^\.+$`) and so
+    does nothing.
+  
+  My opinions:
+  
+  â¢ Options::truncate shouldnât be a bool with the number 255 hard-coded,
+    because itâs common to want to append an extension afterwards (especially
+    if you know about the truncation hazard!).
+  
+  â¢ I think "" is a mildly poor default for Options::replacement; I think "_"
+    would be a better choice, so that â1 < 0? A treatiseâ
+    becomes â1 _ 0_ A treatiseâ rather than â1  0 A treatiseâ.
+  
+  â¢ The treatment of reserved names is poor: theyâre entirely replaced with the
+    replacement string, so by default "aux.h" will become "" (eh, I suppose
+    thatâs not too awful), and with an underscore "aux.h" will become "_" (uh
+    oh, it lost its extension). I think a better choice is to alter the base
+    name; Iâve seen a double underscore prefix and an underscore prefix, and I
+    chose an underscore suffix in my crate.
+  
+  â¢ There should also be an option to remove leading dots, which hide files on
+    most platforms, and are fairly problematic under Windows.
+  
+  â¢ âSanitiseâ looks so much better than âsanitizeâ. ð
+  
+  My recommendation: use my sanitise-file-name crate instead of this.
+  My crateâs code is certainly longer-winded because of its performance focus,
+  but itâs more thorough in its processing, with better filtering and better
+  defaults, quite apart from the (unbenchmarked) performance thing.
+  
+  sanitise-file-name doesnât cover *precisely* the same functionality as this
+  crate, but the differences are generally minor and fairly clearly described.
+  Here are the option translations:
+  
+    sanitize_filename::Options.truncate == true (the default)
+    sanitise_file_name::Options.length_limit == 255 (the default)
+  
+    sanitize_filename::Options.truncate == false
+    sanitise_file_name::Options.length_limit == usize::MAX
+  
+    sanitize_filename::Options.windows (default cfg!(windows))
+    sanitise_file_name::Options.windows_safe (default true)
+  
+    sanitize_filename::Options.replacement == "" (the default)
+    sanitise_file_name::Options.replace_with == None (NOT the default)
+  
+    sanitize_filename::Options.replacement == "_" (any one character)
+    sanitise_file_name::Options.replace_with == Some('_') (the default)
+  
+    sanitize_filename::Options.replacement == "string" (longer than one character)
+    (not supported in sanitise-file-name)
+  
+  I havenât factored the sanitize-filename binary into my reckoning here.
+  Some may find it useful, but I donât think it will often be useful as a
+  standalone binary. And the binaryâs usage string is found in the README, but
+  is not emitted by the binary itself, which is odd. Kinda fits in with not
+  documenting anything inside the code files!
+----- SIGN CREV PROOF -----
+16uMNqMzp_nrGOi5JyJro5meslEjUV7coVZ5qxyobQ6So-H8DrkbjBSOntYC-1fP2VthjzzdfvvB__i6KakhCw
+----- END CREV PROOF -----
+
+----- BEGIN CREV PROOF -----
+kind: package review
+version: -1
+date: "2022-01-12T05:37:47.367354687+11:00"
+from:
+  id-type: crev
+  id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU
+  url: "https://git.chrismorgan.info/crev-proofs"
+package:
+  source: "https://crates.io"
+  name: sanitize-filename-reader-friendly
+  version: 2.2.1
+  revision: 245eca36c588cb70f907e2167ecee93b0fd54d33
+  digest: f6Pot3fE2Kp4y5hrg6gx34D4YFt6oJGexGqMNwT2yko
+review:
+  thoroughness: high
+  understanding: high
+  rating: negative
+alternatives:
+  - source: "https://crates.io"
+    name: sanitise-file-name
+comment: |-
+  This crate does approximately what itâs supposed to, but itâs overengineered
+  in a bad way (complicated for no good reason), confusingly and insufficiently
+  documented, contains some perplexing functionality that I canât think of any
+  good reason for the existence of, is not as efficient as it could be, and
+  misses a couple of fairly significant things in a file name sanitiser that
+  mean that what it returns might still not work.
+  
+  I recommend my own crate sanitise-file-name instead, which is unfortunately
+  at least as complex, but for the good reasons of runtime efficiency and it
+  actually caring about extensions and correctness, and far better documented
+  and explained.
+  
+  The good:
+  
+  â¢ It does what it says on the tin. The sanitisation it performs *is* fairly
+    reader-friendly (though its rules are more convoluted than I think is
+    warranted).
+  
+  The not-so-good:
+  
+  â¢ Thereâs no name length limitation: this is actually a little subjective,
+    because implemented and used carelessly it leads to file extension change
+    attacks (which in the right environment could allow remote code execution),
+    but itâs generally expected.
+  
+  â¢ It strips or replaces some characters it probably shouldnât:
+  
+    â¢ For example, U+200B is branded a non-printing character and removed. For
+      starters, thereâs no such concept in Unicode; the closest is control
+      characters, and U+200B ZERO-WIDTH SPACE isnât one of those. Removing it
+      serves no security purpose (unlike most of the Bidi_Control characters),
+      and will mangle some Unicode text where ZWSP is significant. You could
+      argue over it being a fingerprinting vector, but in that case why arenât
+      ZWNJ and ZWJ filtered out, and after that how about confusables, &c.
+      So I say thereâs just no good reason to remove U+200B.
+  
+    â¢ It replaces various characters that no file system objects to, such as ~
+      (replaced with _) and #, %, {, }, ^ and ` (replaced with space).
+      Most of those can be genuinely useful characters in file names.
+  
+  â¢ Some of the rules are fairly arbitrary, and some excessive: the decisions
+    about what gets turned into a space, what gets turned into an underscore,
+    and what gets removed are idiosyncratic with no explanation, and a couple
+    of them seem a little odd. Not harmful in any way, justâ¦ why those
+    particular rules? FILTER_ORIG_AFTER_LAST_PROCESSED_WAS_WHITESPACE is
+    perhaps the most bizarre and troubling: drop a few characters, including
+    `.`, the extension separator, if they come after a space. Remember my
+    earlier point about length limitation potentially opening a security
+    vulnerability? Well, this is a similar sort of issue, though unlikely to be
+    exploitable, that you can change the extension significantly:
+    `foo.exe .jpg` becomes `foo.exe jpg`.
+  
+    I think itâs telling, philosophically, that the library completely ignores
+    extensions. Itâs not worrying about security. (In the tests, only two or
+    three test cases have extensions, and those are ones copied from elsewhere.)
+  
+  The bad:
+  
+  â¢ Itâs inefficient: an absolute minimum of five allocations, and in practice
+    itâll tend to be a few more. (It wouldnât be *terribly* difficult to make
+    it perform exactly one allocation in all cases, like I did. A couple of the
+    allocations are almost trivially removable.) Also it uses str.find(char) to
+    check if characters can be removed, on strings as long as 10 chars/30 code
+    units, which is much less efficient than a match block and probably less
+    efficient even than using [char], while also writing boolean conditions in
+    a way that doesnât take advantage of short-circuiting
+    (`STRING.find(c_orig).is_some() && last_processed_chr == '&'`, which would
+    be better flipped to do the cheap char comparison first).
+  
+  â¢ Itâs confusing with some surprising scope: itâs not sanitising simple
+    names, but rather some kind of thing where you can put parts on multiple
+    lines and itâll sanitise them individually and join them up and I canât
+    think of any serious reason why youâd actually want thisâthe potential
+    benefits are negligible, and it makes it much more confusing.
+  
+  â¢ Its code style is inconsistent: useful comments occasionally, superfluous
+    comments sometimes but not other times, erroneous comments sometimes, a doc
+    comment with a spurious asterisk in the middle.
+  
+  â¢ It overuses constants and its documentation is poorly prepared: itâs
+    written in an idiosyncratic private const-heavy style that makes reading
+    the code hard and doesnât translate to rustdoc at all, so you have to read
+    the code to understand whatâs actually going on, and the codeâs overly
+    complicated, so it takes quite some time to figure out whatâs going on,
+    constantly having to jump between the code and the consts as well.
+    Actually, it looks like the author *tried* to make it rustdoc-friendly by
+    pulling in const_format::concatcp! (clever compile-time magic) and exposing
+    a couple of public consts, but quite apart from them being incomplete (as
+    documented), it doesnât even work because rustdoc doesnât include the
+    values of the consts in its output. (In theory perhaps the constants could
+    be used by dependers as well, but I canât imagine any genuine use case.)
+  
+  â¢ It doesnât do all it needs to: Windows reserved names are a TODO in the
+    code. I contributed the essence of a patch, but no action has been taken
+    after some weeks. (Actually, I started patching it further to make it more
+    efficient, then decided I just wanted to start from scratch, and so I made
+    sanitise-file-name.)
+  
+  â¢ I have a problem with its entire philosphical approach, given the domain:
+    as noted in the previous section, length limitations, not even
+    acknowledging extensions are a thing and doing dangerous things in
+    consequence, itâs just not the right approach for sanitising file names, an
+    area where you should be caring about security. Ultimately, this is why Iâm
+    giving a rating of negative rather than neutral as I originally intended.
+  
+  My recommendation: avoid this crate. Itâs not awful by any means, but itâs
+  not well-written either, and my crate sanitise-file-name provides much the
+  same functionality, but better. Its default options are quite similar to this
+  crateâs, but more thorough. It doesnât offer the strangne multi-line thing
+  this crate does.
+----- SIGN CREV PROOF -----
+C30SLGF6ibx4QaLXo6uazHuIio1iP_rSmab7Vo6nLEeFZfuoaFTqiaMhPYflhvcGHEu6jImRBBIvmworq13bBg
+----- END CREV PROOF -----
+
+----- BEGIN CREV PROOF -----
+kind: package review
+version: -1
+date: "2022-01-13T00:42:12.174512813+11:00"
+from:
+  id-type: crev
+  id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU
+  url: "https://git.chrismorgan.info/crev-proofs"
+package:
+  source: "https://crates.io"
+  name: cfg-if
+  version: 1.0.0
+  revision: e60fa1efeab0ec6e90c50d93ec526e1410459c23
+  digest: YmRVto2mw7U7GJhrOe13k_5Sgg9TcQWK1NuLN2Qiz5E
+review:
+  thoroughness: high
+  understanding: high
+  rating: positive
+comment: |-
+  This is a robust crate, clear, simple and well-documented within and without,
+  powered by a well-designed tt gobbler, and written by Alex Crichton of
+  exemplar reputation.
+  
+  Iâm quite fond of the ``$(if â¦ { â¦ })else* else { â¦ }`` trick, using the
+  keyword ``else`` as the repetition delimiter. I knew you could put a variety
+  of things there, not just characters like comma or semicolon, but I hadnât
+  connected the dots that that would extended to keywords. Hey, you can even
+  use *string literals*! Crazy stuff. Anyway, thatâs a neat way of keeping the
+  public API to two rules rather than three as I would probably have done.
+  
+  I found one surprise in this: ``cfg_if! { else { true } }`` works, yielding
+  true, because that repetition I just showed used ``*`` instead of ``+``.
+  Itâs not covered by the test suite, but itâs harmless.
+  <https://github.com/alexcrichton/cfg-if/issues/23> confirms itâs unintended.
+  
+  Since the 1.0.0 release, there have been [a few commits
+  <https://github.com/alexcrichton/cfg-if/commits/main>]. Most are minor
+  trivialities (yeah, Iâll even call Dependabot that, though I hate it and itâs
+  not even useful here because this crate has no dependencies), but
+  <https://github.com/alexcrichton/cfg-if/commit/349913689695130c2c63d1075e57b0917c98ee6e>
+  is significant and very mildly concerning: as well as twiddling some
+  formatting (for the good), it changes that ``*`` to ``+`` to deny bare else,
+  which is a breaking change, for unlikely but reasonable and plausible code.
+  Iâve commented on #23 and #39 to this effect; weâll see what comes of it.
+  (It could be reverted, it could be ignored, it could be released as 2.0.0.)
+  
+  There are a couple of extraneous files in the package (.gitignore,
+  .github/workflows/main.yml), but theyâre small enough that I wonât complain.
+  
+  I think this crate is overused and most places itâs used would be better
+  served by writing the negations themselves, which are often no longer and
+  often clearer.
+----- SIGN CREV PROOF -----
+ex_TABmVr_CTEY6B5C-NDKk7ciKdCK1bLf6ltjpXyoVknvyAC2AnRjA5rlPX_5ORQMeHmzJ2UxWUR0RdgBjYAQ
+----- END CREV PROOF -----
+
+----- BEGIN CREV PROOF -----
+kind: package review
+version: -1
+date: "2022-01-14T02:22:52.039384447+11:00"
+from:
+  id-type: crev
+  id: QE5OVlHZ4QyOcMqdjXhS1MgsoZHvUqxOHNZwyfpsDIU
+  url: "https://git.chrismorgan.info/crev-proofs"
+package:
+  source: "https://crates.io"
+  name: same-file
+  version: 1.0.6
+  revision: 5799cd323b8eefd17a089c950dac113f66c89c9e
+  digest: Yf7qyYw3W3WoHK4t74epRoJkZehfhWh-Hh-R0LfB8UQ
+review:
+  thoroughness: high
+  understanding: high
+  rating: neutral
+comment: |-
+  The good:
+  
+  â Decently tested, including CI coverage of the declared MSRV (1.34.0). (As
+    noted in src/lib.rs lines 485â491, edge case tests are difficult to
+    synthesise; Andrew doesnât know how to create those environmental
+    conditions and I donât really either. I suppose itâd entail reading file
+    system specs and creating suitable file systems from scratch, and mounting
+    themânot terribly difficult, but tedious work for limited gain.)
+  
+  â The crate only supports its *functionality* on cfg(any(unix, windows)), but
+    still provides a sane fallback on cfg(not(any(unix, windows))) so that all
+    methods returns an explanatory I/O error.
+  
+  â Itâs unfortunate that false positives can arise, but given that itâs
+    unavoidable, it is pleasingly well-document.
+  
+  â Cargo.toml excludes CI files from the package (keep things small!)
+  
+  The neutral:
+  
+  â¢ The I/O error returned on unknown platforms is of kind Other. I look
+    forward to an MSRV bump to 1.53.0 so that Unsupported can be used.
+  
+  â¢ The code has some cfg(target_os = "redox") that can be removed when the
+    MSRV reaches Rust 1.38.0 (thatâs when redox was moved to the unix family,
+    in <https://github.com/rust-lang/rust/pull/60547>).
+  
+  â¢ Andrew uses his preferred Unlicense/MIT combo; [I have problems with
+    Unlicense <https://chrismorgan.info/blog/unlicense/>].
+  
+  â¢ src/win.rs lines 17â32 acknowledge a straightforwardly-remediable gap that,
+    if resolved, would by the sound of it eliminate false positives on at least
+    the Windows Server + ReFS combo. This is something that someone that cared
+    could contribute fairly easily, with access to Windows Server 2012 or
+    later. (Microsoftâs documentation of FILE_ID_INFO *says* itâs Server-only,
+    but given that ReFS is supported from Windows 8.1 onwards as well, I wonder
+    if it might be *de facto* available elsewhere too.)
+  
+  â¢ Handle::{stdin, stdout, stderr} bypass std::io::{Stdin, Stdout, Stderr}
+    locking. I donât much like this, but given that avoiding it would be very
+    painful to the API, and that locking isnât a memory-safety matter, already
+    not being inviolable due to foreign code (though the std::io docs donât
+    seem to mention that), so I think this was the pragmatic choice.
+  
+  The bad:
+  
+  â The file size comparison that is documented in two places as being there on
+    Windows as a false positive avoidance heuristic is accidentally missing.
+    This was reported and acknowledged as an error on 2021-05-19 in
+    <https://github.com/BurntSushi/same-file/issues/52>, but it still hasnât
+    been reinstated or the comments removed.
+  
+  â Handle::{dev, ino} are Unix-specific methods; it would be better not to
+    expose them on the type itself, but to move them to a platform-specific
+    extension trait to make it more explicit, e.g. HandleUnixExt.
+  
+  â std::os::unix::io::IntoRawFd implementation panics on Handle::{stdin,
+    stdout, stderr}, due to careless Option handling (and the unwraps are
+    accompanied by comments assuring you that it wonât panic ð).
+    ManuallyDrop would be a better choice than Option here. Demonstration:
+    ``std::os::unix::io::IntoRawFd::into_raw_fd(same_file::Handle::stdin().unwrap())``.
+    Reported at https://github.com/BurntSushi/same-file/issues/55.
+  
+  I was going to rate it positive until I found this panic (the first two bads
+  donât warrant a downgrade). Fix just that, and Iâll update it to positive.
+----- SIGN CREV PROOF -----
+GlyWB-6rwNvY0qOppjx4AcRh1ta-btcpZ6OBaqPUg_Nr5WGCYWiNsfNWYENFHDLZpnPViTt7YaW78oPepAHhCA
+----- END CREV PROOF -----
+