
Yes! Finally! Let's treat filenames with newlines as errors! I'm so delighted with this decision.


The original request was to ban all bytes between 1 and 31.

https://www.austingroupbugs.net/view.php?id=251

At some point they decided to narrow the change to banning only the newline character.

Which I personally think is a pity. Allowing the escape character (ESC, 0x1B) in file names is a security risk because it lets you embed ECMA-48 escape sequences in file names. Secure terminal emulators shouldn’t be made vulnerable by arbitrary escape sequences, but there are “too smart for their own good” terminal emulators out there with escape sequences that let you do crazy things like run arbitrary executables.
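
To make the risk concrete, here is a minimal sketch of how a tool could neutralize such names before printing them. The helper name `safe_display` is made up for illustration, and the output is for display only (it does not escape backslashes, so it can't be reversed):

```python
def safe_display(name: str) -> str:
    """Return name with C0 controls, DEL, and C1 controls made visible."""
    out = []
    for ch in name:
        cp = ord(ch)
        # C0 controls are 0x00-0x1F, DEL is 0x7F, C1 controls are 0x80-0x9F.
        if cp < 0x20 or 0x7F <= cp <= 0x9F:
            out.append(f"\\x{cp:02x}")
        else:
            out.append(ch)
    return "".join(out)


# A name carrying an ECMA-48 colour sequence is printed as inert text.
print(safe_display("a\x1b[31mb"))
```

With this, the ESC byte reaches the terminal as the four characters `\x1b` rather than as the start of an escape sequence.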


There are many non-UTF-8/16/32 character encodings used in the wild which use these values in multi-byte characters.

I think the decision to forbid newlines in pathnames is also wrong. It may break tons of existing code.


I wish Linux/etc had a mount option and/or superblock flag called “allow only sane file names”. And if you had that set, then attempting to create a file whose name wasn’t valid UTF-8, or which contained C0 or C1 controls, would fail. The small minority of people who really need pre-Unicode encodings such as ISO 2022 could just not turn that option on. And the majority who don’t need anything like that could reap the benefits of eliminating a whole category of potential bugs and vulnerabilities.
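
A rough sketch of the check such an option would perform, written in userspace Python rather than as a kernel feature. The function name `is_sane_filename` is hypothetical, and I'm assuming "sane" means exactly what's described above (valid UTF-8, no C0 or C1 controls), so DEL (0x7F) is deliberately not rejected:

```python
def is_sane_filename(raw: bytes) -> bool:
    """True if raw is valid UTF-8 and contains no C0 or C1 control characters."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # C0 controls are U+0000-U+001F, C1 controls are U+0080-U+009F.
    return not any(ord(ch) <= 0x1F or 0x80 <= ord(ch) <= 0x9F for ch in text)


print(is_sane_filename(b"report.txt"))      # ordinary ASCII name
print(is_sane_filename(b"foo\nbar"))        # contains a newline (C0)
print(is_sane_filename(b"\xff\xfe"))        # not valid UTF-8
```

The check works on bytes because that is what the filesystem stores; decoding happens first so that C1 controls are matched as code points and not confused with UTF-8 continuation bytes.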


> There are many non-UTF-8/16/32 character encodings used in the wild which use these values in multi-byte characters.

Like what? I am genuinely curious: Shift-JIS, GB2312, Big5, and all of the EUC variants do not use bytes that correspond to C0 characters in ASCII.
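
For anyone who wants to poke at this, a small spot check using Python's bundled codecs. It only tests one sample string per call, so it's an illustration and not a proof about any encoding as a whole; the helper name `c0_bytes` is made up. I included `iso2022_jp` because the ISO 2022 family mentioned upthread switches character sets with ESC:

```python
def c0_bytes(text: str, codec: str) -> set[int]:
    """Return the set of byte values below 0x20 in text encoded with codec."""
    return {b for b in text.encode(codec) if b < 0x20}


sample = "日本語"
for codec in ("shift_jis", "euc_jp", "iso2022_jp"):
    found = sorted(hex(b) for b in c0_bytes(sample, codec))
    print(codec, found)
```

For this sample, only the ISO-2022-JP encoding contains a C0 byte, namely ESC (0x1b), which it uses to switch into and out of the JIS X 0208 character set.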


That's obviously impossible since it would break backward compatibility and the users' existing filesystems (and the Linux kernel will rightly never accept anything like that).

The only reasonable fix is to enhance bash and shell IDEs to track, for each variable, whether it could contain every character that is valid in a filename (e.g. a value that comes from read with no options can't contain \n). They should warn when a variable that can't hold every such character is used as a filename, which can be determined conservatively whenever it is passed as an argument to a process; the warning would be off by default unless stderr is a terminal. They should also warn when find is used without -print0, etc., noninteractively and perhaps interactively as well.


Why is that an issue?


Run a program to list a directory. Everything that consumes its output will assume newline delimiters. Similar assumptions are baked into a lot of software.

Enforcing that a newline isn't part of a path ensures the security of the commonly relied-on systems that make that assumption.


Except no one's enforcing anything yet. Earlier versions of POSIX allowed rejecting filenames containing newlines; the newest version encourages it while mandating the features required to handle such filenames safely (find -print0, xargs -0, read -d ''). So nothing's set in stone yet.
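
A minimal sketch of why the NUL-delimited variants matter, using in-memory byte strings that stand in for the output of something like find -print0. The helper `split_nul` and the sample names are made up for illustration:

```python
def split_nul(data: bytes) -> list[bytes]:
    """Split a NUL-terminated record stream into filenames."""
    # NUL can never appear inside a filename, so this split is unambiguous.
    # Empty records (e.g. after the trailing NUL) are dropped.
    return [record for record in data.split(b"\0") if record]


names = [b"plain.txt", b"foo\nbar"]
nul_stream = b"".join(name + b"\0" for name in names)
newline_stream = b"".join(name + b"\n" for name in names)

print(split_nul(nul_stream))        # two records, matching the two names
print(newline_stream.splitlines())  # three records: "foo\nbar" is torn in half
```

The newline-delimited stream can't be parsed back into the original two names, while the NUL-delimited one round-trips exactly.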


> Everything that consumes its output will assume newline delimiters.

Well, only badly written programs. nushell handles this fine, as will any program that doesn't try to do everything as plain strings:

  ~> touch "foo\nbar"
  ~> ls foo* | print
  ╭───┬──────┬──────┬──────┬──────────╮
  │ # │ name │ type │ size │ modified │
  ├───┼──────┼──────┼──────┼──────────┤
  │ 0 │ foo  │ file │  0 B │ now      │
  │   │ bar  │      │      │          │
  ╰───┴──────┴──────┴──────┴──────────╯

However, after reading it, they're only making them illegal for the POSIX utilities from the 70s that aren't written properly, so I think that makes sense.


Next: spaces


Still much better than mojibaked names.


What do you mean?


What is the encoding of the filenames?


I am personally not aware of any MBCS that could have a 0x20 or 0x0D as a valid trailing byte. Are you?


I think my comment correctly contrasted mojibake with newlines or spaces for that reason.



