
[#88] Support for UTF8 path names

Date: 2012-02-02 12:15
Priority: 3
State: Closed
Submitted by: Bug Submitter (webuser)
Assigned to: John Reppy (jhr)
Machine Architecture: None
Operating System: Other
Component: Basis Library
Resolution: Fixed
Severity: Major
OS Version: Tested on OS/X Lion and Windows XP (using the Cygwin-compiled compiler)
SML/NJ Version: 110.74
Keywords: UTF8, OS.Path
URL:
Transcript (of reproduction):
- OS.FileSys.openDir "/Users/michael";
val it = DS {dirStrm=-,isOpen=ref true} : ?.OS_FileSys.dirstream
- val d = it;
val d = DS {dirStrm=-,isOpen=ref true} : ?.OS_FileSys.dirstream
- OS.FileSys.readDir d;
val it = SOME ".abbot.properties" : string option

and so on, until

- OS.FileSys.readDir d;
val it = SOME "\95\134\195\152A\204\138" : string option
- val s = Option.valOf it;
val s = "\95\134\195\152A\204\138" : string
- val m = String.concat ["/Users/michael/", s];
- OS.FileSys.openDir m;
val it = DS {dirStrm=-,isOpen=ref true} : ?.OS_FileSys.dirstream
- OS.Path.concat (m, "abc");
uncaught exception InvalidArc
  raised at: Basis/Implementation/OS/os-path-fn.sml:47.62-47.72
- OS.Path.concat ("���", "abc");
uncaught exception InvalidArc
  raised at: Basis/Implementation/OS/os-path-fn.sml:47.62-47.72
Source (for reproduction):
For example, if I have a directory "���" and I call OS.FileSys.openDir "���", I get no problems, whereas OS.Path.concat ("���", "abc") raises InvalidArc.
Summary:
Support for UTF8 path names

Detailed description
Some of the OS.Path functions use the checkArc function to check the validity of arcs of paths. This does not play well with UTF8-encoded paths (which otherwise more or less work).

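For example, if I have a directory whose name contains UTF8-encoded non-ASCII characters and I pass that name to OS.FileSys.openDir, I get no problems, whereas passing the same name as the first argument of OS.Path.concat (with "abc" as the second) raises InvalidArc.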
Additional comments:
I see that validArc in Basis/Implementation/Unix/os-path.sml uses Char.isPrint, which is strictly speaking not correct for Unix paths (which allow anything but / and \0), and also the root of the problem. The Windows implementation in Basis/Implementation/Win32/os-path.sml seems more sane.
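
To illustrate why a Char.isPrint-based check rejects UTF8 names, here is a minimal sketch. It is not the code from os-path.sml: strictArc is a hypothetical name, and the check is assumed to be "every character of the arc is printable". Char.isPrint in the Basis Library only accepts the ASCII range from space to tilde, so any byte of a multi-byte UTF8 sequence fails it.

```sml
(* Hypothetical stand-in for a printable-only arc check;
   NOT the actual SML/NJ source. *)
fun strictArc (s : string) : bool = CharVector.all Char.isPrint s

(* "\195\152" is the two-byte UTF-8 encoding of U+00D8. *)
val asciiOk = strictArc "abc"        (* every byte is printable ASCII *)
val utf8Ok  = strictArc "\195\152"   (* both bytes are >= 128 *)
```

Under these assumptions asciiOk is true and utf8Ok is false, which matches the InvalidArc behaviour in the transcript.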

Submitted via web form by Michael Westergaard m.westergaard@tue.nl

Comments:

Date: 2012-02-05 12:58
Sender: John Reppy

Fixed for 110.75. We now accept any character except slash or null.
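
The rule described in this comment could be sketched as follows. This is only an illustration of "any character except slash or null": relaxedArc is a hypothetical name, and the actual change made for 110.75 may be written differently.

```sml
(* Hypothetical sketch of the relaxed rule; the real fix may differ. *)
fun relaxedArc (s : string) : bool =
  CharVector.all (fn c => c <> #"/" andalso c <> #"\000") s

val utf8Accepted  = relaxedArc "\195\152"  (* bytes >= 128 are allowed *)
val slashRejected = relaxedArc "a/b"       (* contains a slash *)
val nullRejected  = relaxedArc "a\000b"    (* contains a null byte *)
```

With this predicate, arcs containing UTF8 bytes pass, while arcs containing a slash or a null byte are still rejected.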

Attached Files:

Changes

Field        Old Value         Date              By
status_id    Open              2012-02-05 12:58  jhr
assigned_to  none              2012-02-05 12:58  jhr
close_date   2012-02-05 12:58  2012-02-05 12:58  jhr
Resolution   None              2012-02-05 12:58  jhr