Bug in sort 1.1


smallberg -> Bug in sort 1.1 (Aug. 27, '06, 2:00:11 PM)

Running this command
   echo 'b 140\na 141\nc 149' | sort -k2nr

produces this output
   b 140
   c 149
   a 141

instead of the expected
   c 149
   a 141
   b 140

Notice, however, that
   echo 'b 40\na 41\nc 49' | sort -k2nr

produces the correct result
   c 49
   a 41
   b 40

GNU sort 6.1 does not have this bug.
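
Not every shell's echo expands \n inside single quotes, so the same report can be restated with printf. This is only a minimal sketch: the helper name reproduce is made up for illustration, and LC_ALL=C is an assumption added to keep the locale out of the comparison.

```shell
#!/bin/sh
# reproduce: hypothetical helper that feeds the three-line sample to sort -k2nr.
# printf is used instead of echo because not every echo expands \n.
# LC_ALL=C is an assumption made here so the locale cannot affect the result.
reproduce() {
    printf 'b %s\na %s\nc %s\n' "$1" "$2" "$3" | LC_ALL=C sort -k2nr
}

reproduce 140 141 149   # 3-digit keys: the case reported as broken
reproduce 40 41 49      # 2-digit keys: the case reported as working
```

On a sort without the bug, both calls should list the c line first and the b line last.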




Rodney -> RE: Bug in sort 1.1 (Aug. 27, '06, 3:37:52 PM)

Thanks for the report. I'm looking into it.

A workaround for most cases (not all) is to give 'n' as a separate option instead of as a modifier on the key.

The bug goes back to the 4.4BSD release. I'm surprised no one has noticed it before.
Once I have a fix I'll pipe it back to the *BSDs.




eperea -> RE: Bug in sort 1.1 (Aug. 27, '06, 3:45:57 PM)

I wouldn't have considered this a bug since it is exactly how the sort in OpenBSD works. The -nrk2 switch does what you want.




Rodney -> RE: Bug in sort 1.1 (Aug. 27, '06, 9:31:49 PM)

There's a difference in the parsing.
To be fully correct, the command line for the example should be "-k 2nr".
Otherwise it's unclear whether "-nrk2" is meant to be equivalent. I'm treating
the report as "-k 2nr".

When "nr" follows the "k" it works as a modifier on just the
specified column ("2" in the above example). When "nr" comes ahead of the "k" it is
treated as an option, and options apply globally (to all columns).

With the above example it's difficult to pull apart the subtle difference between
the column modifier and option results. A more complex example is needed. But key
is that the "nr" is to apply to just column 2. The reverse sorting is not to apply
to column 1.

My first thoughts were "the options were just specified silly" too. However,
a full reading of the man page teases it out. The key part, that's easy to miss:
quote:

The arguments field1 and field2 have the form m.n (m,n > 0) and can be
followed by one or more of the letters b, d, f, i, n, and r, which
correspond to the options discussed above.
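
The modifier/option distinction can be seen with r alone on a small two-key input. This is a sketch under assumptions: the sample data and the function names are made up, LC_ALL=C is added to pin the locale, and the behavior described is that of a sort that follows the man page text quoted above (GNU sort, for instance).

```shell
#!/bin/sh
# Made-up sample data: two fields, a letter and a digit.
sample() {
    printf 'a 1\nb 2\na 2\nb 1\n'
}

# r as a modifier: only field 2 is reversed, field 1 stays ascending.
modifier_form() {
    sample | LC_ALL=C sort -k1,1 -k2,2r
}

# r as an option: it applies to every key that carries no modifiers of its own.
option_form() {
    sample | LC_ALL=C sort -r -k1,1 -k2,2
}

echo "modifier form:"; modifier_form
echo "option form:";   option_form
```

With the modifier form the a lines should still come before the b lines, with only the second column descending inside each group. With the option form both columns should come out descending.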




smallberg -> RE: Bug in sort 1.1 (Aug. 27, '06, 9:43:32 PM)

quote:

I wouldn't have considered this a bug since it is exactly how the sort in OpenBSD works.

We have different ideas about what a bug is, then. Yours appears to be disagreement with an existing popular implementation (OpenBSD). Mine is disagreement with the specified behavior (the man page), enhanced by the fact that it's an inconsistent disagreement (140, 149, 141 vs. 49, 41, 40). Besides, I've been using UNIX sort for almost 30 years, so I know a bug when I see it :-)
quote:

The -nrk2 switch does what you want.

As Rodney noted, it doesn't do what I want in all cases. For example, in
   echo '20\n140 11\n149\n140 2\n140 33' | sort -k1nr,1 -k2,2
(note that the secondary key should not be sorted numerically), the documented behavior (which Solaris sort and GNU sort 6.1 exhibit) yields
   149
   140 11
   140 2
   140 33
   20
I see no way to use the workaround of using an option instead of a modifier letter. For example, this attempt
   echo '20\n140 11\n149\n140 2\n140 33' | sort -nrk1,1 -k2,2
produces
   149
   140 33
   140 11
   140 2
   20
which is not what I want. There is no "nonnumeric" or "nonreverse" option to try.




smallberg -> RE: Bug in sort 1.1 (Aug. 27, '06, 10:12:28 PM)

Rodney,

A quick look at the function named number in the file fields.c in the OpenBSD source for sort reveals a test "if (parity && lastvalue != '0')", where "parity" is the parity of the number of digits in the field. The bug involves that test: after seeing that code I verified that my original "sort -k2nr" example works correctly for 2-, 4-, and 6-digit numbers, and incorrectly for 3-, 5-, and 7-digit numbers.

I'm afraid I don't have time to determine a fix...
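
The odd/even check described above could be scripted roughly as follows. This is a sketch, not the test actually run: the helper name check_widths is hypothetical, the wider numbers are made by prepending 1s to 40, 41, and 49 (so the 3-digit case matches the original report), and LC_ALL=C is an added assumption.

```shell
#!/bin/sh
# check_widths: hypothetical helper that runs the original example with
# second-field widths of 2 to 7 digits and reports which widths sort correctly.
check_widths() {
    prefix=""
    width=2
    status=0
    while [ "$width" -le 7 ]; do
        input=$(printf 'b %s40\na %s41\nc %s49\n' "$prefix" "$prefix" "$prefix")
        expected=$(printf 'c %s49\na %s41\nb %s40\n' "$prefix" "$prefix" "$prefix")
        actual=$(printf '%s\n' "$input" | LC_ALL=C sort -k2nr)
        if [ "$actual" = "$expected" ]; then
            echo "$width digits: ok"
        else
            echo "$width digits: WRONG ORDER"
            status=1
        fi
        prefix="1$prefix"          # one more leading digit for the next round
        width=$((width + 1))
    done
    return "$status"
}

check_widths
```

A sort with the parity problem would be expected to report the wrong order for the odd widths only; a fixed sort should report ok for all six.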




Rodney -> RE: Bug in sort 1.1 (Aug. 27, '06, 10:30:51 PM)

thanks -- I had noticed the pattern. And the bug does center on "0" (zero).
I'm just looking at that code now as you write.
Have to be careful not to break something else, so I'm staring at it long and hard [:)]




eperea -> RE: Bug in sort 1.1 (Aug. 27, '06, 11:50:42 PM)

Thank you both for the correction!




Rodney -> RE: Bug in sort 1.1 (Aug. 31, '06, 2:17:50 PM)

The update for sort should come tomorrow.
I'm still testing things to (try to) make sure nothing is getting broken by the fix.




Rodney -> RE: Bug in sort 1.1 (Sep. 3, '06, 7:01:54 AM)

I've made the adjustments to sort.
I've placed version 1.2 in the beta directory for right now.
I keep coming up with "one more test case" to check things are okay
(while doing those other things). So it's in beta for a little while.
The fix was mostly removing several lines of code for which I can't
see any good reason that those lines were there -- and they've been
there since 4.4BSD. So I'm a tad edgy about it, hence beta for now.

You can add it from beta using pkg_add:
pkg_add ftp://ftp.interopsystems.com/pkgs/beta/sort-1.2-bin.tgz
Any feedback is welcomed.




eperea -> RE: Bug in sort 1.1 (Sep. 3, '06, 11:29:14 AM)

I get the same results as with the previous version:
Welcome to the Interix UNIX utilities.

DISPLAY=localhost:0.0
$ echo 'b 140\na 141\nc 149' | sort -k2nr
b 140
c 149
a 141
$ pkg_info | grep sort
sort-1.2-bin       Version 1.2 of sort for Interix.
$

Behavior continues to depend on whether the second field has an odd or even number of digits.




Rodney -> RE: Bug in sort 1.1 (Sep. 3, '06, 2:45:46 PM)

That's not the result I get here.
At first I thought I'd bundled the wrong binary, but extracting the package
here I get the correct result.

What do you get for "whence sort"?
What do you get for "cksum `whence sort`"?




eperea -> RE: Bug in sort 1.1 (Sep. 3, '06, 6:19:18 PM)

Could this be due to last night's pkg patch?
artemis:~/JPSoft
{72} % whence sort
whence: Command not found.
artemis:~/JPSoft
{73} % ksh
$ whence sort
/usr/local/bin/sort
$ cksum `whence sort`
3814376779 103936 /usr/local/bin/sort
$




Rodney -> RE: Bug in sort 1.1 (Sep. 3, '06, 8:05:37 PM)

> whence: Command not found.

sorry, for some reason I'd assumed ksh.

> Could this be due to last night's pkg patch?

No.

> 3814376779 103936 /usr/local/bin/sort

I'd moved sort up to /bin with version 1.2 (since it does everything
the "old" ones do and more). But it looks like a previous /Tools version
is still on your system and /usr/local/bin is earlier in PATH.

The solution is to do "rm /usr/local/bin/sort".
Then /bin/sort (1.2) will be found.
When 1.1 was removed, /usr/local/bin/sort would have been removed with it. If
another copy had been saved (aka sort-0), that may be
where this one comes from. But it's certainly not needed now.
Though where the sort-0 came from I dunno.
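
A quick way to spot this kind of shadowing in any POSIX shell (not just ksh) is to list every sort on PATH in search order. This is a minimal sketch; the function name list_on_path is made up for illustration.

```shell
#!/bin/sh
# list_on_path: hypothetical helper that prints every executable file named $1
# found on PATH, in search order. The first line printed is the one the shell runs.
# The body runs in a subshell, so the IFS and globbing changes stay local.
list_on_path() (
    name=$1
    IFS=:
    set -f                           # no globbing while PATH is being split
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.       # an empty PATH entry means the current directory
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)

list_on_path sort
```

If more than one path is printed, running cksum on each shows whether the copies differ.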




eperea -> RE: Bug in sort 1.1 (Sep. 4, '06, 1:14:43 AM)

Thanks, I should have thought of that! Naturally, it works.

It was logical to assume ksh, since smallberg's test syntax requires it and therefore I had used it in my last post. I should have added a smiley after the "Command not found" message, since the joke was on me once again. I prefer tcsh myself, and forgot that for the purposes of this thread, ksh is standard. [:)]




Rodney -> RE: Bug in sort 1.1 (Sep. 4, '06, 2:32:31 AM)

> it works.

great!

> I prefer tcsh myself, and forgot that for the purposes of this thread, ksh is standard

I live in tcsh [:)]. The ksh I can function with, but I'm not keen on it (I don't
mean this particular one, I mean all of them). When I got tcsh into Interix (OpenNT at the
time) it was amazing how many users were really, really happy about this; it surprised management
at the time (who were all ksh users). It was similar with bash and zsh once available.

Anyway, my point is when there's a problem it's usually best to work with whatever shell you
(or anyone else) are using because there just might be some nuance of that shell that's involved.
Builtins are a prime example of this. Thus, while ksh is the "standard shell" we're not holding
users to this when doing posts or whatever -- just do what you do [:)]



