Re: The size of pipes (Was: sort by multiple columns)

<u23ito$2osbe$1@news.xmission.com>

https://news.novabbs.org/devel/article-flat.php?id=6962&group=comp.unix.shell#6962
Path: rocksolid2!i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.imp.ch!usenet.csail.mit.edu!xmission!nnrp.xmission!.POSTED.shell.xmission.com!not-for-mail
From: gazelle@shell.xmission.com (Kenny McCormack)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Sun, 23 Apr 2023 15:30:00 -0000 (UTC)
Organization: The official candy of the new Millennium
Message-ID: <u23ito$2osbe$1@news.xmission.com>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de> <op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com> <u23gql$3rkl5$1@dont-email.me>
Injection-Date: Sun, 23 Apr 2023 15:30:00 -0000 (UTC)
Injection-Info: news.xmission.com; posting-host="shell.xmission.com:166.70.8.4";
logging-data="2912622"; mail-complaints-to="abuse@xmission.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: gazelle@shell.xmission.com (Kenny McCormack)
 by: Kenny McCormack - Sun, 23 Apr 2023 15:30 UTC

In article <u23gql$3rkl5$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>On 23.04.2023 16:36, Kenny McCormack wrote:
>> [...]
>>
>> For most programs, this is rarely a concern, since most pipelines write and
>> read more or less simultaneously in real time, but sort is an edge case for
>> the reason you explain above.
>
>Note also that quite a few sorting operations are performed implicitly
>(e.g. in 'ls', in the shell's '*' glob/pattern expansion, etc.).
>For example, don't expect find | xargs ls to provide a sorted
>output.
>
>>
>> Something to keep in mind if you ever decide to sort very large files in a
>> pipeline. [...]
>
>In whatever way some instance of sort is implemented (memory, or
>temporary files, or whatever), my expectation is that
> whatever | sort
>will have to produce sorted output. Isn't that guaranteed?

Actually, I may be wrong about this. May have posted too quickly.

The bad case would be if a program produced a ton of output, but the reader
didn't read any of it. I'll have to think some more as to whether or not
that applies here.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/GodDelusion

Re: The size of pipes (Was: sort by multiple columns)

<g1yyYGUXy9WtsF5ey@bongo-ra.co>

https://news.novabbs.org/devel/article-flat.php?id=6963&group=comp.unix.shell#6963
Path: rocksolid2!i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: spibou@gmail.com (Spiros Bousbouras)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Sun, 23 Apr 2023 16:05:37 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 13
Message-ID: <g1yyYGUXy9WtsF5ey@bongo-ra.co>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de> <834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de> <slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com> <0fw8oLb25z6qFt02a@bongo-ra.co>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 23 Apr 2023 16:05:37 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="2494b4dfcde8cbd5f394c874115563db";
logging-data="4074916"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX180YQ+0M1TCiqLWNBQa17d/"
Cancel-Lock: sha1:ZR9P2SkX2+U6Frzhy0v3liSRMaI=
X-Server-Commands: nowebcancel
In-Reply-To: <0fw8oLb25z6qFt02a@bongo-ra.co>
X-Organisation: Weyland-Yutani
 by: Spiros Bousbouras - Sun, 23 Apr 2023 16:05 UTC

On Sun, 23 Apr 2023 15:51:42 -0000 (UTC)
Spiros Bousbouras <spibou@gmail.com> wrote:
> I don't see the problem. If sort is on the left of a pipe then it will
> sort its whole input and then all it will do is write to the pipe. If sort
> is on the right of a pipe then in the beginning it will only do reading
> until it has read everything and then do the sorting. Obviously if you
> have process1 | process2 and one side does reading or writing (whatever
> applies) much slower than the other side then the fast side will block

To be precise, *may* block if the amount of data going through the pipe is
large enough.

> but there's nothing special with sort about that.
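
The "may block" behaviour can be observed without any sort involved. Below is a minimal Python sketch (not from the thread; the name `measure_pipe_capacity` and the 512-byte chunk size are illustrative choices). It fills a pipe that nobody reads from, with the write end set to non-blocking, so the point where an ordinary writer would stall shows up as an exception instead:

```python
import os


def measure_pipe_capacity(chunk_size=512):
    """Return how many bytes fit into a fresh pipe before a writer would block."""
    read_fd, write_fd = os.pipe()
    # Non-blocking mode turns "the writer would block" into BlockingIOError.
    os.set_blocking(write_fd, False)
    payload = b"x" * chunk_size
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, payload)
            except BlockingIOError:
                # The pipe is full; a blocking writer would stall here
                # until the reader drains some data.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("bytes written before the pipe filled up:", measure_pipe_capacity())
```

The number printed is system-dependent. With a blocking write end, the same loop would simply hang at that point until something reads from the other end, which is all that happens to the fast side of a pipeline.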

Re: The size of pipes (Was: sort by multiple columns)

<u23lqe$3sg1f$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=6964&group=comp.unix.shell#6964
Path: rocksolid2!i2pn2.org!i2pn.org!news.niel.me!news.gegeweb.eu!gegeweb.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Sun, 23 Apr 2023 18:19:25 +0200
Organization: A noiseless patient Spider
Lines: 23
Message-ID: <u23lqe$3sg1f$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com>
<0fw8oLb25z6qFt02a@bongo-ra.co>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 23 Apr 2023 16:19:26 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="179f52437eefbb5747dfb2f8e88f4c7c";
logging-data="4079663"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/a7Z3iMNHYfb6KouW0g5wP"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:T8vgLvXZlZDYn0Vj7KPvQRuclro=
In-Reply-To: <0fw8oLb25z6qFt02a@bongo-ra.co>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Sun, 23 Apr 2023 16:19 UTC

On 23.04.2023 17:51, Spiros Bousbouras wrote:
>
> [...] If sort
> is on the right of a pipe then in the beginning it will only do reading
> until it has read everything and then do the sorting. [...]

This is [in principle] not necessarily the case. The sort algorithm
can start sorting subsets of the stream to create runs of already
sorted sequences. Mergesort, for example, is a good candidate for
such a process; it can use (e.g.) Heapsort to create larger runs in
memory and then needs fewer merge passes (which are typically costly
if done over files). How much data the Heapsort processes may vary,
but a size on the order of the pipe buffer is reasonable.

Disclaimer: I don't know how Unix's 'sort' is typically implemented,
but I expect some sophisticated implementation, since what I wrote
above is decades-old knowledge (at least since the 1980s, when I
implemented some hybrid sorting algorithms, or maybe even going back
to Donald Knuth's work; but I don't recall whether it's covered in
his "Sorting and Searching" volume).

Janis
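
The run-creation idea can be sketched with replacement selection, a classic heap-based technique. This is an illustration only, not a description of how any particular sort(1) is implemented. The names `replacement_selection_runs`, `merge_runs` and `memory_size` are hypothetical, and runs are kept in lists rather than files to keep the sketch short:

```python
import heapq
from itertools import islice


def replacement_selection_runs(items, memory_size):
    """Split an input stream into sorted runs.

    At most memory_size not-yet-output items are held at any time
    (the heap plus the items held back for the next run).
    """
    if memory_size < 1:
        raise ValueError("memory_size must be at least 1")
    stream = iter(items)
    heap = list(islice(stream, memory_size))
    heapq.heapify(heap)
    runs, current_run, next_run = [], [], []
    exhausted = object()  # sentinel meaning "no more input"
    while heap:
        smallest = heapq.heappop(heap)
        current_run.append(smallest)  # a real sort would write this out
        item = next(stream, exhausted)
        if item is not exhausted:
            if item >= smallest:
                heapq.heappush(heap, item)  # still fits into the current run
            else:
                next_run.append(item)  # too small, must wait for the next run
        if not heap:
            # Current run is complete; start the next one from held-back items.
            runs.append(current_run)
            current_run = []
            heap, next_run = next_run, []
            heapq.heapify(heap)
    return runs


def merge_runs(runs):
    """Merge already-sorted runs into one sorted list."""
    return list(heapq.merge(*runs))
```

Input that is already sorted comes out as a single run, however small `memory_size` is. For randomly ordered input, replacement selection is known to produce runs averaging about twice the memory size, which is what reduces the number of merge passes.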

Re: The size of pipes

<u23mmi$3slm9$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=6965&group=comp.unix.shell#6965
Path: rocksolid2!i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes
Date: Sun, 23 Apr 2023 18:34:25 +0200
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <u23mmi$3slm9$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com>
<oh2ghj-veh.ln1@mail.home.palmen-it.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 23 Apr 2023 16:34:26 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="179f52437eefbb5747dfb2f8e88f4c7c";
logging-data="4085449"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+VQxqgMnxGS/i8MWgDXf6x"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:xpfo8gv6pCpI83ouUZPuUjHBkC8=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <oh2ghj-veh.ln1@mail.home.palmen-it.de>
 by: Janis Papanagnou - Sun, 23 Apr 2023 16:34 UTC

On 23.04.2023 18:21, Felix Palmen wrote:
>
> This won't be a concern here. You need the whole data to sort something,
> so the sort utility must read until EOF anyways before doing its work.

See my recent reply for a different view.

> So, the real concern is whether you'll have enough RAM.

Not if sorting is (alternatively or also) done over files.

Even in the days when 640k was considered an immense amount of memory
by some, much larger data sets were being sorted. (Speaking about
real OS computers, not about toys.) Earth-bound computers usually had
much more disk/drum/tape storage than core memory available. Even if
that's "legacy", the principles are still the same. Unless
responsible folks start putting everything into a global memory
cloud. :-/

Janis

Re: The size of pipes

<op.13u4c02ea3w0dxdave@hodgins.homeip.net>

https://news.novabbs.org/devel/article-flat.php?id=6966&group=comp.unix.shell#6966
Path: rocksolid2!i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: dwhodgins@nomail.afraid.org (David W. Hodgins)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes
Date: Sun, 23 Apr 2023 12:35:14 -0400
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <op.13u4c02ea3w0dxdave@hodgins.homeip.net>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com>
<oh2ghj-veh.ln1@mail.home.palmen-it.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes
Content-Transfer-Encoding: 8bit
Injection-Info: dont-email.me; posting-host="b43ae20efee7944fb0f321ac6b22f8a7";
logging-data="4085765"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+2Il2SPkYMBXvfR4UvntPucDe7IaOCWAg="
User-Agent: Opera Mail/12.16 (Linux)
Cancel-Lock: sha1:xt/qp138UCXLC424DPs4ceblq3A=
 by: David W. Hodgins - Sun, 23 Apr 2023 16:35 UTC

On Sun, 23 Apr 2023 12:21:44 -0400, Felix Palmen <felix@palmen-it.de> wrote:
> This won't be a concern here. You need the whole data to sort something,
> so the sort utility must read until EOF anyways before doing its work.
> So, the real concern is whether you'll have enough RAM.

Yes, sort has to read the entire input before it can write anything
to the output, as the last record read may have to be the first one
written.

The way that's handled on low-RAM systems is to use temporary files: sort
writes sorted chunks to temporary files and then merges those files to
create the final output.

By default the temporary files are stored in /tmp (or in $TMPDIR, if set),
which on many systems is now a virtual file system kept in RAM.

Either ensure /tmp is mounted on a disk file system with enough free space
or instruct sort to use another directory.

See "man sort" for the -T (aka --temporary-directory=DIR) option.

Regards, Dave Hodgins
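
The chunk-and-merge scheme described above can be sketched in a few lines of Python. This is an illustration of the general technique, not the code of any actual sort utility. The names `external_sort`, `chunk_size` and `tmp_dir` are hypothetical, with `tmp_dir` standing in for the directory given to -T:

```python
import heapq
import os
import tempfile


def _write_sorted_run(lines, tmp_dir):
    """Sort one chunk in memory and write it to a temporary file."""
    with tempfile.NamedTemporaryFile(
        mode="w", dir=tmp_dir, suffix=".run", delete=False
    ) as run_file:
        run_file.writelines(sorted(lines))
        return run_file.name


def external_sort(lines, chunk_size, tmp_dir=None):
    """Yield newline-terminated lines in sorted order using temporary files.

    tmp_dir plays the role of sort's -T option; None means the
    default location chosen by the tempfile module.
    """
    run_paths = []
    try:
        chunk = []
        for line in lines:
            chunk.append(line)
            if len(chunk) >= chunk_size:
                run_paths.append(_write_sorted_run(chunk, tmp_dir))
                chunk = []
        if chunk:
            run_paths.append(_write_sorted_run(chunk, tmp_dir))

        run_files = [open(path) for path in run_paths]
        try:
            # heapq.merge reads the run files lazily, one line at a time.
            yield from heapq.merge(*run_files)
        finally:
            for run_file in run_files:
                run_file.close()
    finally:
        # Remove the temporary run files once merging is done.
        for path in run_paths:
            os.remove(path)
```

Only `chunk_size` lines are held in memory while the runs are created, and only one line per run file during the merge, so the limiting resource becomes free space in `tmp_dir` rather than RAM.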

Re: The size of pipes (Was: sort by multiple columns)

<u23mqv$3shd2$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=6967&group=comp.unix.shell#6967
Path: rocksolid2!i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: richard.nospam@gmail.com (Richard Harnden)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Sun, 23 Apr 2023 17:36:47 +0100
Organization: A noiseless patient Spider
Lines: 44
Message-ID: <u23mqv$3shd2$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com>
<0fw8oLb25z6qFt02a@bongo-ra.co> <u23lqe$3sg1f$1@dont-email.me>
Reply-To: nospam.harnden@gmail.com
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 23 Apr 2023 16:36:47 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="63e1e5f89f3fff23e8b4ad5b68380127";
logging-data="4081058"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18El+TADucvm0Ojjgw3N2izK4Xlfaz7Fxg="
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0)
Gecko/20100101 Thunderbird/102.10.0
Cancel-Lock: sha1:BxQXqFWzAJjgEEM83CaIdoa06XU=
In-Reply-To: <u23lqe$3sg1f$1@dont-email.me>
 by: Richard Harnden - Sun, 23 Apr 2023 16:36 UTC

On 23/04/2023 17:19, Janis Papanagnou wrote:
> On 23.04.2023 17:51, Spiros Bousbouras wrote:
>>
>> [...] If sort
>> is on the right of a pipe then in the beginning it will only do reading
>> until it has read everything and then do the sorting. [...]
>
> This is [in principle] not necessarily the case. The sort algorithm
> can start sorting subsets of the stream to create runs of already
> sorted sequences. Mergesort, for example, is a good candidate for
> such a process; it can use (e.g.) Heapsort to create larger runs in
> memory and then needs fewer merge passes (which are typically costly
> if done over files). How much data the Heapsort processes may vary,
> but a size on the order of the pipe buffer is reasonable.
>
> Disclaimer: I don't know how Unix's 'sort' is typically implemented,
> but I expect some sophisticated implementation, since what I wrote
> above is decades-old knowledge (at least since the 1980s, when I
> implemented some hybrid sorting algorithms, or maybe even going back
> to Donald Knuth's work; but I don't recall whether it's covered in
> his "Sorting and Searching" volume).

My man page says:

    --radixsort
        Try to use radix sort, if the sort specifications allow.
        The radix sort can only be used for trivial locales (C and
        POSIX), and it cannot be used for numeric or month sort.
        Radix sort is very fast and stable.

    --mergesort
        Use mergesort. This is a universal algorithm that can
        always be used, but it is not always the fastest.

    --qsort
        Try to use quick sort, if the sort specifications allow.
        This sort algorithm cannot be used with -u and -s.

    --heapsort
        Try to use heap sort, if the sort specifications allow.
        This sort algorithm cannot be used with -u and -s.

Re: The size of pipes

<1n4ghj-rti.ln1@mail.home.palmen-it.de>

https://news.novabbs.org/devel/article-flat.php?id=6968&group=comp.unix.shell#6968
Path: rocksolid2!i2pn2.org!i2pn.org!paganini.bofh.team!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: felix@palmen-it.de (Felix Palmen)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes
Date: Sun, 23 Apr 2023 18:58:41 +0200
Organization: palmen-it.de
Lines: 27
Message-ID: <1n4ghj-rti.ln1@mail.home.palmen-it.de>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de> <834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de> <slrnu4a5im.34b.t-usenet@ID-685.user.individual.de> <op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com> <oh2ghj-veh.ln1@mail.home.palmen-it.de> <u23mmi$3slm9$1@dont-email.me>
Injection-Date: Sun, 23 Apr 2023 18:58:41 +0200
Injection-Info: dont-email.me; posting-host="b356e10f704f0fe7f2bc4230b5cdf758";
logging-data="4095217"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/tFejWkKee63ja4DKc81Jv"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (FreeBSD/13.2-RELEASE (amd64)) tinews.pl/1.1.61
Cancel-Lock: sha256:+Lxnm6XE8tlrEjWeepL4dkmp/x4dPGpOr7iw9p0+TIo=
sha1:bqwrW+LclZnTSyQEbe8+/++TmB4=
X-PGP-Key: 693613D55BBF4837B2123ACC54ADE0069879F231
X-PGP-Hash: SHA256
X-PGP-Sig: GnuPG-v2 From,Newsgroups,Subject,Date,Injection-Date,Message-ID
iNUEARYIAH0WIQRpNhPVW79IN7ISOsxUreAGmHnyMQUCZEVjwV8UgAAAAAAuAChp
c3N1ZXItZnByQG5vdGF0aW9ucy5vcGVucGdwLmZpZnRoaG9yc2VtYW4ubmV0Njkz
NjEzRDU1QkJGNDgzN0IyMTIzQUNDNTRBREUwMDY5ODc5RjIzMQAKCRBUreAGmHny
MS3UAP9euSYDCcHok8c72M9N8xp1EYJcfVBlZTnWamiDdoy1wgD/cKh3guNK2VKQ
dCaiHuVuimXkLCoPPDIV1D+wI4jFngE=
=fu2I
 by: Felix Palmen - Sun, 23 Apr 2023 16:58 UTC

* Janis Papanagnou <janis_papanagnou+ng@hotmail.com>:
> On 23.04.2023 18:21, Felix Palmen wrote:
>>
>> This won't be a concern here. You need the whole data to sort something,
>> so the sort utility must read until EOF anyways before doing its work.
>
> See my recent reply for a different view.

So, even if it starts working on "chunks", this won't change anything:
the data must be read from the pipe in order to work with it, so the
size of the pipe won't be a problem here.

The concern seems to have assumed that all the data to be sorted must
fit into the pipe buffer. But this isn't the case.

>> So, the real concern is whether you'll have enough RAM.
>
> Not if sorting is (alternatively or also) done over files.

Sure this *can* be done, that's why I mentioned the possibility. I
wasn't aware sort utils these days actually do it.

--
Dipl.-Inform. Felix Palmen <felix@palmen-it.de> ,.//..........
{web} http://palmen-it.de {jabber} [see email] ,//palmen-it.de
{pgp public key} http://palmen-it.de/pub.txt // """""""""""
{pgp fingerprint} 6936 13D5 5BBF 4837 B212 3ACC 54AD E006 9879 F231

Re: The size of pipes

<u23p59$3t549$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=6969&group=comp.unix.shell#6969
Path: rocksolid2!i2pn2.org!i2pn.org!news.nntp4.net!paganini.bofh.team!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes
Date: Sun, 23 Apr 2023 19:16:25 +0200
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <u23p59$3t549$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com>
<oh2ghj-veh.ln1@mail.home.palmen-it.de> <u23mmi$3slm9$1@dont-email.me>
<1n4ghj-rti.ln1@mail.home.palmen-it.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 23 Apr 2023 17:16:25 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="179f52437eefbb5747dfb2f8e88f4c7c";
logging-data="4101257"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18NaCMJrEBt3jIvCKxvIUsw"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:eFuv3Fljk/QwfbMqKraCEjklK1w=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <1n4ghj-rti.ln1@mail.home.palmen-it.de>
 by: Janis Papanagnou - Sun, 23 Apr 2023 17:16 UTC

On 23.04.2023 18:58, Felix Palmen wrote:
> * Janis Papanagnou <janis_papanagnou+ng@hotmail.com>:
>> On 23.04.2023 18:21, Felix Palmen wrote:
>>>
>>> This won't be a concern here. You need the whole data to sort something,
>>> so the sort utility must read until EOF anyways before doing its work.

s/doing/finishing/

>> See my recent reply for a different view.
>
> So, even if it starts working on "chunks", this won't change anything:
> the data must be read from the pipe in order to work with it, so the
> size of the pipe won't be a problem here.
>
> The concern seems to have assumed that all the data to be sorted must
> fit into the pipe buffer. But this isn't the case.

It boils down to this: sort can _start_ sorting with less data
(something like a pipe-full), it can _continue_ sorting as more
parts of the data arrive, and to _finish_ sorting it naturally must
have had all the data available.

Janis

Re: The size of pipes (Was: sort by multiple columns)

<T0w1M.321485$0dpc.50826@fx33.iad>

https://news.novabbs.org/devel/article-flat.php?id=6970&group=comp.unix.shell#6970
Path: rocksolid2!i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx33.iad.POSTED!not-for-mail
From: vallor@vallor.earth (vallor)
Subject: Re: The size of pipes (Was: sort by multiple columns)
Newsgroups: comp.unix.shell
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net>
<u23fpe$2opsm$1@news.xmission.com> <kalg1gF3o61U1@mid.individual.net>
MIME-Version: 1.0
User-Agent: Pan/0.154 (Izium; dfc8674 gitlab.gnome.org/GNOME/pan.git)
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Lines: 38
Message-ID: <T0w1M.321485$0dpc.50826@fx33.iad>
X-Complaints-To: abuse@blocknews.net
NNTP-Posting-Date: Mon, 24 Apr 2023 14:05:39 UTC
Organization: blocknews - www.blocknews.net
Date: Mon, 24 Apr 2023 14:05:39 GMT
X-Received-Bytes: 2364
 by: vallor - Mon, 24 Apr 2023 14:05 UTC

On Sun, 23 Apr 2023 15:42:00 -0400, John-Paul Stewart wrote:

> On 4/23/23 10:36, Kenny McCormack wrote:
>> This actually raises an interesting point. Pipes are not infinite in
>> size, and they could, theoretically, block if enough is written on the
>> write end without anything being read from the read end. Though the
>> limits are likely very large nowadays on modern systems, I think the
>> original implementation was only 4096 bytes and the standards today
>> (POSIX) may not guarantee anything more than that (haven't checked).
>
> FWIW, the pipe(7) manpage from Debian GNU/Linux has a "Pipe capacity"
> section that says in part:
>
>     Before Linux 2.6.11, the capacity of a pipe was the same as the
>     system page size (e.g., 4096 bytes on i386). Since Linux 2.6.11,
>     the pipe capacity is 16 pages (i.e., 65,536 bytes in a system
>     with a page size of 4096 bytes). Since Linux 2.6.35, the default
>     pipe capacity is 16 pages, but the capacity can be queried and
>     set using the fcntl(2) F_GETPIPE_SZ and F_SETPIPE_SZ operations.
>     See fcntl(2) for more information.
>
> So pipes on Linux aren't very large at all. I don't know how other Unix
> systems compare.

Could the actual pipe size perhaps be queried
and set with "ulimit"?

$ ulimit -a
[...]
pipe size (512 bytes, -p) 8
[...]

With: GNU bash, version 5.1.16
("help ulimit" for docs on the shell built-in...)

--
-v (Scott)

Re: The size of pipes

<eh6ghj-p7k.ln1@mail.home.palmen-it.de>

https://news.novabbs.org/devel/article-flat.php?id=6971&group=comp.unix.shell#6971
Path: rocksolid2!i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: felix@palmen-it.de (Felix Palmen)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes
Date: Sun, 23 Apr 2023 19:29:50 +0200
Organization: palmen-it.de
Lines: 19
Message-ID: <eh6ghj-p7k.ln1@mail.home.palmen-it.de>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de> <834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de> <slrnu4a5im.34b.t-usenet@ID-685.user.individual.de> <op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com> <oh2ghj-veh.ln1@mail.home.palmen-it.de> <u23mmi$3slm9$1@dont-email.me> <1n4ghj-rti.ln1@mail.home.palmen-it.de> <u23p59$3t549$1@dont-email.me>
Injection-Date: Sun, 23 Apr 2023 19:29:50 +0200
Injection-Info: dont-email.me; posting-host="b356e10f704f0fe7f2bc4230b5cdf758";
logging-data="4105124"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19nT9e75ChQTa3Oj3ys98K0"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (FreeBSD/13.2-RELEASE (amd64)) tinews.pl/1.1.61
Cancel-Lock: sha256:Do7wpYAUJvcu91jCz64NOQIi/O8Het+ojMMNLoEB2KA=
sha1:WO0oXiKVhW/6/Sy7JpNh8a8RKlU=
X-PGP-Sig: GnuPG-v2 From,Newsgroups,Subject,Date,Injection-Date,Message-ID
iNUEARYIAH0WIQRpNhPVW79IN7ISOsxUreAGmHnyMQUCZEVrDl8UgAAAAAAuAChp
c3N1ZXItZnByQG5vdGF0aW9ucy5vcGVucGdwLmZpZnRoaG9yc2VtYW4ubmV0Njkz
NjEzRDU1QkJGNDgzN0IyMTIzQUNDNTRBREUwMDY5ODc5RjIzMQAKCRBUreAGmHny
MfVKAQC1IVA+TPzcLxT5tYBWw0WcoL5WSt0vzonzcYLqLBYrwQEAypWCNFvRPIyX
2QmlZrYzS1Vm2qhxA1sj2yHO94zCRwc=
=OVnC
X-PGP-Hash: SHA256
X-PGP-Key: 693613D55BBF4837B2123ACC54ADE0069879F231
 by: Felix Palmen - Sun, 23 Apr 2023 17:29 UTC

* Janis Papanagnou <janis_papanagnou+ng@hotmail.com>:
> s/doing/finishing/

Agreed.

> It boils down to this: sort can _start_ sorting with less data
> (something like a pipe-full), it can _continue_ sorting as more
> parts of the data arrive, and to _finish_ sorting it naturally must
> have had all the data available.

All correct, but I really doubt the relevance of the parenthetical. The
size of the pipe will never be of much interest (except maybe for
performance), mostly because you can't seek on a pipe anyway.


Re: The size of pipes (Was: sort by multiple columns)

<u265ou$csqa$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=6972&group=comp.unix.shell#6972
Path: rocksolid2!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Mon, 24 Apr 2023 17:03:57 +0200
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <u265ou$csqa$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com>
<kalg1gF3o61U1@mid.individual.net> <T0w1M.321485$0dpc.50826@fx33.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 24 Apr 2023 15:03:58 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f09a3cd4cc3a6ecbcc1e6f56df694ecf";
logging-data="422730"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/jpeNpq/ad86SeHHI1g51p"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:PJPz92ZW26j+9YKkhV0QoNZrdjs=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <T0w1M.321485$0dpc.50826@fx33.iad>
 by: Janis Papanagnou - Mon, 24 Apr 2023 15:03 UTC

On 24.04.2023 16:05, vallor wrote:
>
> Could the actual pipe size perhaps be queried
> and set with "ulimit"?
>
> $ ulimit -a
> [...]
> pipe size (512 bytes, -p) 8
> [...]
>
> With: GNU bash, version 5.1.16
> ("help ulimit" for docs on the shell built-in...)

It's quite funny that every shell has its own formats; in bash you
have to do the math (8x512) while in ksh it's 4096. Other quantities
have different scaling, e.g. bytes vs. Kibytes. And some have units
not defined (in ulimit or ulimit --man), like "blocks".

# bash
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
file size (blocks, -f) unlimited

# ksh
pipe buffer size (bytes) (-p) 4096
message queue size (Kibytes) (-q) 800
file size (blocks) (-f) unlimited

And zsh's ulimit "doesn't know" pipe size?

Janis

Re: The size of pipes

<u23rb0$3tfnq$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=6973&group=comp.unix.shell#6973

Path: rocksolid2!i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes
Date: Sun, 23 Apr 2023 19:53:36 +0200
Organization: A noiseless patient Spider
Lines: 33
Message-ID: <u23rb0$3tfnq$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com>
<oh2ghj-veh.ln1@mail.home.palmen-it.de> <u23mmi$3slm9$1@dont-email.me>
<1n4ghj-rti.ln1@mail.home.palmen-it.de> <u23p59$3t549$1@dont-email.me>
<eh6ghj-p7k.ln1@mail.home.palmen-it.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 23 Apr 2023 17:53:37 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="179f52437eefbb5747dfb2f8e88f4c7c";
logging-data="4112122"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Rf8Vd2AUN0uOsdA2diU5s"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:gUuTYjWCM8CGCgzExKO/mbuMxEI=
In-Reply-To: <eh6ghj-p7k.ln1@mail.home.palmen-it.de>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Sun, 23 Apr 2023 17:53 UTC

On 23.04.2023 19:29, Felix Palmen wrote:
> * Janis Papanagnou <janis_papanagnou+ng@hotmail.com>:
>> s/doing/finishing/
>
> Agreed.
>
>> It boils down to this; sorting can _start_ sorting with fewer data
>> (something like a pipe-full), it can also _continue_ sorting with
>> more parts of data, and to _finish_ sorting it naturally must have
>> had all data available.
>
> All correct, but I really doubt the relevance of the parentheses.

It's here just to demonstrate a magnitude, no less, no more.

But I seem to recall - faint memories from 4 decades ago - that
I/O-buffer size (similar to pipe-buffer size) was part of the
rationale about why to use such values and how to dimension it
(for optimum processing speed, yes, for performance as you say
below).

> The
> size of the pipe will never be of much interest (except maybe for
> performance), mostly because you can't seek a pipe anyways.

Seeking on the pipe isn't necessary, since the pipe is just the
transfer medium, unstructured per se, with records likely even
cut off at the front or rear of a chunk (because it transmits
octets, not data records). You'll have the data transferred into
a structured in-memory representation anyway.

Janis

Re: sort by multiple columns

<u236p9$3q0r5$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=6974&group=comp.unix.shell#6974

Path: rocksolid2!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: sort by multiple columns
Date: Sun, 23 Apr 2023 14:02:49 +0200
Organization: A noiseless patient Spider
Lines: 27
Message-ID: <u236p9$3q0r5$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<83fs8sn1jc.fsf@helmutwaitzmann.news.arcor.de>
<slrnu471bk.34b.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 23 Apr 2023 12:02:50 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="179f52437eefbb5747dfb2f8e88f4c7c";
logging-data="3998565"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19MA/JAivSPaEy9U5OjWjrx"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:MZjiegKSyQh97CVyh03Sq6KHRxU=
In-Reply-To: <slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Sun, 23 Apr 2023 12:02 UTC

On 23.04.2023 13:28, Martin Τrautmann wrote:
> On Sun, 23 Apr 2023 03:33:47 +0200, Helmut Waitzmann wrote:
>>> If I want to pre-sort by 3 first, then sub-sort by column 2,
>>> that's fine. But when I pipe one sort to the other, the second
>>> sort will destroy the sort before. That's why I had my sort
>>> order in reversed order, using a pipe example.
>>
>> That won't help, either: A sorting pipe using (a standard)
>> "sort" won't solve the problem, because one cannot tell (a
>> standard) "sort" to do a sort on the given key option only. Each
>> sort in the pipe will be total (according to its sort criteria)
>> of its own.
>
> That was my problem - I expected that a pipe through several sorts would
> keep the order. I don't know why it doesn't.

Because sorting on one criterion generally doesn't impose any
restrictions on the other criteria, which is what allows sorting
to be implemented very efficiently. That's what stable sorting
is for: it makes provisions for how to order the set of records
with equal keys. With Unix's 'sort' implementation being able to
specify multiple keys there's of course less need to split the
sorting across several distinct processes connected by pipes.

Janis

Re: The size of pipes

<u23sob$3tmbl$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=6975&group=comp.unix.shell#6975

Path: rocksolid2!i2pn2.org!i2pn.org!news.nntp4.net!news.gegeweb.eu!gegeweb.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes
Date: Sun, 23 Apr 2023 20:17:47 +0200
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <u23sob$3tmbl$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com>
<oh2ghj-veh.ln1@mail.home.palmen-it.de> <u23mmi$3slm9$1@dont-email.me>
<1n4ghj-rti.ln1@mail.home.palmen-it.de>
<op.13u7oirpa3w0dxdave@hodgins.homeip.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 23 Apr 2023 18:17:47 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="179f52437eefbb5747dfb2f8e88f4c7c";
logging-data="4118901"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18gaYPqQnrJkhQMSnS4/tjq"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:p5bD2ec8TjxOZgzIZq884rGw9nc=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <op.13u7oirpa3w0dxdave@hodgins.homeip.net>
 by: Janis Papanagnou - Sun, 23 Apr 2023 18:17 UTC

On 23.04.2023 19:46, David W. Hodgins wrote:
>
> If I'm reading it right, it always uses temporary files doing a sort/merge.
> Given that it started in 1988, it's not surprising that it's designed to
> work in a low ram environment.

Some test-run[*] finished here...

The data created and fed into 'sort' is larger than my free RAM.

$ time seq 1000000000 -1 1 | sort -n | N=1 is-sorted
0

real 58m8.18s
user 54m18.16s
sys 1m49.34s

Janis

[*] 'is-sorted' is an awk script, and "0" means it's okay (=sorted),
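
The 'is-sorted' awk script itself is not shown in the thread. As a rough alternative sketch, assuming only a POSIX sort, the check can be left to sort itself: with -c it reads the input without sorting it and exits with status 0 only if the input is already in order. The helper name check_sorted and the small seq range are made up for illustration; they stand in for the script and the large run above.

```shell
# check_sorted is a hypothetical helper, not the 'is-sorted' script from the post.
# It prints 0 if stdin is in ascending numeric order, 1 otherwise.
check_sorted() {
    if sort -n -c 2>/dev/null; then
        echo 0
    else
        echo 1
    fi
}

seq 1000 -1 1 | sort -n | check_sorted    # input was sorted first
seq 1000 -1 1 | check_sorted              # descending input, not sorted
```

Unlike a full sort, the -c check needs no temporary files, since it only has to compare each line with the one before it.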

Re: The size of pipes (Was: sort by multiple columns)

<20230424094659.809@kylheku.com>

https://news.novabbs.org/devel/article-flat.php?id=6976&group=comp.unix.shell#6976

Path: rocksolid2!i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 864-117-4973@kylheku.com (Kaz Kylheku)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Mon, 24 Apr 2023 16:50:42 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 12
Message-ID: <20230424094659.809@kylheku.com>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net>
<u23fpe$2opsm$1@news.xmission.com> <kalg1gF3o61U1@mid.individual.net>
<op.13vd5llna3w0dxdave@hodgins.homeip.net>
Injection-Date: Mon, 24 Apr 2023 16:50:42 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f30ec808bf3b4fd8240c1ac55a3aef4a";
logging-data="453954"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/yX6rf4MsH5Torj17WDq/pdOkGjyLO0m8="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:Bo1JnKW4eViWGVbAsDc4em25Qr8=
 by: Kaz Kylheku - Mon, 24 Apr 2023 16:50 UTC

On 2023-04-23, David W. Hodgins <dwhodgins@nomail.afraid.org> wrote:
> On Sun, 23 Apr 2023 15:42:00 -0400, John-Paul Stewart <jpstewart@personalprojects.net> wrote:
>> So pipes on Linux aren't very large at all. I don't know how other Unix
>> systems compare.
>
> The pipe only has to store a minimum of one buffer of data. If the process

In fact, I suspect, a pipe doesn't have to store anything. It can be a
pure rendezvous. The write() call can block until the reader performs a
read(), or vice versa, at which time MIN(read_size, write_size) bytes
can be transferred directly between their respective buffers, that value
then being returned from the read and write.

Re: The size of pipes

<oipihj-e011.ln1@mail.home.palmen-it.de>

https://news.novabbs.org/devel/article-flat.php?id=6977&group=comp.unix.shell#6977

Path: rocksolid2!i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: felix@palmen-it.de (Felix Palmen)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes
Date: Mon, 24 Apr 2023 19:07:04 +0200
Organization: palmen-it.de
Lines: 17
Message-ID: <oipihj-e011.ln1@mail.home.palmen-it.de>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de> <834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de> <slrnu4a5im.34b.t-usenet@ID-685.user.individual.de> <op.13uwd4i8a3w0dxdave@hodgins.homeip.net> <u23fpe$2opsm$1@news.xmission.com> <kalg1gF3o61U1@mid.individual.net> <op.13vd5llna3w0dxdave@hodgins.homeip.net> <20230424094659.809@kylheku.com>
Injection-Date: Mon, 24 Apr 2023 19:07:04 +0200
Injection-Info: dont-email.me; posting-host="813d562a6de3e568548b7a3702f41ce9";
logging-data="460860"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+gmVFJFq8C39/42bUKJwDw"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (FreeBSD/13.2-RELEASE (amd64)) tinews.pl/1.1.61
Cancel-Lock: sha256:HyIRGJL3YSvNnrzUGT0PsF8rt1ldV3d7IJcvZtIsAI4=
sha1:A0ZBVTCiZkCSu997efa8wQkzlqY=
X-PGP-Sig: GnuPG-v2 From,Newsgroups,Subject,Date,Injection-Date,Message-ID
iNUEARYIAH0WIQRpNhPVW79IN7ISOsxUreAGmHnyMQUCZEa3OF8UgAAAAAAuAChp
c3N1ZXItZnByQG5vdGF0aW9ucy5vcGVucGdwLmZpZnRoaG9yc2VtYW4ubmV0Njkz
NjEzRDU1QkJGNDgzN0IyMTIzQUNDNTRBREUwMDY5ODc5RjIzMQAKCRBUreAGmHny
MUTZAQCCfnq6fT5ObKqRMl6SUqsFsYCgY1xW/BU680H5FpVvcAEAsAfpb2uMOiCP
J2fUWljPPSbXH1tVAWfT3C4WdavJZgQ=
=5M6p
X-PGP-Hash: SHA256
X-PGP-Key: 693613D55BBF4837B2123ACC54ADE0069879F231
 by: Felix Palmen - Mon, 24 Apr 2023 17:07 UTC

* Kaz Kylheku <864-117-4973@kylheku.com>:
> In fact, I suspect, a pipe doesn't have to store anything. It can be a
> pure rendezvous. The write() call can block until the reader performs a
> read(), or vice versa, at which time MIN(read_size, write_size) bytes
> can be transferred directly between their respective buffers, that value
> then being returned from the read and write.

Yes. IIRC, L4 uses some similar mechanism for IPC. It needs support from
the scheduler of course. And to make it most efficient, the size should
be agreed upon on both sides, so that won't work with typical pipe
semantics.

--
Dipl.-Inform. Felix Palmen <felix@palmen-it.de> ,.//..........
{web} http://palmen-it.de {jabber} [see email] ,//palmen-it.de
{pgp public key} http://palmen-it.de/pub.txt // """""""""""
{pgp fingerprint} 6936 13D5 5BBF 4837 B212 3ACC 54AD E006 9879 F231

Re: The size of pipes (Was: sort by multiple columns)

<6tukhj-bl1.ln1@ID-313840.user.individual.net>

https://news.novabbs.org/devel/article-flat.php?id=6980&group=comp.unix.shell#6980

Path: rocksolid2!i2pn.org!weretis.net!feeder8.news.weretis.net!lilly.ping.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: geoff@clare.See-My-Signature.invalid (Geoff Clare)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Tue, 25 Apr 2023 13:50:14 +0100
Lines: 44
Message-ID: <6tukhj-bl1.ln1@ID-313840.user.individual.net>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net>
<u23fpe$2opsm$1@news.xmission.com> <kalg1gF3o61U1@mid.individual.net>
<T0w1M.321485$0dpc.50826@fx33.iad> <u265ou$csqa$1@dont-email.me>
Reply-To: netnews@gclare.org.uk
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: individual.net Q1Blk8/dgq3evny+/hYrQwPboXr5l2Qs8tIPhNgwaQ3RM5VsYm
X-Orig-Path: ID-313840.user.individual.net!not-for-mail
Cancel-Lock: sha1:BLA9hFa4kZNbW263Qpm6iKuMTiE=
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
 by: Geoff Clare - Tue, 25 Apr 2023 12:50 UTC

Janis Papanagnou wrote:

> On 24.04.2023 16:05, vallor wrote:
>>
>> Could the actual pipe size perhaps be queried
>> and set with "ulimit"?
>>
>> $ ulimit -a
>> [...]
>> pipe size (512 bytes, -p) 8
>> [...]
>>
>> With: GNU bash, version 5.1.16
>> ("help ulimit" for docs on the shell built-in...)
>
> It's quite funny that every shell has its own formats; in bash you
> have to do the math (8x512) while in ksh it's 4096.

I believe the value ulimit is giving here is PIPE_BUF, not the
capacity of the pipe.

On my Linux system, much more than 4096 bytes can be written to
a pipe without anything being read from it:

$ dd if=/dev/zero | sleep 10
^C129+0 records in
128+0 records out
65536 bytes (66 kB, 64 KiB) copied, 2.04325 s, 32.1 kB/s

(I used Ctrl-C to send dd a SIGINT.)

$ ulimit -a | grep pipe
pipe size (512 bytes, -p) 8
$ getconf PIPE_BUF .
4096

In any case, on some systems "pipe capacity" is not a simple concept.
SVR4's STREAMS-based pipes have separate high-water and low-water
thresholds. (The writer blocks when high-water is reached but
doesn't unblock until enough has been read to take the level below
low-water.)

--
Geoff Clare <netnews@gclare.org.uk>

Re: The size of pipes (Was: sort by multiple columns)

<u28kkc$2rdpu$1@news.xmission.com>

https://news.novabbs.org/devel/article-flat.php?id=6981&group=comp.unix.shell#6981

Path: rocksolid2!i2pn.org!weretis.net!feeder6.news.weretis.net!xmission!nnrp.xmission!.POSTED.shell.xmission.com!not-for-mail
From: gazelle@shell.xmission.com (Kenny McCormack)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Tue, 25 Apr 2023 13:29:48 -0000 (UTC)
Organization: The official candy of the new Millennium
Message-ID: <u28kkc$2rdpu$1@news.xmission.com>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de> <T0w1M.321485$0dpc.50826@fx33.iad> <u265ou$csqa$1@dont-email.me> <6tukhj-bl1.ln1@ID-313840.user.individual.net>
Injection-Date: Tue, 25 Apr 2023 13:29:48 -0000 (UTC)
Injection-Info: news.xmission.com; posting-host="shell.xmission.com:166.70.8.4";
logging-data="2996030"; mail-complaints-to="abuse@xmission.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: gazelle@shell.xmission.com (Kenny McCormack)
 by: Kenny McCormack - Tue, 25 Apr 2023 13:29 UTC

In article <6tukhj-bl1.ln1@ID-313840.user.individual.net>,
Geoff Clare <netnews@gclare.org.uk> wrote:
....
>On my Linux system, much more than 4096 bytes can be written to
>a pipe without anything being read from it:
>
>$ dd if=/dev/zero | sleep 10
>^C129+0 records in
>128+0 records out
>65536 bytes (66 kB, 64 KiB) copied, 2.04325 s, 32.1 kB/s
>
>(I used Ctrl-C to send dd a SIGINT.)

Didn't somebody say upthread that the default limit on Linux is 64K?
So, kinda funny that you chose exactly 64K for your demonstration.

Anyway, you can (according to those same people) bump it up to 1M if
needed.
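
For reference, a small sketch of how these values could be inspected from the shell. It assumes a Linux system, where pipe(7) documents /proc/sys/fs/pipe-max-size as the upper limit an unprivileged process may request for a single pipe via fcntl(F_SETPIPE_SZ), with a documented default of 1048576 bytes. On other systems the file does not exist, and the helper (a made-up name) reports that instead.

```shell
# pipe_max_size is an illustrative helper, not a standard command.
pipe_max_size() {
    if [ -r /proc/sys/fs/pipe-max-size ]; then
        cat /proc/sys/fs/pipe-max-size
    else
        echo "unavailable"
    fi
}

pipe_max_size            # upper limit for one pipe's capacity (Linux only)
getconf PIPE_BUF /       # limit for atomic writes, not the pipe's capacity
```

Note that this only shows the limit; changing the capacity of a given pipe still has to be done by the process holding it.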

--
People who want to share their religious views with you
almost never want you to share yours with them. -- Dave Barry

Re: The size of pipes (Was: sort by multiple columns)

<op.13ypb4wwa3w0dxdave@hodgins.homeip.net>

https://news.novabbs.org/devel/article-flat.php?id=6982&group=comp.unix.shell#6982

Path: rocksolid2!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: dwhodgins@nomail.afraid.org (David W. Hodgins)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Tue, 25 Apr 2023 11:01:06 -0400
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <op.13ypb4wwa3w0dxdave@hodgins.homeip.net>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<T0w1M.321485$0dpc.50826@fx33.iad> <u265ou$csqa$1@dont-email.me>
<6tukhj-bl1.ln1@ID-313840.user.individual.net>
<u28kkc$2rdpu$1@news.xmission.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes
Content-Transfer-Encoding: 8bit
Injection-Info: dont-email.me; posting-host="86d75af1af806a64f4c895bcc54902b8";
logging-data="975910"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/4FFRKtdyOcdNbfl+xY8X57SCfrQF+7+4="
User-Agent: Opera Mail/12.16 (Linux)
Cancel-Lock: sha1:yYiraMXMhBYLQCaKXQrTwSLAiWg=
 by: David W. Hodgins - Tue, 25 Apr 2023 15:01 UTC

On Tue, 25 Apr 2023 09:29:48 -0400, Kenny McCormack <gazelle@shell.xmission.com> wrote:

> In article <6tukhj-bl1.ln1@ID-313840.user.individual.net>,
> Geoff Clare <netnews@gclare.org.uk> wrote:
> ...
>> On my Linux system, much more than 4096 bytes can be written to
>> a pipe without anything being read from it:
>>
>> $ dd if=/dev/zero | sleep 10
>> ^C129+0 records in
>> 128+0 records out
>> 65536 bytes (66 kB, 64 KiB) copied, 2.04325 s, 32.1 kB/s
>>
>> (I used Ctrl-C to send dd a SIGINT.)
>
> Didn't somebody say upthread that the default limit on Linux is 64K?
> So, kinda funny that you chose exactly 64K for your demonstration.
>
> Anyway, you can (according to those same people) bump it up to 1M if
> needed.

It stopped after filling the output buffer, not the pipe. That data was still
waiting to be written to the pipe when the dd command was terminated.

Regards, Dave Hodgins

Re: The size of pipes (Was: sort by multiple columns)

<vghnhj-59u.ln1@ID-313840.user.individual.net>

https://news.novabbs.org/devel/article-flat.php?id=6991&group=comp.unix.shell#6991

Path: rocksolid2!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: geoff@clare.See-My-Signature.invalid (Geoff Clare)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Wed, 26 Apr 2023 13:20:15 +0100
Lines: 48
Message-ID: <vghnhj-59u.ln1@ID-313840.user.individual.net>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<T0w1M.321485$0dpc.50826@fx33.iad> <u265ou$csqa$1@dont-email.me>
<6tukhj-bl1.ln1@ID-313840.user.individual.net>
<u28kkc$2rdpu$1@news.xmission.com>
<op.13ypb4wwa3w0dxdave@hodgins.homeip.net>
Reply-To: netnews@gclare.org.uk
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: individual.net Liz/IcQdCIV6a0UGC0mGPQvROZF8oHUebHo6TFh14c3OVvwf8q
X-Orig-Path: ID-313840.user.individual.net!not-for-mail
Cancel-Lock: sha1:GYIMranAMoEOpoZR27I5dt/0V4k=
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
 by: Geoff Clare - Wed, 26 Apr 2023 12:20 UTC

David W. Hodgins wrote:

> On Tue, 25 Apr 2023 09:29:48 -0400, Kenny McCormack <gazelle@shell.xmission.com> wrote:
>
>> In article <6tukhj-bl1.ln1@ID-313840.user.individual.net>,
>> Geoff Clare <netnews@gclare.org.uk> wrote:
>> ...
>>> On my Linux system, much more than 4096 bytes can be written to
>>> a pipe without anything being read from it:
>>>
>>> $ dd if=/dev/zero | sleep 10
>>> ^C129+0 records in
>>> 128+0 records out
>>> 65536 bytes (66 kB, 64 KiB) copied, 2.04325 s, 32.1 kB/s
>>>
>>> (I used Ctrl-C to send dd a SIGINT.)
>>
>> Didn't somebody say upthread that the default limit on Linux is 64K?
>> So, kinda funny that you chose exactly 64K for your demonstration.

I didn't actively choose 64K. I haven't ever changed the pipe size on
a Linux system, so the size used was whatever is the default.

>> Anyway, you can (according to those same people) bump it up to 1M if
>> needed.
>
> It stopped after filling the output buffer, not the pipe. That data was still
> waiting to be written to the pipe when the dd command was terminated.

Only one block (of 512 bytes) was waiting to be written. A feature
of dd is that it reads and writes exactly the block sizes you tell it
to (or 512 bytes by default). The dd output:

128+0 records out

means it had successfully written 128 blocks (of 512 bytes) to the pipe
when it exited. The "129+0 records in" is what shows it had read one
extra block that was waiting to be written.

If I tell dd to read and write one byte at a time, it does exactly that:

$ dd bs=1 if=/dev/zero | sleep 10
^C65537+0 records in
65536+0 records out
65536 bytes (66 kB, 64 KiB) copied, 4.88295 s, 13.4 kB/s

--
Geoff Clare <netnews@gclare.org.uk>

Re: The size of pipes (Was: sort by multiple columns)

<slrnu4ietc.pp5.whynot@orphan.zombinet>

https://news.novabbs.org/devel/article-flat.php?id=6996&group=comp.unix.shell#6996

Path: rocksolid2!news.neodome.net!news.mixmin.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: whynot@pozharski.name (Eric Pozharski)
Newsgroups: comp.unix.shell
Subject: Re: The size of pipes (Was: sort by multiple columns)
Date: Wed, 26 Apr 2023 14:56:44 +0000
Organization: A noiseless patient Spider
Lines: 13
Message-ID: <slrnu4ietc.pp5.whynot@orphan.zombinet>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net>
<u23fpe$2opsm$1@news.xmission.com> <kalg1gF3o61U1@mid.individual.net>
<T0w1M.321485$0dpc.50826@fx33.iad> <u265ou$csqa$1@dont-email.me>
Injection-Info: dont-email.me; posting-host="d08423d891d0f0980dce7c06df570998";
logging-data="1563378"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/5+YaEfwhSm73S03Oc8QXs"
User-Agent: slrn/pre1.0.0-18 (Linux)
Cancel-Lock: sha1:Sjf+ymy+pAu2zlV5bv7N0o+qJLE=
 by: Eric Pozharski - Wed, 26 Apr 2023 14:56 UTC

with <u265ou$csqa$1@dont-email.me> Janis Papanagnou wrote:
> On 24.04.2023 16:05, vallor wrote:

>> Could the actual pipe size perhaps be queried and set with "ulimit"?
*SKIP*
> And zsh's ulimit "doesn't know" pipe size?

Funny thing, looking through /usr/include/**/resource.h suggests that size
of pipe has nothing to do with setrlimit(2) or ulimit(3). Weird.

--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom

Re: sort by multiple columns

<slrnu4pqok.dao.t-usenet@ID-685.user.individual.de>

https://news.novabbs.org/devel/article-flat.php?id=7018&group=comp.unix.shell#7018

Path: rocksolid2!news.neodome.net!news.mixmin.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: t-usenet@gmx.net (Martin Τrautmann)
Newsgroups: comp.unix.shell
Subject: Re: sort by multiple columns
Date: Sat, 29 Apr 2023 12:01:55 +0200
Organization: slrn user
Lines: 28
Message-ID: <slrnu4pqok.dao.t-usenet@ID-685.user.individual.de>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<83fs8sn1jc.fsf@helmutwaitzmann.news.arcor.de>
<slrnu471bk.34b.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net>
Reply-To: traut@gmx.de
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: dont-email.me; posting-host="c5a98ca0132f983d27c79bc2df8ca67d";
logging-data="3062497"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+tnoz3eBqOQn0qgzP/AZ/W"
User-Agent: slrn/1.0.3 (Darwin)
Cancel-Lock: sha1:JZy3wJfQMPufdqbMO0k9dYkhdJQ=
X-No-Archive: Yes
 by: Martin Τrautmann - Sat, 29 Apr 2023 10:01 UTC

On Sun, 23 Apr 2023 09:43:06 -0400, David W. Hodgins wrote:
> On Sun, 23 Apr 2023 07:28:22 -0400, Martin Τrautmann <t-usenet@gmx.net> wrote:
>> That was my problem - I expected that a pipe through several sorts would
>> keep the order. I don't know why it doesn't.
>
> It may be easier to understand if you use temporary files instead of pipes.
>
> Sorting the input file by column 4, numerical creating a first temporary file.
> Sort the first temporary file by column 2 creating a second temporary file.
> Sort the second temporary file by column 3 creating the output.
>
> The last sort doesn't know that the prior two sorts have been done. It just
> looks at the file it's given and sorts it by column 3.
>
> Using a pipe just takes the output of the first and second sort and uses it
> directly as input for the next sort. All the pipe does is eliminate the
> need for a temporary file.

But if I sort by one column only, then through the pipe by another
column only, the second sort SHOULD respect the previous sort.
Unfortunately, I feel it doesn't.

> Keep in mind. When sorting a file, the last line in the input may end up becoming
> the first line in the output. The sort can not write anything to the pipe or
> output file until it's sorted the entire input. With a pipe, the temporary
> file is in ram rather than being a named file on disk.

So the sort via a file actually should work the same as via the pipe?
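
One way to check this is to run the same two sort stages once through a temporary file and once through a pipe, then compare the results. The following is only a sketch: the sample data, the sort keys and the file names are made up for illustration and are not the data from this thread.

```shell
# Compare a two-stage sort done via a temporary file with the same stages via a pipe.
dir=$(mktemp -d)
printf '%s\n' 'b 2' 'a 2' 'b 1' 'a 1' > "$dir/in"

sort -k 2,2n "$dir/in" > "$dir/stage1"         # first sort, result kept in a file
sort -k 1,1 "$dir/stage1" > "$dir/via_file"    # second sort reads that file

sort -k 2,2n "$dir/in" | sort -k 1,1 > "$dir/via_pipe"   # same stages, no file

if cmp -s "$dir/via_file" "$dir/via_pipe"; then
    same=yes
else
    same=no
fi
echo "identical output: $same"
rm -r "$dir"
```

The second sort receives the same bytes either way, so the two results should not differ; whether the earlier ordering survives depends on the sort options, not on the pipe.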

Re: sort by multiple columns

<slrnu4pqut.dao.t-usenet@ID-685.user.individual.de>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=7019&group=comp.unix.shell#7019

  copy link   Newsgroups: comp.unix.shell
Path: rocksolid2!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: t-usenet@gmx.net (Martin Τrautmann)
Newsgroups: comp.unix.shell
Subject: Re: sort by multiple columns
Date: Sat, 29 Apr 2023 12:05:17 +0200
Organization: slrn user
Lines: 41
Message-ID: <slrnu4pqut.dao.t-usenet@ID-685.user.individual.de>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<83fs8sn1jc.fsf@helmutwaitzmann.news.arcor.de>
<slrnu471bk.34b.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<83o7nel6kp.fsf@helmutwaitzmann.news.arcor.de>
<83jzy2l4tb.fsf@helmutwaitzmann.news.arcor.de>
Reply-To: traut@gmx.de
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: dont-email.me; posting-host="c5a98ca0132f983d27c79bc2df8ca67d";
logging-data="3062497"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+uUUYocvikRjIh1/l434gt"
User-Agent: slrn/1.0.3 (Darwin)
Cancel-Lock: sha1:J+0DLNfH1Jg5zSfyxXhGHNuBY+8=
X-No-Archive: Yes
 by: Martin Τrautmann - Sat, 29 Apr 2023 10:05 UTC

On Sun, 23 Apr 2023 22:30:24 +0200, Helmut Waitzmann wrote:
> Helmut Waitzmann <nn.throttle@xoxy.net>:
>> Look at these sample lines:
>>
>>
>> 1;0
>> 1;1
>> 1;2
>> 0;0
>> 0;1
>> 0;2
>> 2;0
>> 2;1
>> 2;2
>>
>>
>> To have this sequence of lines sorted in such a way that the
>> first field is sorted in ascending numeric order while the
>> second is sorted in descending numeric order,
>
> I'm sorry, that is a quite misleading description.  What I wanted
> to say is that the sequence of lines should be sorted to look
> like
>
> 0;2
> 0;1
> 0;0
> 1;2
> 1;1
> 1;0
> 2;2
> 2;1
> 2;0
>
> and to achieve this…
>
>> one could specify the two sort criteria at once:
>>
>> sort -t ';' -k 1nb,1 -k 2nr,2

Would you achieve this via a pipe as well?
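
A possible pipe variant, as a sketch: it assumes a sort that supports the -s (stable) option, which GNU and BSD sort provide but POSIX does not require. With stable sorts the least significant key is sorted first and the most significant key last. The function names are made up for illustration; the sample lines are the ones quoted above.

```shell
# The sample lines from the quoted post.
sample() { printf '%s\n' '1;0' '1;1' '1;2' '0;0' '0;1' '0;2' '2;0' '2;1' '2;2'; }

# One sort, two keys (the command from the quoted post).
one_sort() { sample | sort -t ';' -k 1nb,1 -k 2nr,2; }

# Two stable sorts in a pipe: column 2 descending first, then column 1 ascending.
two_sorts() { sample | sort -s -t ';' -k 2,2nr | sort -s -t ';' -k 1,1n; }

one_sort
echo ---
two_sorts
```

Without -s the second sort would be free to reorder lines that share the same first column, which is why the single command with two keys is usually the simpler choice.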

Re: sort by multiple columns

<slrnu4prll.dao.t-usenet@ID-685.user.individual.de>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=7020&group=comp.unix.shell#7020

  copy link   Newsgroups: comp.unix.shell
Path: rocksolid2!news.neodome.net!news.mixmin.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: t-usenet@gmx.net (Martin Τrautmann)
Newsgroups: comp.unix.shell
Subject: Re: sort by multiple columns
Date: Sat, 29 Apr 2023 12:17:24 +0200
Organization: slrn user
Lines: 68
Message-ID: <slrnu4prll.dao.t-usenet@ID-685.user.individual.de>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<83fs8sn1jc.fsf@helmutwaitzmann.news.arcor.de>
<slrnu471bk.34b.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<83o7nel6kp.fsf@helmutwaitzmann.news.arcor.de>
<83jzy2l4tb.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4pqut.dao.t-usenet@ID-685.user.individual.de>
Reply-To: traut@gmx.de
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: dont-email.me; posting-host="c5a98ca0132f983d27c79bc2df8ca67d";
logging-data="3062497"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19c3WuuGD57EW+q1a5WUqAn"
User-Agent: slrn/1.0.3 (Darwin)
Cancel-Lock: sha1:1IIT+wpnPGMKGm0BZp6fCDR4J8Y=
X-No-Archive: Yes
 by: Martin Τrautmann - Sat, 29 Apr 2023 10:17 UTC

On Sat, 29 Apr 2023 12:05:17 +0200, Martin Τrautmann wrote:
> On Sun, 23 Apr 2023 22:30:24 +0200, Helmut Waitzmann wrote:
>> Helmut Waitzmann <nn.throttle@xoxy.net>:
>>> Look at these sample lines:
>>>
>>>
>>> 1;0
>>> 1;1
>>> 1;2
>>> 0;0
>>> 0;1
>>> 0;2
>>> 2;0
>>> 2;1
>>> 2;2
>>>
>>>
>>> To have this sequence of lines sorted in such a way that the
>>> first field is sorted in ascending numeric order while the
>>> second is sorted in descending numeric order,
>>
>> I'm sorry, that is quite a misleading description.  What I wanted
>> to say is that the sequence of lines should be sorted to look
>> like
>>
>> 0;2
>> 0;1
>> 0;0
>> 1;2
>> 1;1
>> 1;0
>> 2;2
>> 2;1
>> 2;0
>>
>> and to achieve this…
>>
>>> one could specify the two sort criteria at once:
>>>
>>> sort -t ';' -k 1nb,1 -k 2nr,2
>
> Would you achieve this via a pipe as well?

When I sort by column 2 only, I end up with
0;2
1;2
2;2
0;1
1;1
2;1
0;0
1;0
2;0

Why is that? I would expect
1;2
0;2
2;2
1;1
0;1
2;1
1;0
0;0
2;0

So why does it re-sort by the first column as well? Because it does, both
a pipe and a second sort from a temporary file still fail: they
also ignore the earlier sort of the other column.
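
A likely explanation, sketched under assumptions: when the key compares equal for two lines, sort does not leave them in input order. It breaks the tie with a final comparison of the whole line, which is why 0;2 ends up before 1;2. The -s option (assumed available here; GNU and BSD sort have it) turns that last comparison off. The function names below are illustrative only.

```sh
# The sample lines from the quoted post, in their original order.
sample() {
  printf '%s\n' '1;0' '1;1' '1;2' '0;0' '0;1' '0;2' '2;0' '2;1' '2;2'
}

# Key on field 2 only. Lines with the same field 2 are ordered by
# a last-resort comparison of the entire line, so the first column
# appears to be sorted as well.
by_col2() {
  sample | sort -t ';' -k 2nr,2
}

# Same key, but stable: lines with the same field 2 stay in the
# order in which they were read.
by_col2_stable() {
  sample | sort -s -t ';' -k 2nr,2
}
```

With the sample input, by_col2 should reproduce the first listing above (0;2, 1;2, 2;2, ...) and by_col2_stable the expected one (1;2, 0;2, 2;2, ...).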

Re: sort by multiple columns

<u2itds$2tv9u$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=7021&group=comp.unix.shell#7021

Path: rocksolid2!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris@mshome.net (Chris Elvidge)
Newsgroups: comp.unix.shell
Subject: Re: sort by multiple columns
Date: Sat, 29 Apr 2023 12:01:14 +0100
Organization: A noiseless patient Spider
Lines: 39
Message-ID: <u2itds$2tv9u$1@dont-email.me>
References: <slrnu3v5vd.m2.t-usenet@ID-685.user.individual.de>
<83fs8sn1jc.fsf@helmutwaitzmann.news.arcor.de>
<slrnu471bk.34b.t-usenet@ID-685.user.individual.de>
<834jp7mlfo.fsf@helmutwaitzmann.news.arcor.de>
<slrnu4a5im.34b.t-usenet@ID-685.user.individual.de>
<op.13uwd4i8a3w0dxdave@hodgins.homeip.net>
<slrnu4pqok.dao.t-usenet@ID-685.user.individual.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 29 Apr 2023 11:01:16 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="245db334b00decaa3cde205015d1b991";
logging-data="3079486"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Vn2tMLisD1bhS/czmMHkSRO9sI51huQs="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
Thunderbird/52.2.1 Lightning/5.4
Cancel-Lock: sha1:2pED8WXHuJd9KefhbSiowe9ZdQg=
Content-Language: en-GB
In-Reply-To: <slrnu4pqok.dao.t-usenet@ID-685.user.individual.de>
 by: Chris Elvidge - Sat, 29 Apr 2023 11:01 UTC

On 29/04/2023 11:01, Martin Τrautmann wrote:
> On Sun, 23 Apr 2023 09:43:06 -0400, David W. Hodgins wrote:
>> On Sun, 23 Apr 2023 07:28:22 -0400, Martin Τrautmann <t-usenet@gmx.net> wrote:
>>> That was my problem - I expected that a pipe through several sorts would
>>> keep the order. I don't know why it doesn't.
>>
>> It may be easier to understand if you use temporary files instead of pipes.
>>
>> Sort the input file by column 4, numerically, creating a first temporary file.
>> Sort the first temporary file by column 2 creating a second temporary file.
>> Sort the second temporary file by column 3 creating the output.
>>
>> The last sort doesn't know that the prior two sorts have been done. It just
>> looks at the file it's given and sorts it by column 3.
>>
>> Using a pipe just takes the output of the first and second sort and uses it
>> directly as input for the next sort. All the pipe does is eliminate the
>> need for a temporary file.
>
> But if I sort by one column only, then through the pipe by another
> column only, the second sort SHOULD respect the previous sort.
> Unfortunately, I feel it doesn't.

Of course it doesn't. How does the second sort know that the first sort
even happened?

>
>> Keep in mind: when sorting a file, the last line in the input may end up becoming
>> the first line in the output. The sort cannot write anything to the pipe or
>> output file until it's sorted the entire input. With a pipe, the temporary
>> file is in RAM rather than being a named file on disk.
>
> So the sort via a file actually should work the same as via the pipe?
>
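
To check the file-versus-pipe question, here is a small sketch that runs the same two sorts once through a pipe and once through a temporary file. It assumes mktemp is available, and the function names are illustrative. The sample data is the nine-line example from elsewhere in the thread.

```sh
# Nine sample lines with ';' as the field separator.
sample() {
  printf '%s\n' '1;0' '1;1' '1;2' '0;0' '0;1' '0;2' '2;0' '2;1' '2;2'
}

# Second sort reads the first sort's output through a pipe.
via_pipe() {
  sample | sort -t ';' -k 2nr,2 | sort -t ';' -k 1n,1
}

# Second sort reads the first sort's output from a temporary file.
via_tempfile() {
  tmp=$(mktemp) || return 1
  sample | sort -t ';' -k 2nr,2 > "$tmp"
  sort -t ';' -k 1n,1 "$tmp"
  rm -f "$tmp"
}
```

Both functions should print identical output, which suggests the pipe is not what loses the earlier order. In both cases the second sort, run without -s, orders lines with an equal first field by comparing the whole line, so the result starts 0;0, 0;1, 0;2.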

--
Chris Elvidge
England

