prepending a counter for number of lines that match the first field

<cac3bba2-3863-4ae3-a374-08f96ed9f618n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=7016&group=comp.unix.shell#7016

X-Received: by 2002:a05:6214:1764:b0:5e6:594d:fbb7 with SMTP id et4-20020a056214176400b005e6594dfbb7mr1156145qvb.4.1682749045546;
Fri, 28 Apr 2023 23:17:25 -0700 (PDT)
X-Received: by 2002:aca:c18b:0:b0:38e:bdb7:3e8f with SMTP id
r133-20020acac18b000000b0038ebdb73e8fmr1803319oif.6.1682749045208; Fri, 28
Apr 2023 23:17:25 -0700 (PDT)
Path: rocksolid2!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.unix.shell
Date: Fri, 28 Apr 2023 23:17:24 -0700 (PDT)
Injection-Info: google-groups.googlegroups.com; posting-host=108.161.114.64; posting-account=Dnu1bQoAAAAf4dL0J32fQOwTjXgejjB-
NNTP-Posting-Host: 108.161.114.64
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <cac3bba2-3863-4ae3-a374-08f96ed9f618n@googlegroups.com>
Subject: prepending a counter for number of lines that match the first field
From: lloyd.houghton@gmail.com (Lloyd Houghton)
Injection-Date: Sat, 29 Apr 2023 06:17:25 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: Lloyd Houghton - Sat, 29 Apr 2023 06:17 UTC

Hi, I had a script for this purpose from about 30 years ago, which was the last time I needed it. It doesn't seem to work and I'm very rusty, so I wonder if someone could offer a solution.

I have a file where each line has two fields. The first field is sometimes identical between one line and the next. I need to prepend a new field on every line to say how many lines (including the current one) share the same first field. We can assume the file is sorted. For example, if the file is:

abc 647389
abc 12354
abd 7563
cdf 152384
cdf 8761523
cdf 1253
ghj 78654
klm 12634
pqr 9864

then when I run the script, the output should be:

2 abc 647389
2 abc 12354
1 abd 7563
3 cdf 152384
3 cdf 8761523
3 cdf 1253
1 ghj 78654
1 klm 12634
1 pqr 9864

The script that I used to do this (as best I can guess from looking in the directory with my data) looks like this:

sort -o tempid tempid
awk 'NR>1 && $1 != key { for (i=0; ++i<n) print n, line[i]; n=0 }
{ key=$1; line[++n]=$0 }
END { for (i=0; ++i<n) print n, line[i] }' tempid >tempid2

I can't say that I understand the loop specification format, or even the overall behaviour (someone must have helped me), but this script was in the directory and appears to be related to the task...

Could anyone help me to fix this?

Many many thanks.

Re: prepending a counter for number of lines that match the first field

<u2ildq$2spgf$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=7017&group=comp.unix.shell#7017

Path: rocksolid2!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: prepending a counter for number of lines that match the first field
Date: Sat, 29 Apr 2023 10:44:41 +0200
Organization: A noiseless patient Spider
Lines: 61
Message-ID: <u2ildq$2spgf$1@dont-email.me>
References: <cac3bba2-3863-4ae3-a374-08f96ed9f618n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 29 Apr 2023 08:44:42 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="b0462b0795573b3f16a4b07c3e861212";
logging-data="3040783"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+rLkNVSxLo0T3S2j+RZhhm"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:3D6UxVfVqvJ2nL7/WNcMw4mSQI4=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <cac3bba2-3863-4ae3-a374-08f96ed9f618n@googlegroups.com>
 by: Janis Papanagnou - Sat, 29 Apr 2023 08:44 UTC

On 29.04.2023 08:17, Lloyd Houghton wrote:
> Hi, I had a script for this purpose from about 30 years ago, which was the last time I needed it. It doesn't seem to work and I'm very rusty, so I wonder if someone could offer a solution.
>
> I have a file where each line has two fields. The first field is sometimes identical between one line and the next. I need to prepend a new field on every line to say how many lines (including the current one) share the same first field. We can assume the file is sorted. For example, if the file is:
>
> abc 647389
> abc 12354
> abd 7563
> cdf 152384
> cdf 8761523
> cdf 1253
> ghj 78654
> klm 12634
> pqr 9864
>
> then when I run the script, the output should be:
>
> 2 abc 647389
> 2 abc 12354
> 1 abd 7563
> 3 cdf 152384
> 3 cdf 8761523
> 3 cdf 1253
> 1 ghj 78654
> 1 klm 12634
> 1 pqr 9864
>
> The script that I used to do this (as best I can guess from looking in the directory with my data) looks like this:
>
> sort -o tempid tempid
> awk 'NR>1 && $1 != key { for (i=0; ++i<n) print n, line[i]; n=0 }
> { key=$1; line[++n]=$0 }
> END { for (i=0; ++i<n) print n, line[i] }' tempid >tempid2
>

This script has obvious syntactical errors.

> I can't say that I understand the loop specification format, or even the overall behaviour (someone must have helped me), but this script was in the directory and appears to be related to the task...

Each output line needs information that can only be determined from
later lines, so you need to (temporarily) store the contents of the
lines, as you seem to have tried.
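
For illustration, the buffering idea can be made to work in a single pass. The following is only a minimal sketch, assuming the original loops were meant to run i from 1 to n: the posted `for (i=0; ++i<n)` lacks the third clause of a `for` header, and its condition would also stop one line short of the group. The `emit_group` function name, the `workdir` scratch directory and the sample file are assumptions made for this sketch, not part of the original script.

```shell
# Scratch directory so the sketch does not touch real files (assumption).
workdir=$(mktemp -d)

# Sample data from the original post, already sorted on the first field.
cat > "$workdir/tempid" <<'EOF'
abc 647389
abc 12354
abd 7563
cdf 152384
cdf 8761523
cdf 1253
ghj 78654
klm 12634
pqr 9864
EOF

awk '
# Print every buffered line of the current group, prefixed by the group size.
function emit_group(   i) {        # i is a local variable
    for (i = 1; i <= n; i++) print n, line[i]
    n = 0                          # start a new, empty group
}
NR > 1 && $1 != key { emit_group() }   # first field changed: flush old group
{ key = $1; line[++n] = $0 }           # remember key and buffer the line
END { emit_group() }                   # flush the last group
' "$workdir/tempid" > "$workdir/tempid2"

cat "$workdir/tempid2"
```

Unlike a two-pass approach, this variant depends on the input being sorted (or at least grouped) on the first field, but it reads the data only once and so could also be fed from a pipe.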

>
> Could anyone help me to fix this?

No, because there's a much simpler and more obvious solution: two-pass
processing of your (sorted) data.

awk '
NR==FNR { n[$1]++ ; next }
{ print n[$1], $0 }
' tempid tempid >tempid2
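
To show how the two passes fit together, here is the same command run against the sample data from the original post. This is a sketch for illustration only; the `workdir` scratch directory is an assumption added for the demonstration, and the file names follow the thread.

```shell
# Scratch directory so the demo does not touch real files (assumption).
workdir=$(mktemp -d)

# Sample data from the original post.
cat > "$workdir/tempid" <<'EOF'
abc 647389
abc 12354
abd 7563
cdf 152384
cdf 8761523
cdf 1253
ghj 78654
klm 12634
pqr 9864
EOF

# Pass 1: NR==FNR holds only while awk reads the first file argument, so
#         this rule counts how often each first field occurs; "next"
#         keeps the second rule from running during this pass.
# Pass 2: the same file is read again as the second argument, and each
#         line is printed with the count stored for its first field.
awk '
NR==FNR { n[$1]++ ; next }
{ print n[$1], $0 }
' "$workdir/tempid" "$workdir/tempid" > "$workdir/tempid2"

cat "$workdir/tempid2"
```

Because the counts are keyed on the first field, the result does not depend on equal keys being adjacent. The price is that the file is read twice, so the input has to be a regular file rather than a pipe.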

Janis

>
> Many many thanks.
>

Re: prepending a counter for number of lines that match the first field

<750b9820-d5a9-47eb-abd7-51fac05ff237n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=7029&group=comp.unix.shell#7029

X-Received: by 2002:a05:622a:56:b0:3f1:fb02:8331 with SMTP id y22-20020a05622a005600b003f1fb028331mr3207841qtw.9.1682805640212;
Sat, 29 Apr 2023 15:00:40 -0700 (PDT)
X-Received: by 2002:a9d:7347:0:b0:6a5:d8ff:a846 with SMTP id
l7-20020a9d7347000000b006a5d8ffa846mr2352500otk.7.1682805639848; Sat, 29 Apr
2023 15:00:39 -0700 (PDT)
Path: rocksolid2!i2pn.org!weretis.net!feeder6.news.weretis.net!1.us.feeder.erje.net!feeder.erje.net!usenet.blueworldhosting.com!diablo2.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.unix.shell
Date: Sat, 29 Apr 2023 15:00:39 -0700 (PDT)
In-Reply-To: <u2ildq$2spgf$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=108.161.114.64; posting-account=Dnu1bQoAAAAf4dL0J32fQOwTjXgejjB-
NNTP-Posting-Host: 108.161.114.64
References: <cac3bba2-3863-4ae3-a374-08f96ed9f618n@googlegroups.com> <u2ildq$2spgf$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <750b9820-d5a9-47eb-abd7-51fac05ff237n@googlegroups.com>
Subject: Re: prepending a counter for number of lines that match the first field
From: lloyd.houghton@gmail.com (Lloyd Houghton)
Injection-Date: Sat, 29 Apr 2023 22:00:40 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 1800
 by: Lloyd Houghton - Sat, 29 Apr 2023 22:00 UTC

Thank you very much, Janis; this has solved my problem.

I remember your name from helping me in this same forum many years ago with a shell script. For a hobby, I end up needing such scripts no more than 2 or 3 times a decade, and I'm grateful to people like you who help others with problems that must seem tediously obvious to you.

regards - Lloyd

On Saturday, April 29, 2023 at 4:44:48 AM UTC-4, Janis Papanagnou wrote:

> awk '
> NR==FNR { n[$1]++ ; next }
> { print n[$1], $0 }
> ' tempid tempid >tempid2
>

Re: prepending a counter for number of lines that match the first field

<u2k5st$354t9$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=7030&group=comp.unix.shell#7030

Path: rocksolid2!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: prepending a counter for number of lines that match the first field
Date: Sun, 30 Apr 2023 00:31:56 +0200
Organization: A noiseless patient Spider
Lines: 23
Message-ID: <u2k5st$354t9$1@dont-email.me>
References: <cac3bba2-3863-4ae3-a374-08f96ed9f618n@googlegroups.com>
<u2ildq$2spgf$1@dont-email.me>
<750b9820-d5a9-47eb-abd7-51fac05ff237n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 29 Apr 2023 22:31:57 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3756683e27024b8843c1056ed14a4eb5";
logging-data="3314601"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18PSNvPkYjmKrAocO5bsLCG"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:Ns29bfhNV/AED/1u0GMO+bVnD7M=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <750b9820-d5a9-47eb-abd7-51fac05ff237n@googlegroups.com>
 by: Janis Papanagnou - Sat, 29 Apr 2023 22:31 UTC

Thanks for your feedback. Glad my suggestion helped. (It's not tedious,
don't worry.)

Janis

On 30.04.2023 00:00, Lloyd Houghton wrote:
> Thank you very much, Janis; this has solved my problem.
>
> I remember your name from helping me in this same forum many years
> ago with a shell script. For a hobby, I end up needing such scripts
> no more than 2 or 3 times a decade, and I'm grateful to people like
> you who help others with problems that must seem tediously obvious
> to you.
>
> regards - Lloyd
>
> On Saturday, April 29, 2023 at 4:44:48 AM UTC-4, Janis Papanagnou
> wrote:
>
>> awk '
>> NR==FNR { n[$1]++ ; next }
>> { print n[$1], $0 }
>> ' tempid tempid >tempid2
>>
