
Compacting CNFS buffers

<20240504210746.79036ffe@wibble.sysadmininc.com>

Path: i2pn2.org!rocksolid2!i2pn.org!weretis.net!feeder9.news.weretis.net!newsfeed.endofthelinebbs.com!.POSTED.47.186.30.161!not-for-mail
From: sysop@endofthelinebbs.com (Nigel Reed)
Newsgroups: news.software.nntp
Subject: Compacting CNFS buffers
Date: Sat, 4 May 2024 21:07:46 -0500
Organization: End Of The Line BBS
Message-ID: <20240504210746.79036ffe@wibble.sysadmininc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: www.sysadmininc.com; posting-host="47.186.30.161"; logging-data="2045923"; mail-complaints-to="abuse@endofthelinebbs.com"
X-Newsreader: Claws Mail 4.2.0git6 (GTK 3.24.33; x86_64-pc-linux-gnu)
 by: Nigel Reed - Sun, 5 May 2024 02:07 UTC

Has anyone investigated the feasibility of compacting or compressing
the cnfs buffer files?

Here's a couple of scenarios to consider, keeping in mind that
generally, articles are not expired.

1. You are sent a bunch of articles but discover you've left some
binary newsgroups in your active file. You put these groups in your
expire list and rmgroup them, but you're left with a lot of empty
space, never to be used again unless the buffer recycles.

2. You receive a bunch of Google Groups spam articles that are deleted
via NoCeM; because there are so many of them, that leaves a lot of
unused space.

If you can find where an expired article is on disk and then find the
next article, you can just move it on disk and update the pointers to
the file. This could be a process that you just kick off or,
preferably, something that runs when innd isn't fully occupied using
spare cycles or something.
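
The idea above can be sketched as a sliding compaction over a flat buffer. This is only an illustration under simplifying assumptions: the Slot record and the compact function are hypothetical, the buffer is an in-memory bytearray, and none of this reflects the real CNFS on-disk layout.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """One article slot in a hypothetical flat buffer (not the real CNFS layout)."""
    msgid: str
    offset: int
    length: int
    live: bool  # False once the article has been expired or cancelled


def compact(buf: bytearray, slots: list[Slot]) -> tuple[dict[str, int], int]:
    """Slide live articles toward the start of buf, in place.

    Returns the new offset of every surviving article and the number of
    bytes still in use; everything past that point is reclaimable.
    """
    write = 0
    new_offsets: dict[str, int] = {}
    for slot in sorted(slots, key=lambda s: s.offset):
        if not slot.live:
            continue  # skip the hole; later articles slide over it
        if slot.offset != write:
            # The right-hand slice is copied before assignment, so
            # overlapping source and destination ranges are safe.
            buf[write:write + slot.length] = buf[slot.offset:slot.offset + slot.length]
        new_offsets[slot.msgid] = write
        write += slot.length
    return new_offsets, write
```

Moving the bytes is the easy half. Every pointer to a moved article (the new_offsets map in this sketch) has to be updated consistently as well, and on a live server that would have to happen while articles are still being written and read.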

I know disk space is cheap these days but some people may be limited.
It would be good not to waste space.

--
End Of The Line BBS - Plano, TX
telnet endofthelinebbs.com 23

Re: Compacting CNFS buffers

<v1cuu2$qiqi$1@news.trigofacile.com>

Path: i2pn2.org!i2pn.org!newsfeed.bofh.team!news.trigofacile.com!.POSTED.2a01cb080adc1100498af636964831e1.ipv6.abo.wanadoo.fr!not-for-mail
From: iulius@nom-de-mon-site.com.invalid (Julien ÉLIE)
Newsgroups: news.software.nntp
Subject: Re: Compacting CNFS buffers
Date: Tue, 7 May 2024 12:14:26 +0200
Organization: Groupes francophones par TrigoFACILE
Message-ID: <v1cuu2$qiqi$1@news.trigofacile.com>
References: <20240504210746.79036ffe@wibble.sysadmininc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 7 May 2024 10:14:26 -0000 (UTC)
Injection-Info: news.trigofacile.com; posting-account="julien"; posting-host="2a01cb080adc1100498af636964831e1.ipv6.abo.wanadoo.fr:2a01:cb08:adc:1100:498a:f636:9648:31e1"; logging-data="871250"; mail-complaints-to="abuse@trigofacile.com"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:1m33/zonzFQxlE1g+u0cFl+jVNY= sha256:8wtLq0qprg6IJhrr7kZeDahP+K0axZuAFwt+dHCxk/o= sha1:bLfl+zLcERAmu7Np0BNNO4LgJN4= sha256:Mel3FS5rb5QS8aWXl+S4kff3s9vC9en77CwIsOp+R1Y=
In-Reply-To: <20240504210746.79036ffe@wibble.sysadmininc.com>
 by: Julien ÉLIE - Tue, 7 May 2024 10:14 UTC

Hi Nigel,

> Has anyone investigated the feasibility of compacting or compressing
> the cnfs buffer files?

Some people use ZFS to compress CNFS buffers (cancelled articles are
still present though). I am not aware of a compaction feature like the
one you want.

> If you can find where an expired article is on disk and then find
> the next article, you can just move it on disk and update the
> pointers to the file. This could be a process that you just kick off
> or, preferably, something that runs when innd isn't fully occupied
> using spare cycles or something.

I understand your point; I can add it to the wish list.

FWIW, though technically this is not what you are asking for, some
mechanisms may be used to mitigate your problems:

> 1. You are sent a bunch of articles but discover you've left some
> binary newsgroups in your active file. You put these groups in your
> expire list and rmgroup them, but you're left with a lot of empty
> space, never to be used again unless the buffer recycles.

You may want to configure Cleanfeed to reject binaries (including in
binary groups) so as not to store them and waste space. For the past few
weeks, NoCeM notices have also been sent for misplaced binaries (in
non-binary groups).

> 2. You receive a bunch of Google Groups spam articles that are
> deleted via NoCeM; because there are so many of them, that leaves a
> lot of unused space.

Christoph Biedl implemented a new feature for INN 2.7.2 to store
articles by their Path header field. It is a new "path" option in
storage.conf. A typical use case is to store articles from a spammy
site in a small CNFS buffer to avoid overall retention impacts.
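
To illustrate the selection logic only, here is a toy model in Python. It is not storage.conf syntax: StorageRule, pick_rule, the buffer names and the site spammy.example are all made up, fnmatch-style globs stand in for INN's wildmat patterns, and the first-match-wins ordering is an assumption of this sketch.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class StorageRule:
    """Hypothetical stand-in for one storage.conf entry."""
    storage_class: int
    buffer_name: str
    newsgroups: str              # glob pattern, stand-in for a wildmat
    path: Optional[str] = None   # glob on the Path header field, if set


def pick_rule(rules, newsgroup, path_header):
    """Return the first rule whose patterns all match, or None."""
    for rule in rules:
        if not fnmatchcase(newsgroup, rule.newsgroups):
            continue
        if rule.path is not None and not fnmatchcase(path_header, rule.path):
            continue
        return rule
    return None


RULES = [
    # Articles relayed by the (made-up) site "spammy.example" land in a
    # small buffer that wraps quickly.
    StorageRule(1, "SMALL", "*", path="*!spammy.example!*"),
    # Everything else goes to the large buffer.
    StorageRule(2, "LARGE", "*"),
]
```

With these rules, an article whose Path contains "!spammy.example!" is assigned to SMALL, so a flood from that site can only push out other articles from the same site.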

There's also the delayer program (in the contrib directory before INN
2.7.2) that you can use to delay articles, and give cancel control
articles and NoCeM messages time to arrive. For instance, you can have a
frontend instance of innd receive the articles from all your peers and
feed a second, local instance of innd with a delay, except for cancels
and NoCeM articles, which are passed on immediately. The CNFS buffers
of that second instance will then be spam-free.
https://www.eyrie.org/~eagle/software/inn/docs/delayer.html
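
A toy model of why the delay helps, with the delayed feed and the downstream server folded into one hypothetical DelayQueue class. This is a sketch of the principle, not of how the delayer program or innd are implemented; the class and method names are assumptions.

```python
import heapq
from itertools import count


class DelayQueue:
    """Toy model of a delayed feed: cancels pass at once, articles wait."""

    def __init__(self, delay: float):
        self.delay = delay
        self._heap = []          # entries: (release_time, seq, msgid)
        self._seq = count()      # tie-breaker keeps arrival order stable
        self.cancelled = set()   # message-IDs named by a cancel or NoCeM notice

    def offer_article(self, msgid: str, now: float) -> None:
        heapq.heappush(self._heap, (now + self.delay, next(self._seq), msgid))

    def offer_cancel(self, target_msgid: str) -> None:
        # Not delayed: the downstream side learns about it immediately.
        self.cancelled.add(target_msgid)

    def release(self, now: float) -> list[str]:
        """Return the articles whose delay has elapsed and that were not cancelled."""
        out = []
        while self._heap and self._heap[0][0] <= now:
            _, _, msgid = heapq.heappop(self._heap)
            if msgid not in self.cancelled:
                out.append(msgid)
        return out
```

An article that is cancelled while it is still waiting never reaches storage in this model, so it never leaves a hole behind.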

--
Julien ÉLIE

« Aequum est ut cuius participauit lucrum, participet et damnum. »

Re: Compacting CNFS buffers

<20240508004309.53a4572b@wibble.sysadmininc.com>

Path: i2pn2.org!i2pn.org!newsfeed.endofthelinebbs.com!.POSTED.47.186.30.161!not-for-mail
From: sysop@endofthelinebbs.com (Nigel Reed)
Newsgroups: news.software.nntp
Subject: Re: Compacting CNFS buffers
Date: Wed, 8 May 2024 00:43:09 -0500
Organization: End Of The Line BBS
Message-ID: <20240508004309.53a4572b@wibble.sysadmininc.com>
References: <20240504210746.79036ffe@wibble.sysadmininc.com> <v1cuu2$qiqi$1@news.trigofacile.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Injection-Info: www.sysadmininc.com; posting-host="47.186.30.161"; logging-data="2045923"; mail-complaints-to="abuse@endofthelinebbs.com"
X-Newsreader: Claws Mail 4.2.0git6 (GTK 3.24.33; x86_64-pc-linux-gnu)
 by: Nigel Reed - Wed, 8 May 2024 05:43 UTC

On Tue, 7 May 2024 12:14:26 +0200
Julien ÉLIE <iulius@nom-de-mon-site.com.invalid> wrote:

> Hi Nigel,
>
> > Has anyone investigated the feasibility of compacting or compressing
> > the cnfs buffer files?
>
> Some people use ZFS to compress CNFS buffers (cancelled articles are
> still present though). I am not aware of a compaction feature like
> the one you want.

I am using ZFS with CNFS and it does a good job. I also want to use the
server for other purposes so reclaiming any space would be extremely
useful.

> > If you can find where an expired article is on disk and then find
> > the next article, you can just move it on disk and update the
> > pointers to the file. This could be a process that you just kick off
> > or, preferably, something that runs when innd isn't fully occupied
> > using spare cycles or something.
> I understand your point; I can add it to the wish list.

That would be good.
>
> > 1. You are sent a bunch of articles but discover you've left some
> > binary newsgroups in your active file. You put these groups in your
> > expire list and rmgroup them, but you're left with a lot of empty
> > space, never to be used again unless the buffer recycles.
>
> You may want to configure Cleanfeed to reject binaries (including in
> binary groups) so as not to store them and waste space. For the past
> few weeks, NoCeM notices have also been sent for misplaced binaries
> (in non-binary groups).

Unfortunately the articles are already in the CNFS buffers. My bad for
forgetting to remove some binary groups from the active file. I did not
have Cleanfeed running during the import, since it's advised to turn off
Perl and Python filtering for that.

> > 2. You receive a bunch of Google Groups spam articles that are
> > deleted via NoCeM; because there are so many of them, that leaves a
> > lot of unused space.
>
> Christoph Biedl implemented a new feature for INN 2.7.2 to store
> articles by their Path header field. It is a new "path" option in
> storage.conf. A typical use case is to store articles from a spammy
> site in a small CNFS buffer to avoid overall retention impacts.

I'll look into it, but again, the damage is already done.

>
> There's also the delayer program (in the contrib directory before INN
> 2.7.2) that you can use to delay articles, and give cancel control
> articles and NoCeM messages time to arrive. For instance, you can
> have a frontend instance of innd receive the articles from all your
> peers and feed a second, local instance of innd with a delay, except
> for cancels and NoCeM articles, which are passed on immediately. The
> CNFS buffers of that second instance will then be spam-free.
> https://www.eyrie.org/~eagle/software/inn/docs/delayer.html
>

Sounds interesting but, again, I already have a lot of binary articles.
I'm not sure I want to set up a second server. I have a hard enough
time with one :)

I'll hold out hope someone with more knowledge than I also sees the
issue and decides to look into compacting CNFS buffers.

Thanks,
Nigel

--
End Of The Line BBS - Plano, TX
telnet endofthelinebbs.com 23

Re: Compacting CNFS buffers

<v1tuqi$16a20$1@news.trigofacile.com>

Path: i2pn2.org!i2pn.org!newsfeed.bofh.team!news.trigofacile.com!.POSTED.2a01cb080adc11008461258f69102b45.ipv6.abo.wanadoo.fr!not-for-mail
From: iulius@nom-de-mon-site.com.invalid (Julien ÉLIE)
Newsgroups: news.software.nntp
Subject: Re: Compacting CNFS buffers
Date: Mon, 13 May 2024 22:56:50 +0200
Organization: Groupes francophones par TrigoFACILE
Message-ID: <v1tuqi$16a20$1@news.trigofacile.com>
References: <20240504210746.79036ffe@wibble.sysadmininc.com> <v1cuu2$qiqi$1@news.trigofacile.com> <20240508004309.53a4572b@wibble.sysadmininc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 13 May 2024 20:56:50 -0000 (UTC)
Injection-Info: news.trigofacile.com; posting-account="julien"; posting-host="2a01cb080adc11008461258f69102b45.ipv6.abo.wanadoo.fr:2a01:cb08:adc:1100:8461:258f:6910:2b45"; logging-data="1255488"; mail-complaints-to="abuse@trigofacile.com"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Av2UFdzQQOHdUhc68M8VFJ44bYM= sha256:L0GA/I+1438JhvgKVx1+heydqhLM90Gy12TaV/FAOZk= sha1:odmNdLWQYcK1gwVNBblkgL1APq4= sha256:tXnoCKfDmLJ86ACsNUnW153WWgaFeY6+wokiE2sDA84=
In-Reply-To: <20240508004309.53a4572b@wibble.sysadmininc.com>
 by: Julien ÉLIE - Mon, 13 May 2024 20:56 UTC

Hi Nigel,

> I'll hold out hope someone with more knowledge than I also sees the
> issue and decides to look into compacting CNFS buffers.

It could just as well be a new type of storage method, mixing the best
of cnfs and timecaf.

As far as I understand, the use case is to have large compacted buffers
without wrapping (articles do not expire, but cancelled articles should
not be kept). That would correspond to timecaf, except that a new CAF
file is created when the current one is full instead of every 256
seconds. Expiring CAF files just compacts them if articles have been
cancelled, releasing disk space.

The feature may be implemented as an evolution of the current timecaf
method, with options to parameterize it in storage.conf (as cnfs has
options): for instance, a maxart option to specify the number of
articles per CAF file (currently hard-coded to 262144) and a maxtime
option to specify the number of seconds before a new CAF file is created
(currently hard-coded to 256 seconds, but it could easily be a multiple
of 256 seconds so as to keep the current file naming). With maxtime set
to 0, a new file is created when maxart is reached.

Naturally, though it is more work, a totally new storage method could
also be created as timecaf is inherently linked to time and suffers from
the limitation that you cannot store more than maxart articles received
during maxtime seconds. They will just be dropped until a new CAF file
is created. It is not what you expect from the storage method you're
asking for. And re-using CNFS buffers may be tricky (to find and refill
holes, or to totally rewrite them - changing the storage tokens of all
articles).
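
A toy model of the two behaviours described above, to make the difference concrete. The RollingStore class is hypothetical and only counts articles per file; maxart and maxtime are the proposed options, not existing storage.conf keys, and the way files are keyed here is a simplification rather than the real timecaf naming.

```python
from typing import Optional


class RollingStore:
    """Toy model of the proposed maxart/maxtime options (not INN code)."""

    def __init__(self, maxart: int = 262144, maxtime: int = 256):
        self.maxart = maxart
        self.maxtime = maxtime              # 0 disables the time windows
        self.counts: dict[int, int] = {}    # file key -> articles stored
        self._seq = 0                       # current file key when maxtime == 0

    def store(self, now: int) -> Optional[int]:
        """Store one article arriving at `now` (in seconds).

        Returns the key of the file it went to, or None if it was dropped.
        """
        if self.maxtime > 0:
            key = now // self.maxtime       # one file per time window
            if self.counts.get(key, 0) >= self.maxart:
                return None                 # this window's file is full: dropped
        else:
            key = self._seq
            if self.counts.get(key, 0) >= self.maxart:
                self._seq += 1              # file is full: start the next one
                key = self._seq
        self.counts[key] = self.counts.get(key, 0) + 1
        return key
```

With maxart=2 and maxtime=256, a third article arriving in the same window is dropped. With maxtime=0, it simply opens the next file, which is the behaviour wanted for the non-expiring use case.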

--
Julien ÉLIE

« Vinum bonum laetificat cor hominis. »

