Welcome to Rocksolid Light

mail files register newsreader groups login

Message-ID:

Behind every great computer sits a skinny little geek.

Bit sequences with bounded number of continuous 1s.

Subject	Author
Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Terje Mathisen
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Scott Lurndal
Re: Bit sequences with bounded number of continuous 1s.	Stephen Fuld
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Scott Lurndal
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	MitchAlsup
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Tim Rentsch
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Tim Rentsch
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Tim Rentsch
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Tim Rentsch
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	BGB
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Scott Lurndal
Re: Bit sequences with bounded number of continuous 1s.	Robert Finch
Re: Bit sequences with bounded number of continuous 1s.	Chris M. Thomasson

Pages:12

Bit sequences with bounded number of continuous 1s.

<20231206123011.000031e8@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35477&group=comp.arch#35477

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 12:30:11 +0200
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <20231206123011.000031e8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="9a57b320616a2c9027229db4a7d3f825";
logging-data="776659"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+0geOIYV61mTMTQ+GJIGAzc+lMkR8a65Q="
Cancel-Lock: sha1:nZyMR0uXXy9Wrn1deCJpDVZG0Mg=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Wed, 6 Dec 2023 10:30 UTC

Bit sequences with bounded number of continuous '1' bits are useful for
serialization of packetized data.

For example, if we have a way of converting arbitrary n-bit words to
unique (n+1)-bit extended words with at most L continuous leading '1'
bits, at most T continuous trailing '1' bits and at most M=L+T
continuous inner '1' bits then we can transfer multi-word packets
starting with a header consisting of M+1 '1' bits followed by a single
'0' bit and then by desired number of extended words. Any gaps between
packets are filled with '1' bits.
Receiver can easily synchronize itself to incoming packets and will
quickly and robustly recovers after media errors.
Such conversions have are other application as well. They always have.

For me, a case of n=16 (16-bit words, 17-bit extended words) is of
particular practical interest.

Brute force counting proves that both for (L=1,T=3) and for (L=2,T=2)
there exist more than 2**16 extended words with desired properties.
For those interested, 76735 and 83665 extended words, respectively.
So, desired 16b-to-17b conversions certainly exist for M=4.
The problem is that I know of no simple "algorithmic" way of
implementing such conversion short of using big look-up tables. Neither
in software nor in simple compact hardware.
That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
where I know many such algorithms.

Any ideas?
May be, somebody had seen such methods in patents, preferably old and
outdated?
Or, better, in books or articles?

Re: Bit sequences with bounded number of continuous 1s.

<ukpo46$ok6v$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35478&group=comp.arch#35478

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 13:06:29 +0100
Organization: A noiseless patient Spider
Lines: 55
Message-ID: <ukpo46$ok6v$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 6 Dec 2023 12:06:30 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="77e6c77c3d37d578f2f5710c6648d90c";
logging-data="807135"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/8rcEw/DB2abi5KGHbWuASILN0D1q0C6I="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:Ya4RnC0qnN8LMehEODJdjUYjJKQ=
Content-Language: en-GB
In-Reply-To: <20231206123011.000031e8@yahoo.com>

by: David Brown - Wed, 6 Dec 2023 12:06 UTC

On 06/12/2023 11:30, Michael S wrote:
> Bit sequences with bounded number of continuous '1' bits are useful for
> serialization of packetized data.
>
> For example, if we have a way of converting arbitrary n-bit words to
> unique (n+1)-bit extended words with at most L continuous leading '1'
> bits, at most T continuous trailing '1' bits and at most M=L+T
> continuous inner '1' bits then we can transfer multi-word packets
> starting with a header consisting of M+1 '1' bits followed by a single
> '0' bit and then by desired number of extended words. Any gaps between
> packets are filled with '1' bits.
> Receiver can easily synchronize itself to incoming packets and will
> quickly and robustly recovers after media errors.
> Such conversions have are other application as well. They always have.
>
> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> particular practical interest.
>
> Brute force counting proves that both for (L=1,T=3) and for (L=2,T=2)
> there exist more than 2**16 extended words with desired properties.
> For those interested, 76735 and 83665 extended words, respectively.
> So, desired 16b-to-17b conversions certainly exist for M=4.
> The problem is that I know of no simple "algorithmic" way of
> implementing such conversion short of using big look-up tables. Neither
> in software nor in simple compact hardware.
> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> where I know many such algorithms.
>
> Any ideas?
> May be, somebody had seen such methods in patents, preferably old and
> outdated?
> Or, better, in books or articles?
>
>

Usually the interest is in limiting length of repeats at any point in
the data, not just the start and end. After all, it's not going to help
anyone if the start and end of the word are limited to no more than 2
consecutive ones if you have a run of 10 ones in the middle of the word.

It is also common to want to combine other encoding features at the same
time. Limiting the lengths of both zeros and ones means you have
transitions, which are needed for clock recovery and synchronisation.
For many transmissions, approximately 0 DC balance is highly desirable.
And you may also want forward error correction or at least error detection.

Maybe you have some specialised use in mind?

Otherwise, these links might be of interest to you, on related ideas :

<https://en.wikipedia.org/wiki/64b/66b_encoding>
<https://en.wikipedia.org/wiki/8b/10b_encoding>
<https://en.wikipedia.org/wiki/Bit_stuffing>

Re: Bit sequences with bounded number of continuous 1s.

<20231206154145.000016a4@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35479&group=comp.arch#35479

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!paganini.bofh.team!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 15:41:45 +0200
Organization: A noiseless patient Spider
Lines: 106
Message-ID: <20231206154145.000016a4@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<ukpo46$ok6v$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="9a57b320616a2c9027229db4a7d3f825";
logging-data="822309"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Vap2DEGVPpGgKmF6nIdGj6phALtZFM74="
Cancel-Lock: sha1:py9Y2KkxvCISVltazhU3J466iRk=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Wed, 6 Dec 2023 13:41 UTC

On Wed, 6 Dec 2023 13:06:29 +0100
David Brown <david.brown@hesbynett.no> wrote:

> On 06/12/2023 11:30, Michael S wrote:
> > Bit sequences with bounded number of continuous '1' bits are useful
> > for serialization of packetized data.
> >
> > For example, if we have a way of converting arbitrary n-bit words to
> > unique (n+1)-bit extended words with at most L continuous leading
> > '1' bits, at most T continuous trailing '1' bits and at most M=L+T
> > continuous inner '1' bits then we can transfer multi-word packets
> > starting with a header consisting of M+1 '1' bits followed by a
> > single '0' bit and then by desired number of extended words. Any
> > gaps between packets are filled with '1' bits.
> > Receiver can easily synchronize itself to incoming packets and will
> > quickly and robustly recovers after media errors.
> > Such conversions have are other application as well. They always
> > have.
> >
> > For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> > particular practical interest.
> >
> > Brute force counting proves that both for (L=1,T=3) and for
> > (L=2,T=2) there exist more than 2**16 extended words with desired
> > properties. For those interested, 76735 and 83665 extended words,
> > respectively. So, desired 16b-to-17b conversions certainly exist
> > for M=4. The problem is that I know of no simple "algorithmic" way
> > of implementing such conversion short of using big look-up tables.
> > Neither in software nor in simple compact hardware.
> > That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> > where I know many such algorithms.
> >
> > Any ideas?
> > May be, somebody had seen such methods in patents, preferably old
> > and outdated?
> > Or, better, in books or articles?
> >
> >
>
> Usually the interest is in limiting length of repeats at any point in
> the data, not just the start and end. After all, it's not going to
> help anyone if the start and end of the word are limited to no more
> than 2 consecutive ones if you have a run of 10 ones in the middle of
> the word.

Please read again my original post. It's all covered.

>
> It is also common to want to combine other encoding features at the
> same time. Limiting the lengths of both zeros and ones means you
> have transitions, which are needed for clock recovery and
> synchronisation. For many transmissions, approximately 0 DC balance
> is highly desirable. And you may also want forward error correction
> or at least error detection.
>

In my case of interest the bits are encoded as NRZI so mere presence of
'1' bits in headers and gaps guarantees that there are at least M
transitions per packet. It is sufficient for clock phase recovery
at receiver (a.k.a Data Phase Alignment). Clock frequency recovery is
not necessary, because clocks at both sides of the link are derived from
the same source.
Anyway, when somewhat higher density of transitions is desirable, say,
one transition (i.e. logical 1) per 25 or 30 bits, it's absolutely
trivial to add at no additional coding cost since 76735 is sufficiently
large than 65736.

DC balance is not required in my case. When DC balance is required at
all, the balance requirements tend to be rather strict. Weak balance is
rarely of interest. Typically disparity should be bounded to few
percents over rather short intervals of few dozens of bit times. Or
better. In order to achieve that you need rather higher coding rates
than 17/16 or rather longer coding blocks than 16 bits. Or sometimes
both. In other words, you have to play completely different game.

> Maybe you have some specialised use in mind?
>

I do have a specialized use in mind, but for my specialized use the
schemes that limit the number of continuous '1' bits to 4 are not
necessary. Limit of 5 is perfectly o.k. and as I mentioned in original
post, it's easy. In fact, if 5 was not so easy, from practical
perspective I would be satisfied with significantly higher limit, like 7
or 8.

The question about bound of 4 is more of theoretical than of
practical interest. Yesterday it surprised me to find out that such
codes exists not just for 16b-to-17b, but for much longer words as
well, up to 30b-to-31b. But there is a big difference between knowing
that the code exists and finding a way to implement it in
environment with limited resources.

> Otherwise, these links might be of interest to you, on related ideas :
>
> <https://en.wikipedia.org/wiki/64b/66b_encoding>
> <https://en.wikipedia.org/wiki/8b/10b_encoding>
> <https://en.wikipedia.org/wiki/Bit_stuffing>
>
>

Don't you know me long enough to guess that I had read that much and
more?

Re: Bit sequences with bounded number of continuous 1s.

<5bebf6ee71be711230dc1076eaa93f28@news.novabbs.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35481&group=comp.arch#35481

copy link Newsgroups: comp.arch

Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 14:43:59 +0000
Organization: novaBBS
Message-ID: <5bebf6ee71be711230dc1076eaa93f28@news.novabbs.com>
References: <20231206123011.000031e8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="3246225"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$csbMEwe69xuYF1luxUlvtuxLIOJhyU/TQljhKbcS7HQUtcB7DHFDu
X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on novalink.us
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949

by: MitchAlsup - Wed, 6 Dec 2023 14:43 UTC

Michael S wrote:

> Bit sequences with bounded number of continuous '1' bits are useful for
> serialization of packetized data.

> For example, if we have a way of converting arbitrary n-bit words to
> unique (n+1)-bit extended words with at most L continuous leading '1'
> bits, at most T continuous trailing '1' bits and at most M=L+T
> continuous inner '1' bits then we can transfer multi-word packets
> starting with a header consisting of M+1 '1' bits followed by a single
> '0' bit and then by desired number of extended words. Any gaps between
> packets are filled with '1' bits.
> Receiver can easily synchronize itself to incoming packets and will
> quickly and robustly recovers after media errors.
> Such conversions have are other application as well. They always have.

> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> particular practical interest.

> Brute force counting proves that both for (L=1,T=3) and for (L=2,T=2)
> there exist more than 2**16 extended words with desired properties.
> For those interested, 76735 and 83665 extended words, respectively.
> So, desired 16b-to-17b conversions certainly exist for M=4.
> The problem is that I know of no simple "algorithmic" way of
> implementing such conversion short of using big look-up tables. Neither
> in software nor in simple compact hardware.
> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> where I know many such algorithms.

A table with 65536 entries is no longer considered "big".

> Any ideas?
> May be, somebody had seen such methods in patents, preferably old and
> outdated?
> Or, better, in books or articles?

Re: Bit sequences with bounded number of continuous 1s.

<ukq553$qnva$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35486&group=comp.arch#35486

copy link Newsgroups: comp.arch

Path: i2pn2.org!rocksolid2!news.neodome.net!news.nntp4.net!news.gegeweb.eu!gegeweb.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 16:48:51 +0100
Organization: A noiseless patient Spider
Lines: 189
Message-ID: <ukq553$qnva$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com> <ukpo46$ok6v$1@dont-email.me>
<20231206154145.000016a4@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 6 Dec 2023 15:48:51 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="77e6c77c3d37d578f2f5710c6648d90c";
logging-data="876522"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18Ax/JmbJtZ/mXifWsa0J2gtyFD5ci4Bk0="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:hqP9KpMrGhspL938IBHuLoc3z0o=
In-Reply-To: <20231206154145.000016a4@yahoo.com>
Content-Language: en-GB

by: David Brown - Wed, 6 Dec 2023 15:48 UTC

On 06/12/2023 14:41, Michael S wrote:
> On Wed, 6 Dec 2023 13:06:29 +0100
> David Brown <david.brown@hesbynett.no> wrote:
>
>> On 06/12/2023 11:30, Michael S wrote:
>>> Bit sequences with bounded number of continuous '1' bits are useful
>>> for serialization of packetized data.
>>>
>>> For example, if we have a way of converting arbitrary n-bit words to
>>> unique (n+1)-bit extended words with at most L continuous leading
>>> '1' bits, at most T continuous trailing '1' bits and at most M=L+T
>>> continuous inner '1' bits then we can transfer multi-word packets
>>> starting with a header consisting of M+1 '1' bits followed by a
>>> single '0' bit and then by desired number of extended words. Any
>>> gaps between packets are filled with '1' bits.
>>> Receiver can easily synchronize itself to incoming packets and will
>>> quickly and robustly recovers after media errors.
>>> Such conversions have are other application as well. They always
>>> have.
>>>
>>> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
>>> particular practical interest.
>>>
>>> Brute force counting proves that both for (L=1,T=3) and for
>>> (L=2,T=2) there exist more than 2**16 extended words with desired
>>> properties. For those interested, 76735 and 83665 extended words,
>>> respectively. So, desired 16b-to-17b conversions certainly exist
>>> for M=4. The problem is that I know of no simple "algorithmic" way
>>> of implementing such conversion short of using big look-up tables.
>>> Neither in software nor in simple compact hardware.
>>> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
>>> where I know many such algorithms.
>>>
>>> Any ideas?
>>> May be, somebody had seen such methods in patents, preferably old
>>> and outdated?
>>> Or, better, in books or articles?
>>>
>>>
>>
>> Usually the interest is in limiting length of repeats at any point in
>> the data, not just the start and end. After all, it's not going to
>> help anyone if the start and end of the word are limited to no more
>> than 2 consecutive ones if you have a run of 10 ones in the middle of
>> the word.
>
> Please read again my original post. It's all covered.

Not as far as I can see. Maybe I have misunderstood you, or maybe you
have not understood me. Correct me if I am wrong below.

For L = 2, T = 2 (as an example), your coding would not allow

1100 1110 0101 1011 1

because it ends in 3 ones. (The spaces are just to make it easier to
read.) Your aim is to avoid the join of two packets having more than 4
ones in a row, so that if the receive sees 5 or more ones in a row, it
knows it is an inter-packet idle period.

But you /would/ allow :

1000 1111 1111 1000 1

This has no more than 1 leading or trailing one. But it has 8 ones in a
row. A receiver that is trying to synchronise would see a run of 8 ones
and interpret it as an inter-packet idle gap.

Additionally, how is the receiver supposed to identify the number of
ones at the start of a packet, after a gap? If it sees a sequence of 10
ones, then a zero (and then other values), how does it know if the
packet is starting with 0, 10, or 110 ?

Synchronisation for the start of a frame in communication is always done
with specific patterns containing at least one transition. For example,
with good-old-fashioned UART communication, break (inter-character
pauses) are a string of at least 10 ones, and the start of a character
frame is always a zero with the line going from high to low (either
after a break, or after the previous character's stop bit).

>
>>
>> It is also common to want to combine other encoding features at the
>> same time. Limiting the lengths of both zeros and ones means you
>> have transitions, which are needed for clock recovery and
>> synchronisation. For many transmissions, approximately 0 DC balance
>> is highly desirable. And you may also want forward error correction
>> or at least error detection.
>>
>
> In my case of interest the bits are encoded as NRZI so mere presence of
> '1' bits in headers and gaps guarantees that there are at least M
> transitions per packet.

OK, but you still have to make sure there are a suitable minimum number
of ones per frame. As it stands, a run of 17 zeros would fit your
requirements.

> It is sufficient for clock phase recovery
> at receiver (a.k.a Data Phase Alignment). Clock frequency recovery is
> not necessary, because clocks at both sides of the link are derived from
> the same source.

Fair enough. It is convenient when you have that!

> Anyway, when somewhat higher density of transitions is desirable, say,
> one transition (i.e. logical 1) per 25 or 30 bits, it's absolutely
> trivial to add at no additional coding cost since 76735 is sufficiently
> large than 65736.

Certainly you can make sure you have enough ones in the packet. (I
would not say it is "trivial" until you have found a simple algorithm
for coding and decoding to match your requirements - but it is unlikely
to make the problem any harder.)

>
> DC balance is not required in my case. When DC balance is required at
> all, the balance requirements tend to be rather strict. Weak balance is
> rarely of interest.

Weak balance - a statistical or probabilistic balance - is certainly of
interest in some systems. PCIe communication, for instance, maintains
an approximate balance by using a scrambler. Some DC "leakage" through
the terminators is enough to deal with the remainder, if I have
understood things correctly.

Other systems use active DC balance tracking to keep a stricter control,
and transmit different encodings to minimise the DC imbalance. But even
then, it is never perfect - no line drivers are equally strong for high
and low drives, and there are always differences in the latencies for
high and low transitions. Even with an ideal 50% duty cycle coming in,
there will be a small DC bias that must be leaked away. So DC balancing
in the encoding is /always/ weak, if you look closely enough.

> Typically disparity should be bounded to few
> percents over rather short intervals of few dozens of bit times. Or
> better. In order to achieve that you need rather higher coding rates
> than 17/16 or rather longer coding blocks than 16 bits. Or sometimes
> both. In other words, you have to play completely different game.
>
>> Maybe you have some specialised use in mind?
>>
>
> I do have a specialized use in mind, but for my specialized use the
> schemes that limit the number of continuous '1' bits to 4 are not
> necessary. Limit of 5 is perfectly o.k. and as I mentioned in original
> post, it's easy. In fact, if 5 was not so easy, from practical
> perspective I would be satisfied with significantly higher limit, like 7
> or 8.
>
> The question about bound of 4 is more of theoretical than of
> practical interest. Yesterday it surprised me to find out that such
> codes exists not just for 16b-to-17b, but for much longer words as
> well, up to 30b-to-31b. But there is a big difference between knowing
> that the code exists and finding a way to implement it in
> environment with limited resources.
>
>> Otherwise, these links might be of interest to you, on related ideas :
>>
>> <https://en.wikipedia.org/wiki/64b/66b_encoding>
>> <https://en.wikipedia.org/wiki/8b/10b_encoding>
>> <https://en.wikipedia.org/wiki/Bit_stuffing>
>>
>>
>
> Don't you know me long enough to guess that I had read that much and
> more?
>

I try to answer questions as I see them, rather than attempting to keep
track of what everyone knows.

When I see a post that looks like someone is trying to invent a new
technique in an existing field, and their ideas are noticeably different
from standard solutions, there are various possible explanations. One
is that they have a very niche use-case so the standard ideas don't
suit. Another is that they have come up with something truly original
and exciting. Most common, however, is that they don't know much about
the field and are not very aware of existing methods, and are currently
just playing about with ideas for solutions. Option 3 is the first
assumption until demonstrated otherwise. It looks like option 1 applies
to you here.

I can't say that I really understand why you have the requirements you
are asking for here - they don't make sense to me. Obviously I don't
have a full view of your system, but I cannot currently imagine a system
for which your requirements would be the best way to solve the problem.

Click here to read the complete article

Re: Bit sequences with bounded number of continuous 1s.

<20231206182638.00006845@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35487&group=comp.arch#35487

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 18:26:38 +0200
Organization: A noiseless patient Spider
Lines: 49
Message-ID: <20231206182638.00006845@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<5bebf6ee71be711230dc1076eaa93f28@news.novabbs.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="9a57b320616a2c9027229db4a7d3f825";
logging-data="885858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19b/OZBSsvRkcJRqoRTfyG2zTVZ67XpUN8="
Cancel-Lock: sha1:06RXAY9PvjCKw+0x5/gvisE16c4=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Wed, 6 Dec 2023 16:26 UTC

On Wed, 6 Dec 2023 14:43:59 +0000
mitchalsup@aol.com (MitchAlsup) wrote:

> Michael S wrote:
>
> > Bit sequences with bounded number of continuous '1' bits are useful
> > for serialization of packetized data.
>
> > For example, if we have a way of converting arbitrary n-bit words to
> > unique (n+1)-bit extended words with at most L continuous leading
> > '1' bits, at most T continuous trailing '1' bits and at most M=L+T
> > continuous inner '1' bits then we can transfer multi-word packets
> > starting with a header consisting of M+1 '1' bits followed by a
> > single '0' bit and then by desired number of extended words. Any
> > gaps between packets are filled with '1' bits.
> > Receiver can easily synchronize itself to incoming packets and will
> > quickly and robustly recovers after media errors.
> > Such conversions have are other application as well. They always
> > have.
>
> > For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> > particular practical interest.
>
> > Brute force counting proves that both for (L=1,T=3) and for
> > (L=2,T=2) there exist more than 2**16 extended words with desired
> > properties. For those interested, 76735 and 83665 extended words,
> > respectively. So, desired 16b-to-17b conversions certainly exist
> > for M=4. The problem is that I know of no simple "algorithmic" way
> > of implementing such conversion short of using big look-up tables.
> > Neither in software nor in simple compact hardware.
> > That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> > where I know many such algorithms.
>
> A table with 65536 entries is no longer considered "big".
>

"Compact hardware" in this case means ~50 LEs on each side of the link.
LEs of sort we have in old-fashioned FPGA where each LE contains 1
register, 1 4-input LUT, carry-in/carry-out for adder and little else.
Use of one or two 8 Kbit SRAM blocks is acceptable on transmitter side,
but not on receiver side.

Pay attention that it's acceptable and even expected for implementation
to be serial i.e. transmitting/receiving one bit per clock and
completing the full transform every 17 clock cycles.

Re: Bit sequences with bounded number of continuous 1s.

<868r67w9fq.fsf@linuxsc.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35488&group=comp.arch#35488

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.niel.me!news.nntp4.net!news.hispagatos.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 06 Dec 2023 09:02:49 -0800
Organization: A noiseless patient Spider
Lines: 41
Message-ID: <868r67w9fq.fsf@linuxsc.com>
References: <20231206123011.000031e8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="34015b29ea968070bbd06e204776be53";
logging-data="903949"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19uZgojwVxLa7eChobQCWc6BVlC++TMNWU="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:/q4eH+QzzgE7fekZvr8ly5zGZUY=
sha1:lwXHwM/WIRvlC3u1p5xh90LTRSs=

by: Tim Rentsch - Wed, 6 Dec 2023 17:02 UTC

Michael S <already5chosen@yahoo.com> writes:

> Bit sequences with bounded number of continuous '1' bits are useful for
> serialization of packetized data.
>
> For example, if we have a way of converting arbitrary n-bit words to
> unique (n+1)-bit extended words with at most L continuous leading '1'
> bits, at most T continuous trailing '1' bits and at most M=L+T
> continuous inner '1' bits then we can transfer multi-word packets
> starting with a header consisting of M+1 '1' bits followed by a single
> '0' bit and then by desired number of extended words. Any gaps between
> packets are filled with '1' bits.
> Receiver can easily synchronize itself to incoming packets and will
> quickly and robustly recovers after media errors.
> Such conversions have are other application as well. They always have.
>
> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> particular practical interest.
>
> Brute force counting proves that both for (L=1,T=3) and for (L=2,T=2)
> there exist more than 2**16 extended words with desired properties.
> For those interested, 76735 and 83665 extended words, respectively.
> So, desired 16b-to-17b conversions certainly exist for M=4.
> The problem is that I know of no simple "algorithmic" way of
> implementing such conversion short of using big look-up tables. Neither
> in software nor in simple compact hardware.
> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> where I know many such algorithms.
>
> Any ideas?

Please excuse my not following.. I don't quite see what you're
getting at here. I get that you want to transform an arbitrary
16-bit value into a 17-bit value that satisfies a certain
property (such that the 16-bit value can be reconstructed), but I
don't get what the property is exactly or how a 17-bit value is
transformed back into a 16-bit value. I do realize that what
you're looking for is a nice way of turning a 16-bit value into a
17-bit value (assuming I'm not completely confused, which is
always a possibility). Can you spell it out for me in a bit more
detail?

Re: Bit sequences with bounded number of continuous 1s.

<20231206194307.00003a7a@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35490&group=comp.arch#35490

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.chmurka.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 19:43:07 +0200
Organization: A noiseless patient Spider
Lines: 254
Message-ID: <20231206194307.00003a7a@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<ukpo46$ok6v$1@dont-email.me>
<20231206154145.000016a4@yahoo.com>
<ukq553$qnva$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="9a57b320616a2c9027229db4a7d3f825";
logging-data="885858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/L8bvTL5t0loizAslNcgFOPGnZ9MrkyX4="
Cancel-Lock: sha1:I7PFEEP1b+iRK8wtqZqyd6IJZmk=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Wed, 6 Dec 2023 17:43 UTC

On Wed, 6 Dec 2023 16:48:51 +0100
David Brown <david.brown@hesbynett.no> wrote:

> On 06/12/2023 14:41, Michael S wrote:
> > On Wed, 6 Dec 2023 13:06:29 +0100
> > David Brown <david.brown@hesbynett.no> wrote:
> >
> >> On 06/12/2023 11:30, Michael S wrote:
> >>> Bit sequences with bounded number of continuous '1' bits are
> >>> useful for serialization of packetized data.
> >>>
> >>> For example, if we have a way of converting arbitrary n-bit words
> >>> to unique (n+1)-bit extended words with at most L continuous
> >>> leading '1' bits, at most T continuous trailing '1' bits and at
> >>> most M=L+T continuous inner '1' bits then we can transfer
> >>> multi-word packets starting with a header consisting of M+1 '1'
> >>> bits followed by a single '0' bit and then by desired number of
> >>> extended words. Any gaps between packets are filled with '1' bits.
> >>> Receiver can easily synchronize itself to incoming packets and
> >>> will quickly and robustly recovers after media errors.
> >>> Such conversions have are other application as well. They always
> >>> have.
> >>>
> >>> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> >>> particular practical interest.
> >>>
> >>> Brute force counting proves that both for (L=1,T=3) and for
> >>> (L=2,T=2) there exist more than 2**16 extended words with desired
> >>> properties. For those interested, 76735 and 83665 extended words,
> >>> respectively. So, desired 16b-to-17b conversions certainly exist
> >>> for M=4. The problem is that I know of no simple "algorithmic" way
> >>> of implementing such conversion short of using big look-up tables.
> >>> Neither in software nor in simple compact hardware.
> >>> That is in strong contrast to case of M=5 (both L=1,T=4 and
> >>> L=2,T=3) where I know many such algorithms.
> >>>
> >>> Any ideas?
> >>> May be, somebody had seen such methods in patents, preferably old
> >>> and outdated?
> >>> Or, better, in books or articles?
> >>>
> >>>
> >>
> >> Usually the interest is in limiting length of repeats at any point
> >> in the data, not just the start and end. After all, it's not
> >> going to help anyone if the start and end of the word are limited
> >> to no more than 2 consecutive ones if you have a run of 10 ones in
> >> the middle of the word.
> >
> > Please read again my original post. It's all covered.
>
> Not as far as I can see. Maybe I have misunderstood you, or maybe
> you have not understood me. Correct me if I am wrong below.
>
> For L = 2, T = 2 (as an example), your coding would not allow
>
> 1100 1110 0101 1011 1
>
> because it ends in 3 ones. (The spaces are just to make it easier to
> read.) Your aim is to avoid the join of two packets having more than
> 4 ones in a row, so that if the receive sees 5 or more ones in a row,
> it knows it is an inter-packet idle period.
>
> But you /would/ allow :
>
> 1000 1111 1111 1000 1
>
> This has no more than 1 leading or trailing one. But it has 8 ones
> in a row. A receiver that is trying to synchronise would see a run
> of 8 ones and interpret it as an inter-packet idle gap.
>
> Additionally, how is the receiver supposed to identify the number of
> ones at the start of a packet, after a gap? If it sees a sequence of
> 10 ones, then a zero (and then other values), how does it know if the
> packet is starting with 0, 10, or 110 ?
>
> Synchronisation for the start of a frame in communication is always
> done with specific patterns containing at least one transition. For
> example, with good-old-fashioned UART communication, break
> (inter-character pauses) are a string of at least 10 ones, and the
> start of a character frame is always a zero with the line going from
> high to low (either after a break, or after the previous character's
> stop bit).
>

You still refused to read my original post without skipping words.
In particular you obviously did not read this part "and at most M=L+T
continuous inner '1' bits".

> >
> >>
> >> It is also common to want to combine other encoding features at the
> >> same time. Limiting the lengths of both zeros and ones means you
> >> have transitions, which are needed for clock recovery and
> >> synchronisation. For many transmissions, approximately 0 DC balance
> >> is highly desirable. And you may also want forward error correction
> >> or at least error detection.
> >>
> >
> > In my case of interest the bits are encoded as NRZI so mere
> > presence of '1' bits in headers and gaps guarantees that there are
> > at least M transitions per packet.
>
> OK, but you still have to make sure there are a suitable minimum
> number of ones per frame. As it stands, a run of 17 zeros would fit
> your requirements.
>
> > It is sufficient for clock phase recovery
> > at receiver (a.k.a Data Phase Alignment). Clock frequency recovery
> > is not necessary, because clocks at both sides of the link are
> > derived from the same source.
>
> Fair enough. It is convenient when you have that!
>
> > Anyway, when somewhat higher density of transitions is desirable,
> > say, one transition (i.e. logical 1) per 25 or 30 bits, it's
> > absolutely trivial to add at no additional coding cost since 76735
> > is sufficiently large than 65736.
>
> Certainly you can make sure you have enough ones in the packet. (I
> would not say it is "trivial" until you have found a simple algorithm
> for coding and decoding to match your requirements - but it is
> unlikely to make the problem any harder.)
>

In the mean time I found an algorithm that works. Sort of.
Modification that causes it to generate no more than 5 leading zeros
and no more than 16 trailing zeros per extended word is indeed trivial.

I am not fully satisfied with this algorithm for two reasons:
A. It's sort of ad hoc. It works for 16->17, but I am not sure that
similar algorithms are going to work for, say 24->25 or 28->29.
B. The hardware implementation (in FPGA) is not as compact as I would
like. There are too many non-trivial constants involved and non-trivial
constants consume LEs.

> >
> > DC balance is not required in my case. When DC balance is required
> > at all, the balance requirements tend to be rather strict. Weak
> > balance is rarely of interest.
>
> Weak balance - a statistical or probabilistic balance - is certainly
> of interest in some systems. PCIe communication, for instance,
> maintains an approximate balance by using a scrambler. Some DC
> "leakage" through the terminators is enough to deal with the
> remainder, if I have understood things correctly.
>

PCIe Gen1 and Gen2 apply 8b-10b encoding that guarantees rather strong
DC balance. As observed on 10T intervals, disparity never exceeds two
bits. It is possible due too high coding rate of 5/4.
For later PCIe generations DC balance is indeed non-robust. I suspect
that hardware here is designed to transmit/receive totally non-balanced
streams. With more balanced streams it just works better == has higher
noise margin.
But I admit to not learning newer PCIe phys to the same degree I
learned earlier generations.

> Other systems use active DC balance tracking to keep a stricter
> control, and transmit different encodings to minimise the DC
> imbalance. But even then, it is never perfect - no line drivers are
> equally strong for high and low drives, and there are always
> differences in the latencies for high and low transitions. Even with
> an ideal 50% duty cycle coming in, there will be a small DC bias that
> must be leaked away. So DC balancing in the encoding is /always/
> weak, if you look closely enough.
>

N/A for differential, which happens to dominate high speed outside of
DRAM buses.

> > Typically disparity should be bounded to few
> > percents over rather short intervals of few dozens of bit times. Or
> > better. In order to achieve that you need rather higher coding rates
> > than 17/16 or rather longer coding blocks than 16 bits. Or sometimes
> > both. In other words, you have to play completely different game.
> >
> >> Maybe you have some specialised use in mind?
> >>
> >
> > I do have a specialized use in mind, but for my specialized use the
> > schemes that limit the number of continuous '1' bits to 4 are not
> > necessary. Limit of 5 is perfectly o.k. and as I mentioned in
> > original post, it's easy. In fact, if 5 was not so easy, from
> > practical perspective I would be satisfied with significantly
> > higher limit, like 7 or 8.
> >
> > The question about bound of 4 is more of theoretical than of
> > practical interest. Yesterday it surprised me to find out that such
> > codes exists not just for 16b-to-17b, but for much longer words as
> > well, up to 30b-to-31b. But there is a big difference between
> > knowing that the code exists and finding a way to implement it in
> > environment with limited resources.
> >
> >> Otherwise, these links might be of interest to you, on related
> >> ideas :
> >>
> >> <https://en.wikipedia.org/wiki/64b/66b_encoding>
> >> <https://en.wikipedia.org/wiki/8b/10b_encoding>
> >> <https://en.wikipedia.org/wiki/Bit_stuffing>
> >>
> >>
> >
> > Don't you know me long enough to guess that I had read that much and
> > more?
> >
>
> I try to answer questions as I see them, rather than attempting to
> keep track of what everyone knows.
>
> When I see a post that looks like someone is trying to invent a new
> technique in an existing field, and their ideas are noticeably
> different from standard solutions, there are various possible
> explanations. One is that they have a very niche use-case so the
> standard ideas don't suit. Another is that they have come up with
> something truly original and exciting. Most common, however, is that
> they don't know much about the field and are not very aware of
> existing methods, and are currently just playing about with ideas for
> solutions. Option 3 is the first assumption until demonstrated
> otherwise. It looks like option 1 applies to you here.
>
> I can't say that I really understand why you have the requirements
> you are asking for here - they don't make sense to me. Obviously I
> don't have a full view of your system, but I cannot currently imagine
> a system for which your requirements would be the best way to solve
> the problem.
>
>

Click here to read the complete article

Re: Bit sequences with bounded number of continuous 1s.

<20231206200105.00005b68@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35493&group=comp.arch#35493

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.niel.me!news.gegeweb.eu!gegeweb.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 20:01:05 +0200
Organization: A noiseless patient Spider
Lines: 64
Message-ID: <20231206200105.00005b68@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<868r67w9fq.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="9a57b320616a2c9027229db4a7d3f825";
logging-data="885858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+xW4zmrHh3EdO4l0zSGqlEuYWO3c+E+aA="
Cancel-Lock: sha1:wzZAt83Yv/T0tcdWI1YkRFyFs1k=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Wed, 6 Dec 2023 18:01 UTC

On Wed, 06 Dec 2023 09:02:49 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

> Michael S <already5chosen@yahoo.com> writes:
>
> > Bit sequences with bounded number of continuous '1' bits are useful
> > for serialization of packetized data.
> >
> > For example, if we have a way of converting arbitrary n-bit words to
> > unique (n+1)-bit extended words with at most L continuous leading
> > '1' bits, at most T continuous trailing '1' bits and at most M=L+T
> > continuous inner '1' bits then we can transfer multi-word packets
> > starting with a header consisting of M+1 '1' bits followed by a
> > single '0' bit and then by desired number of extended words. Any
> > gaps between packets are filled with '1' bits.
> > Receiver can easily synchronize itself to incoming packets and will
> > quickly and robustly recovers after media errors.
> > Such conversions have are other application as well. They always
> > have.
> >
> > For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> > particular practical interest.
> >
> > Brute force counting proves that both for (L=1,T=3) and for
> > (L=2,T=2) there exist more than 2**16 extended words with desired
> > properties. For those interested, 76735 and 83665 extended words,
> > respectively. So, desired 16b-to-17b conversions certainly exist
> > for M=4. The problem is that I know of no simple "algorithmic" way
> > of implementing such conversion short of using big look-up tables.
> > Neither in software nor in simple compact hardware.
> > That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> > where I know many such algorithms.
> >
> > Any ideas?
>
> Please excuse my not following.. I don't quite see what you're
> getting at here. I get that you want to transform an arbitrary
> 16-bit value into a 17-bit value that satisfies a certain
> property (such that the 16-bit value can be reconstructed), but I
> don't get what the property is exactly or how a 17-bit value is
> transformed back into a 16-bit value. I do realize that what
> you're looking for is a nice way of turning a 16-bit value into a
> 17-bit value (assuming I'm not completely confused, which is
> always a possibility). Can you spell it out for me in a bit more
> detail?

I thought it was rather obvious but rather obviously it was not :(
So let's do it again.

1. Reversibility.
No two 16-bit input patterns correspond to same 17-bit output pattern.
2. Leading '1' bits.
Output patterns that have L+1 or more consecutive '1' MS bits are
illegal.
3. Trailing '1' bits.
Output patterns that have T+1 or more consecutive '1' LS bits are
illegal.
4. Inner '1' bits.
Output patterns that have L+T+1 or more consecutive '1' bits starting
at any position are illegal.

(L,T) = either (1,3) or (2,2).

Re: Bit sequences with bounded number of continuous 1s.

<20231206202236.00005bfe@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35494&group=comp.arch#35494

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.swapon.de!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 20:22:36 +0200
Organization: A noiseless patient Spider
Lines: 101
Message-ID: <20231206202236.00005bfe@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="9a57b320616a2c9027229db4a7d3f825";
logging-data="885858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18dTojnp3LFEBfN4A2U+1IRI43Q/WIeCQk="
Cancel-Lock: sha1:5ymzxJAK8H4gUAGRPnupXPiEbvs=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Wed, 6 Dec 2023 18:22 UTC

On Wed, 6 Dec 2023 12:30:11 +0200
Michael S <already5chosen@yahoo.com> wrote:

> Bit sequences with bounded number of continuous '1' bits are useful
> for serialization of packetized data.
>
> For example, if we have a way of converting arbitrary n-bit words to
> unique (n+1)-bit extended words with at most L continuous leading '1'
> bits, at most T continuous trailing '1' bits and at most M=L+T
> continuous inner '1' bits then we can transfer multi-word packets
> starting with a header consisting of M+1 '1' bits followed by a single
> '0' bit and then by desired number of extended words. Any gaps between
> packets are filled with '1' bits.
> Receiver can easily synchronize itself to incoming packets and will
> quickly and robustly recovers after media errors.
> Such conversions have are other application as well. They always have.
>
> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> particular practical interest.
>
> Brute force counting proves that both for (L=1,T=3) and for (L=2,T=2)
> there exist more than 2**16 extended words with desired properties.
> For those interested, 76735 and 83665 extended words, respectively.
> So, desired 16b-to-17b conversions certainly exist for M=4.
> The problem is that I know of no simple "algorithmic" way of
> implementing such conversion short of using big look-up tables.
> Neither in software nor in simple compact hardware.
> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> where I know many such algorithms.
>
> Any ideas?
> May be, somebody had seen such methods in patents, preferably old and
> outdated?
> Or, better, in books or articles?
>
>
>

Here is my current algorithm.
As mentioned in other post, it is unsatisfactory for two reasons:
A. It's sort of ad hoc. It works for 16->17, but I am not sure that
similar algorithms are going to work for, say 24->25 or 28->29.
B. The hardware implementation (in FPGA) is not as compact as I would
like. There are too many non-trivial constants involved. Non-trivial
constants consume LEs.

static const unsigned bv_tab[17] = { // bv_tab[k]/bv_tab[k+i] <= 2
42557,
21647,
11011,
5601,
2849,
1449,
737,
375,
191,
97,
49,
25,
13,
7,
4,
2,
1,
};

// encode 16->17
unsigned enc(unsigned x)
{ // avoid long strings of zeros by mapping inputs in range
// [0:2**13-1] to [2**16:2**16+2**13-1]
if (x < 8*1024)
x += 0x10000;

// convert x to sub-binary non-constant base
unsigned y = 0;
for (int i = 0; i < 17; ++i) {
y += y;
const unsigned bv = bv_tab[i];
if (x >= bv) {
x -= bv;
y += 1;
}
}
return y & 0x1FFFF;
}

// decode 17->16
unsigned dec(unsigned x)
{ // convert x from sub-binary non-constant base back to binary
unsigned y = 0;
for (int i = 0; i < 17; ++i) {
if (x & 0x10000) {
y += bv_tab[i];
}
x += x;
}
return y & 0xFFFF;
}

Re: Bit sequences with bounded number of continuous 1s.

<ukqnfb$tjh5$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35501&group=comp.arch#35501

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 22:01:31 +0100
Organization: A noiseless patient Spider
Lines: 70
Message-ID: <ukqnfb$tjh5$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com> <ukpo46$ok6v$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Injection-Date: Wed, 6 Dec 2023 21:01:31 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="b6e99571b94933695ff5fba82b4bdd6e";
logging-data="970277"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/fgXaMgXl/zNswUqmuoMU2pLKqS5K+zZANPJPbGhGyJA=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.17.1
Cancel-Lock: sha1:AwlIRhPNU7zAyguqvhCO5YpYtvA=
In-Reply-To: <ukpo46$ok6v$1@dont-email.me>

by: Terje Mathisen - Wed, 6 Dec 2023 21:01 UTC

David Brown wrote:
> On 06/12/2023 11:30, Michael S wrote:
>> Bit sequences with bounded number of continuous '1' bits are useful for
>> serialization of packetized data.
>>
>> For example, if we have a way of converting arbitrary n-bit words to
>> unique (n+1)-bit extended words with at most L continuous leading '1'
>> bits, at most T continuous trailing '1' bits and at most M=L+T
>> continuous inner '1' bits then we can transfer multi-word packets
>> starting with a header consisting of M+1 '1' bits followed by a single
>> '0' bit and then by desired number of extended words. Any gaps between
>> packets are filled with '1' bits.
>> Receiver can easily synchronize itself to incoming packets and will
>> quickly and robustly recovers after media errors.
>> Such conversions have are other application as well. They always have.
>>
>> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
>> particular practical interest.
>>
>> Brute force counting proves that both for (L=1,T=3) and for (L=2,T=2)
>> there exist more than 2**16 extended words with desired properties.
>> For those interested, 76735 and 83665 extended words, respectively.
>> So, desired 16b-to-17b conversions certainly exist for M=4.
>> The problem is that I know of no simple "algorithmic" way of
>> implementing such conversion short of using big look-up tables. Neither
>> in software nor in simple compact hardware.
>> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
>> where I know many such algorithms.
>>
>> Any ideas?
>> May be, somebody had seen such methods in patents, preferably old and
>> outdated?
>> Or, better, in books or articles?
>>
>>
>
> Usually the interest is in limiting length of repeats at any point in
> the data, not just the start and end. After all, it's not going to help
> anyone if the start and end of the word are limited to no more than 2
> consecutive ones if you have a run of 10 ones in the middle of the word.
>
> It is also common to want to combine other encoding features at the same
> time. Limiting the lengths of both zeros and ones means you have
> transitions, which are needed for clock recovery and synchronisation.
> For many transmissions, approximately 0 DC balance is highly desirable.
> And you may also want forward error correction or at least error detection.

AFAIR an 8->10 encoding is used for all gigabit ethernet links, for
exactly those purposes.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Bit sequences with bounded number of continuous 1s.

<ukqru8$ueqe$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35507&group=comp.arch#35507

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 16:17:41 -0600
Organization: A noiseless patient Spider
Lines: 135
Message-ID: <ukqru8$ueqe$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 6 Dec 2023 22:17:44 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="325390cd5f4a9d68788313ed1525711c";
logging-data="998222"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19SRLD6jnA730f4dc+Zrwi2"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:tNtWXz3tDopLDnpq5Utd5EiF+bg=
Content-Language: en-US
In-Reply-To: <20231206123011.000031e8@yahoo.com>

by: BGB - Wed, 6 Dec 2023 22:17 UTC

On 12/6/2023 4:30 AM, Michael S wrote:
> Bit sequences with bounded number of continuous '1' bits are useful for
> serialization of packetized data.
>
> For example, if we have a way of converting arbitrary n-bit words to
> unique (n+1)-bit extended words with at most L continuous leading '1'
> bits, at most T continuous trailing '1' bits and at most M=L+T
> continuous inner '1' bits then we can transfer multi-word packets
> starting with a header consisting of M+1 '1' bits followed by a single
> '0' bit and then by desired number of extended words. Any gaps between
> packets are filled with '1' bits.
> Receiver can easily synchronize itself to incoming packets and will
> quickly and robustly recovers after media errors.
> Such conversions have are other application as well. They always have.
>
> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> particular practical interest.
>
> Brute force counting proves that both for (L=1,T=3) and for (L=2,T=2)
> there exist more than 2**16 extended words with desired properties.
> For those interested, 76735 and 83665 extended words, respectively.
> So, desired 16b-to-17b conversions certainly exist for M=4.
> The problem is that I know of no simple "algorithmic" way of
> implementing such conversion short of using big look-up tables. Neither
> in software nor in simple compact hardware.
> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> where I know many such algorithms.
>
> Any ideas?
> May be, somebody had seen such methods in patents, preferably old and
> outdated?
> Or, better, in books or articles?
>

For hardware, 8b/10b looks like a more sane option.

Or, maybe cheaper, could be a 4b to 5b mapping.

Say, 4 to 5 bit:
0000 -> 01010
0001 -> 01001
0010 -> 01110
0011 -> 01101
0101 -> 00001
0110 -> 00110
0111 -> 00101

1000 -> 11010
1001 -> 11001
1010 -> 11110
1011 -> 11100
1101 -> 10001
1110 -> 10110
1111 -> 10101

Better schemes are likely possible, this was basically "input^0101" with
the last bit as a "balancing bit".

Could encode two such values for 8 bits, or 4 for 16 bits.

This could be reduced to 16->18 or 16->17 by as selecting an additional
XOR mask for the other bits, so:
(bits^0x5555)^(bal?MASK1:MASK2);
(bits^0x5555)^(bal1?(bal0?MASK1:MASK2):(bal0?MASK3:MASK4));

This would avoid needing giant lookup tables, and with appropriate masks
could remain roughly DC-biased (though, a 16->18 bit version, with 4
possible masks, count be better for DC balancing).

Say, XOR masks (2b):
00: 111011101110
01: 011001100110
10: 100110011001
11: 000100010001

One other possible feature that could be investigated (more likely if
using the 16 bits via 20 bits scheme) would be to tweak the encoding to
make it easier to re-synchronize the bitstream to an 8 or 16-bit
boundary if the link is disrupted for whatever reason.

Likely easiest way to do this would be to use an 0x55AA mask vs 0x5555
(for 16-bit alignment), or 0x5A5A for 8-bit alignment, with the
balancing bits tweaked such that the two cases are (mostly) mutually
exclusive.

By looking at a chunk of, say, 20 or 40 bits, it would be possible to
estimate within a reasonable probability whether or not the bitstream is
correctly aligned.

Say, 4 to 5 bit (5 and A):
0000 -> 01010, 10100
0001 -> 01001, 10110
0010 -> 01110, 10000
0011 -> 01101, 10011
0101 -> 00001, 11110
0110 -> 00110, 11000
0111 -> 00101, 11011
1000 -> 11010, 00100
1001 -> 11001, 00111
1010 -> 11110, 00001
1011 -> 11100, 00011
1101 -> 10001, 01111
1110 -> 10110, 01001
1111 -> 10101, 01011

Or, inverses:
00000: Invalid
00001: 0101(5), 1010(A)
00010: Invalid
00011: 1011(A), Invalid(5)
00100: 1000(A), Invalid(5)
00101: 0111(5), Invalid(A)
00110: 0110(5), Invalid(A)
00111: 1001(A), Invalid(5)
01000: Invalid
01001: 0001(5), 1110(A)
... (Could fill out the rest, but should be obvious enough).
11111: Invalid

Better is likely possible, but this was just what came to mind at the
moment.

Would prefer to keep away from the NRZI/Bit-Stuffing scheme, as this
"kinda sucks" IMHO (bad enough experiences with this trying to implement
hardware interfaces for USB).

....

Re: Bit sequences with bounded number of continuous 1s.

<20231207021135.00006e20@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35510&group=comp.arch#35510

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Thu, 7 Dec 2023 02:11:35 +0200
Organization: A noiseless patient Spider
Lines: 82
Message-ID: <20231207021135.00006e20@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<ukpo46$ok6v$1@dont-email.me>
<ukqnfb$tjh5$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Injection-Info: dont-email.me; posting-host="8dd00fbefe4b0df98890fe3c8bf3a765";
logging-data="1023828"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX199R5td9YifF3rHbWbXOc7/sxrTMbZk7ps="
Cancel-Lock: sha1:cABWlogXisr3e8m9m6yFg2Ocy9s=
X-Newsreader: Claws Mail 4.1.1 (GTK 3.24.34; x86_64-w64-mingw32)

by: Michael S - Thu, 7 Dec 2023 00:11 UTC

On Wed, 6 Dec 2023 22:01:31 +0100
Terje Mathisen <terje.mathisen@tmsw.no> wrote:

> David Brown wrote:
> > On 06/12/2023 11:30, Michael S wrote:
> >> Bit sequences with bounded number of continuous '1' bits are
> >> useful for serialization of packetized data.
> >>
> >> For example, if we have a way of converting arbitrary n-bit words
> >> to unique (n+1)-bit extended words with at most L continuous
> >> leading '1' bits, at most T continuous trailing '1' bits and at
> >> most M=L+T continuous inner '1' bits then we can transfer
> >> multi-word packets starting with a header consisting of M+1 '1'
> >> bits followed by a single '0' bit and then by desired number of
> >> extended words. Any gaps between packets are filled with '1' bits.
> >> Receiver can easily synchronize itself to incoming packets and will
> >> quickly and robustly recovers after media errors.
> >> Such conversions have are other application as well. They always
> >> have.
> >>
> >> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> >> particular practical interest.
> >>
> >> Brute force counting proves that both for (L=1,T=3) and for
> >> (L=2,T=2) there exist more than 2**16 extended words with desired
> >> properties. For those interested, 76735 and 83665 extended words,
> >> respectively. So, desired 16b-to-17b conversions certainly exist
> >> for M=4. The problem is that I know of no simple "algorithmic" way
> >> of implementing such conversion short of using big look-up tables.
> >> Neither in software nor in simple compact hardware.
> >> That is in strong contrast to case of M=5 (both L=1,T=4 and
> >> L=2,T=3) where I know many such algorithms.
> >>
> >> Any ideas?
> >> May be, somebody had seen such methods in patents, preferably old
> >> and outdated?
> >> Or, better, in books or articles?
> >>
> >>
> >
> > Usually the interest is in limiting length of repeats at any point
> > in the data, not just the start and end. After all, it's not going
> > to help anyone if the start and end of the word are limited to no
> > more than 2 consecutive ones if you have a run of 10 ones in the
> > middle of the word.
> >
> > It is also common to want to combine other encoding features at the
> > same time. Limiting the lengths of both zeros and ones means you
> > have transitions, which are needed for clock recovery and
> > synchronisation. For many transmissions, approximately 0 DC balance
> > is highly desirable. And you may also want forward error correction
> > or at least error detection.
>
> AFAIR an 8->10 encoding is used for all gigabit ethernet links, for
> exactly those purposes.
>
> Terje
>
>

In 1GbE world 8B10B is used only for one of several available variants
of MAC-to-PHY interface (SGMII) and by optical media (all sub-variants
of 1000BaseX).

The dominant 1GbE media standard, 1000BASE-T, does not use 8B10B. It
applies much more complicated 8bit to 12-bit mapping with multiple
mappings corresponding to each input octet.
Strictly speaking, classic 8B10B also has multiple (2) mappings for
some of input octets, but by comparison it is very simple.

If we are going to believe Wikipedia, another extant copper 1GbE
variant, 1000BASE-T1, uses 80->81 bit code.

Re: Bit sequences with bounded number of continuous 1s.

<ukr44o$vi39$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35511&group=comp.arch#35511

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 6 Dec 2023 16:37:43 -0800
Organization: A noiseless patient Spider
Lines: 47
Message-ID: <ukr44o$vi39$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 7 Dec 2023 00:37:44 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="4095cb928f6ef1e4d195307583ad3c4f";
logging-data="1034345"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19eMUoO4B1+lhp8YdkNAV6KHlcGIs6i37w="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:9VwqvxUwbe7hBPTFf9VTYRIwD7E=
In-Reply-To: <20231206123011.000031e8@yahoo.com>
Content-Language: en-US

by: Chris M. Thomasson - Thu, 7 Dec 2023 00:37 UTC

On 12/6/2023 2:30 AM, Michael S wrote:
> Bit sequences with bounded number of continuous '1' bits are useful for
> serialization of packetized data.
>
> For example, if we have a way of converting arbitrary n-bit words to
> unique (n+1)-bit extended words with at most L continuous leading '1'
> bits, at most T continuous trailing '1' bits and at most M=L+T
> continuous inner '1' bits then we can transfer multi-word packets
> starting with a header consisting of M+1 '1' bits followed by a single
> '0' bit and then by desired number of extended words. Any gaps between
> packets are filled with '1' bits.
> Receiver can easily synchronize itself to incoming packets and will
> quickly and robustly recovers after media errors.
> Such conversions have are other application as well. They always have.
>
> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> particular practical interest.
>
> Brute force counting proves that both for (L=1,T=3) and for (L=2,T=2)
> there exist more than 2**16 extended words with desired properties.
> For those interested, 76735 and 83665 extended words, respectively.
> So, desired 16b-to-17b conversions certainly exist for M=4.
> The problem is that I know of no simple "algorithmic" way of
> implementing such conversion short of using big look-up tables.

There almost has to be a "direct" (algorithmic) method without using
external lookup tables... Humm...

> Neither
> in software nor in simple compact hardware.
> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> where I know many such algorithms.
>
> Any ideas?
> May be, somebody had seen such methods in patents, preferably old and
> outdated?
> Or, better, in books or articles?
>
>
>

For some god damn reason this makes me think of an atomic 63 bit counter:

https://groups.google.com/g/comp.lang.asm.x86/c/FScbTaQEYLc

Not exactly sure why yet...

Re: Bit sequences with bounded number of continuous 1s.

<864jguwv1s.fsf@linuxsc.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35514&group=comp.arch#35514

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.neodome.net!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Wed, 06 Dec 2023 19:28:15 -0800
Organization: A noiseless patient Spider
Lines: 101
Message-ID: <864jguwv1s.fsf@linuxsc.com>
References: <20231206123011.000031e8@yahoo.com> <868r67w9fq.fsf@linuxsc.com> <20231206200105.00005b68@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="988f1d4fd55cadd4317632d66d2ab3c1";
logging-data="1197628"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19lZfCxYfLFQcWiiYFHvsSRsFirB/1O40E="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:YfCP1zls0ChtWxM4/x70TvNYeW4=
sha1:L1n8oCXwy1s+IS4QWy1Lh7kTWFA=

by: Tim Rentsch - Thu, 7 Dec 2023 03:28 UTC

Michael S <already5chosen@yahoo.com> writes:

> On Wed, 06 Dec 2023 09:02:49 -0800
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>
>> Michael S <already5chosen@yahoo.com> writes:
>>
>>> Bit sequences with bounded number of continuous '1' bits are useful
>>> for serialization of packetized data.
>>>
>>> For example, if we have a way of converting arbitrary n-bit words to
>>> unique (n+1)-bit extended words with at most L continuous leading
>>> '1' bits, at most T continuous trailing '1' bits and at most M=L+T
>>> continuous inner '1' bits then we can transfer multi-word packets
>>> starting with a header consisting of M+1 '1' bits followed by a
>>> single '0' bit and then by desired number of extended words. Any
>>> gaps between packets are filled with '1' bits.
>>> Receiver can easily synchronize itself to incoming packets and will
>>> quickly and robustly recovers after media errors.
>>> Such conversions have are other application as well. They always
>>> have.
>>>
>>> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
>>> particular practical interest.
>>>
>>> Brute force counting proves that both for (L=1,T=3) and for
>>> (L=2,T=2) there exist more than 2**16 extended words with desired
>>> properties. For those interested, 76735 and 83665 extended words,
>>> respectively. So, desired 16b-to-17b conversions certainly exist
>>> for M=4. The problem is that I know of no simple "algorithmic" way
>>> of implementing such conversion short of using big look-up tables.
>>> Neither in software nor in simple compact hardware.
>>> That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
>>> where I know many such algorithms.
>>>
>>> Any ideas?
>>
>> Please excuse my not following.. I don't quite see what you're
>> getting at here. I get that you want to transform an arbitrary
>> 16-bit value into a 17-bit value that satisfies a certain
>> property (such that the 16-bit value can be reconstructed), but I
>> don't get what the property is exactly or how a 17-bit value is
>> transformed back into a 16-bit value. I do realize that what
>> you're looking for is a nice way of turning a 16-bit value into a
>> 17-bit value (assuming I'm not completely confused, which is
>> always a possibility). Can you spell it out for me in a bit more
>> detail?
>
> I thought it was rather obvious but rather obviously it was not :(
> So let's do it again.
>
> 1. Reversibility.
> No two 16-bit input patterns correspond to same 17-bit output pattern.
> 2. Leading '1' bits.
> Output patterns that have L+1 or more consecutive '1' MS bits are
> illegal.
> 3. Trailing '1' bits.
> Output patterns that have T+1 or more consecutive '1' LS bits are
> illegal.
> 4. Inner '1' bits.
> Output patterns that have L+T+1 or more consecutive '1' bits starting
> at any position are illegal.
>
> (L,T) = either (1,3) or (2,2).

Okay, I get it now.

Here is an idea, such as it is. Clunky for software, might
be okay for hardware.

Subdivide the input space by looking at the top six bits. There
are three ranges (with (L,T) = (2,2))

000000... to 100011... maps to 0xxxxx...
100100... to 110110... maps to 10xxxx...
110111... to 111111... maps to 110xxx...

which are

36864 values of 47338 code points (77.87 percent)
19456 values of 24079 code points (80.80 percent)
9216 values of 12248 code points (75.24 percent)
===== =====
65536 83665

Continue subdividing recursively, choosing nice value ranges in
binary to correspond to subdivisions in the output space. After
the first level of dividing, subdivisions have five output ranges
to choose from rather than just three (until we reach the end, of
course). For example, values in the range

[ 100000.. to 100100.. ) could be mapped to 01110xxx and 011110xxx

(4096 values of 3169 + 1612 = 4781 code points, 85.67 percent)

All the above just playing around by hand after the initial idea
of subdividing based on a small number of leading bits.

Not sure if this approach is good enough for you. I admit it's
not pretty. My sense is it's plausible, even if rather ugly.
And who knows, it may spark a different idea or a better approach.

Re: Bit sequences with bounded number of continuous 1s.

<20231207113516.00004170@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35518&group=comp.arch#35518

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.furie.org.uk!usenet.goja.nl.eu.org!2.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Thu, 7 Dec 2023 11:35:16 +0200
Organization: A noiseless patient Spider
Lines: 74
Message-ID: <20231207113516.00004170@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<ukqru8$ueqe$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
Injection-Info: dont-email.me; posting-host="16c970a4c64ab584198b3a264c6cfa20";
logging-data="885858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+mHeoVPeLMCF2lZNU7+6BqNNtFfDqzqjA="
Cancel-Lock: sha1:qdHNF8EtsjSm5AlQu4Il9nrbu6o=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Thu, 7 Dec 2023 09:35 UTC

On Wed, 6 Dec 2023 16:17:41 -0600
BGB <cr88192@gmail.com> wrote:

> On 12/6/2023 4:30 AM, Michael S wrote:
> > Bit sequences with bounded number of continuous '1' bits are useful
> > for serialization of packetized data.
> >
> > For example, if we have a way of converting arbitrary n-bit words to
> > unique (n+1)-bit extended words with at most L continuous leading
> > '1' bits, at most T continuous trailing '1' bits and at most M=L+T
> > continuous inner '1' bits then we can transfer multi-word packets
> > starting with a header consisting of M+1 '1' bits followed by a
> > single '0' bit and then by desired number of extended words. Any
> > gaps between packets are filled with '1' bits.
> > Receiver can easily synchronize itself to incoming packets and will
> > quickly and robustly recovers after media errors.
> > Such conversions have are other application as well. They always
> > have.
> >
> > For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> > particular practical interest.
> >
> > Brute force counting proves that both for (L=1,T=3) and for
> > (L=2,T=2) there exist more than 2**16 extended words with desired
> > properties. For those interested, 76735 and 83665 extended words,
> > respectively. So, desired 16b-to-17b conversions certainly exist
> > for M=4. The problem is that I know of no simple "algorithmic" way
> > of implementing such conversion short of using big look-up tables.
> > Neither in software nor in simple compact hardware.
> > That is in strong contrast to case of M=5 (both L=1,T=4 and L=2,T=3)
> > where I know many such algorithms.
> >
> > Any ideas?
> > May be, somebody had seen such methods in patents, preferably old
> > and outdated?
> > Or, better, in books or articles?
> >
>
> For hardware, 8b/10b looks like a more sane option.
>

8B10B is indeed simple, but an overhead is high.
The size of my packets vary between 4 and 33 16-bit words.
A protocol that was in use in previous generation of this product was a
simple one - each 16-bit word is prefixed by a single '0' bit, gaps
between packets are filled with 1s. This protocol requires at least
17-bit gaps between packets so maximal packet rate for a given number of
words N derived as: Rate = 1/(17*N+17).
With 8B10B the formula is: Rate = 1/(20*N+10).
One can see that 8B10B is strictly slower than an old simple scheme for
all possible packet sizes. It happens to be just a little slower (1/85
vs 1/90) for smallest packets, but quite a lot slower (1/578 vs 1/670)
for biggest packets. It is also more complicated, esp. on receiver side.
Another disadvantage of 8B10B is that after link is established packets
can be started only on 10-bit boundaries. When the rate of packets is
not a multiple of 10 clocks (which is most of the time) it adds jitter
to link delay.

Conclusion: 8B10B is not an improvement on old solution. In fact, it is
an opposite of improvement.

Now, you can ask why I want improvement at all?
The answer is: mostly because I want the ability to transfer shortest
packets (4 words) at rate of 1/84 clocks. My old protocol gives 1/85.
From practical perspective very little is missing and rather simple
improvement in encoding will do the job. The desire to produce encoded
streams with at most 4 continuous 1s is rooted in the pursuit for
theoretical perfection rather than the real-world requirements of
moment.

Re: Bit sequences with bounded number of continuous 1s.

<20231207121145.00002eab@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35519&group=comp.arch#35519

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.hispagatos.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Thu, 7 Dec 2023 12:11:45 +0200
Organization: A noiseless patient Spider
Lines: 123
Message-ID: <20231207121145.00002eab@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<868r67w9fq.fsf@linuxsc.com>
<20231206200105.00005b68@yahoo.com>
<864jguwv1s.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="16c970a4c64ab584198b3a264c6cfa20";
logging-data="885858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18QWL4r457GC/qolq9ortlVW4bRk0L9tGA="
Cancel-Lock: sha1:AuizhLcPR4Uy+9xkoL2YgfX8GpY=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Thu, 7 Dec 2023 10:11 UTC

On Wed, 06 Dec 2023 19:28:15 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

> Michael S <already5chosen@yahoo.com> writes:
>
> > On Wed, 06 Dec 2023 09:02:49 -0800
> > Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> >
> >> Michael S <already5chosen@yahoo.com> writes:
> >>
> >>> Bit sequences with bounded number of continuous '1' bits are
> >>> useful for serialization of packetized data.
> >>>
> >>> For example, if we have a way of converting arbitrary n-bit words
> >>> to unique (n+1)-bit extended words with at most L continuous
> >>> leading '1' bits, at most T continuous trailing '1' bits and at
> >>> most M=L+T continuous inner '1' bits then we can transfer
> >>> multi-word packets starting with a header consisting of M+1 '1'
> >>> bits followed by a single '0' bit and then by desired number of
> >>> extended words. Any gaps between packets are filled with '1'
> >>> bits. Receiver can easily synchronize itself to incoming packets
> >>> and will quickly and robustly recovers after media errors.
> >>> Such conversions have are other application as well. They always
> >>> have.
> >>>
> >>> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
> >>> particular practical interest.
> >>>
> >>> Brute force counting proves that both for (L=1,T=3) and for
> >>> (L=2,T=2) there exist more than 2**16 extended words with desired
> >>> properties. For those interested, 76735 and 83665 extended words,
> >>> respectively. So, desired 16b-to-17b conversions certainly exist
> >>> for M=4. The problem is that I know of no simple "algorithmic"
> >>> way of implementing such conversion short of using big look-up
> >>> tables. Neither in software nor in simple compact hardware.
> >>> That is in strong contrast to case of M=5 (both L=1,T=4 and
> >>> L=2,T=3) where I know many such algorithms.
> >>>
> >>> Any ideas?
> >>
> >> Please excuse my not following.. I don't quite see what you're
> >> getting at here. I get that you want to transform an arbitrary
> >> 16-bit value into a 17-bit value that satisfies a certain
> >> property (such that the 16-bit value can be reconstructed), but I
> >> don't get what the property is exactly or how a 17-bit value is
> >> transformed back into a 16-bit value. I do realize that what
> >> you're looking for is a nice way of turning a 16-bit value into a
> >> 17-bit value (assuming I'm not completely confused, which is
> >> always a possibility). Can you spell it out for me in a bit more
> >> detail?
> >
> > I thought it was rather obvious but rather obviously it was not :(
> > So let's do it again.
> >
> > 1. Reversibility.
> > No two 16-bit input patterns correspond to same 17-bit output
> > pattern. 2. Leading '1' bits.
> > Output patterns that have L+1 or more consecutive '1' MS bits are
> > illegal.
> > 3. Trailing '1' bits.
> > Output patterns that have T+1 or more consecutive '1' LS bits are
> > illegal.
> > 4. Inner '1' bits.
> > Output patterns that have L+T+1 or more consecutive '1' bits
> > starting at any position are illegal.
> >
> > (L,T) = either (1,3) or (2,2).
>
> Okay, I get it now.
>
> Here is an idea, such as it is. Clunky for software, might
> be okay for hardware.
>
> Subdivide the input space by looking at the top six bits. There
> are three ranges (with (L,T) = (2,2))
>
> 000000... to 100011... maps to 0xxxxx...
> 100100... to 110110... maps to 10xxxx...
> 110111... to 111111... maps to 110xxx...
>
> which are
>
> 36864 values of 47338 code points (77.87 percent)
> 19456 values of 24079 code points (80.80 percent)
> 9216 values of 12248 code points (75.24 percent)
> ===== =====
> 65536 83665
>
> Continue subdividing recursively, choosing nice value ranges in
> binary to correspond to subdivisions in the output space. After
> the first level of dividing, subdivisions have five output ranges
> to choose from rather than just three (until we reach the end, of
> course). For example, values in the range
>
> [ 100000.. to 100100.. ) could be mapped to 01110xxx and 011110xxx
>
> (4096 values of 3169 + 1612 = 4781 code points, 85.67 percent)
>
> All the above just playing around by hand after the initial idea
> of subdividing based on a small number of leading bits.
>
> Not sure if this approach is good enough for you. I admit it's
> not pretty. My sense is it's plausible, even if rather ugly.
> And who knows, it may spark a different idea or a better approach.

I don't understand a "recursively" part.
May be, more detailed description of the second stage will help.

The key question I don't understand is whether boundaries used at
the stage n+1 depend on the output bits produced at stage n or
boundaries applied to residue of the input word at given stage are
always the same.
If the former, it seems that we end up with the same 64 Kword table
size as in direct approach.
If the later, it can be nice. But for that to work at each stage you
probably want to subdivide to sub-spaces with approximately equal
numbers of code points. So, may be, 01x, 01x and 1xx at first step?
Or just 0x and 1x, but then it degenerates into my current
"conversion to sub-binary non-constant base" algorithm.
And for my current algorithm I know for sure that "nice value ranges"
as you put it, can be used only at 5 initial stages. "Non nice" ranges
are required at stage 6 and later.

Re: Bit sequences with bounded number of continuous 1s.

<uksc75$18jkf$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35520&group=comp.arch#35520

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.hispagatos.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Thu, 7 Dec 2023 13:01:40 +0100
Organization: A noiseless patient Spider
Lines: 143
Message-ID: <uksc75$18jkf$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com> <ukpo46$ok6v$1@dont-email.me>
<20231206154145.000016a4@yahoo.com> <ukq553$qnva$1@dont-email.me>
<20231206194307.00003a7a@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 7 Dec 2023 12:01:41 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c07870ee0e2eb942f163d041246bc6d5";
logging-data="1330831"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19d1y92mHJGSubYGGhenxIKgk3k04LXseI="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:Mqr7yCIc1sVZK51PwdZUrZDsmWU=
In-Reply-To: <20231206194307.00003a7a@yahoo.com>
Content-Language: en-GB

by: David Brown - Thu, 7 Dec 2023 12:01 UTC

On 06/12/2023 18:43, Michael S wrote:
> On Wed, 6 Dec 2023 16:48:51 +0100
> David Brown <david.brown@hesbynett.no> wrote:
>
>> On 06/12/2023 14:41, Michael S wrote:
>>> On Wed, 6 Dec 2023 13:06:29 +0100
>>> David Brown <david.brown@hesbynett.no> wrote:
>>>

>
> You still refused to read my original post without skipping words.
> In particular you obviously did not read this part "and at most M=L+T
> continuous inner '1' bits".

I did not "refuse" to do anything, but now that you have pointed this
out, I see I misread a couple of things from the first post. Sorry for
my confusion there, leading to some unnecessary concerns in my post as
you had already covered the points.

>>
>>> Anyway, when somewhat higher density of transitions is desirable,
>>> say, one transition (i.e. logical 1) per 25 or 30 bits, it's
>>> absolutely trivial to add at no additional coding cost since 76735
>>> is sufficiently large than 65736.
>>
>> Certainly you can make sure you have enough ones in the packet. (I
>> would not say it is "trivial" until you have found a simple algorithm
>> for coding and decoding to match your requirements - but it is
>> unlikely to make the problem any harder.)
>>
>
> In the mean time I found an algorithm that works. Sort of.
> Modification that causes it to generate no more than 5 leading zeros
> and no more than 16 trailing zeros per extended word is indeed trivial.
>
> I am not fully satisfied with this algorithm for two reasons:
> A. It's sort of ad hoc. It works for 16->17, but I am not sure that
> similar algorithms are going to work for, say 24->25 or 28->29.
> B. The hardware implementation (in FPGA) is not as compact as I would
> like. There are too many non-trivial constants involved and non-trivial
> constants consume LEs.
>

You are trying to implement this in a small number of LE's in an FPGA,
right? I'm guessing you are looking at a linear shift register
arrangement of some kind, since they are amenable to compact FPGA
implementations (unlike a big lookup table - which would give a simple
solution if you could afford the space).

But it does all sound somewhat arbitrary here. Why do you pick these
specific numbers? It seems you are not concerned about errors other
than synchronisation (presumably you handle any required error checking
or correcting at a higher level). Since you have a common clock source,
are we talking about a pair of FPGAs on the same board here?

What sort of packet sizes are you using, and how critical is the
bandwidth utilisation? Could you not massively simplify it by picking M
as 16? Each 16 bits of data could be transferred as a frame consisting
of a zero followed by the original data. Inter-packet gaps will be all
ones, and you need a minimum gap of 17 ones before a new packet.
Compared to your earlier suggestions this would increase the minimum gap
time, but would make the implementation trivial.

Of course if this is all motivated more by interest and the fun and
challenge of trying to find such algorithms, rather than practical
necessity, then I'll stop trying to look at the practicalities and try
to think more about the coding patterns. Fun and interest is always a
good reason.

>>>
>>> DC balance is not required in my case. When DC balance is required
>>> at all, the balance requirements tend to be rather strict. Weak
>>> balance is rarely of interest.
>>
>> Weak balance - a statistical or probabilistic balance - is certainly
>> of interest in some systems. PCIe communication, for instance,
>> maintains an approximate balance by using a scrambler. Some DC
>> "leakage" through the terminators is enough to deal with the
>> remainder, if I have understood things correctly.
>>
>
> PCIe Gen1 and Gen2 apply 8b-10b encoding that guarantees rather strong
> DC balance. As observed on 10T intervals, disparity never exceeds two
> bits. It is possible due too high coding rate of 5/4.

Yes.

> For later PCIe generations DC balance is indeed non-robust. I suspect
> that hardware here is designed to transmit/receive totally non-balanced
> streams. With more balanced streams it just works better == has higher
> noise margin.
> But I admit to not learning newer PCIe phys to the same degree I
> learned earlier generations.

I don't know the details either, but there are higher level error
checking mechanisms for later PCIe versions. So if you are unlucky
enough to have data patterns that give too high DC bias, causing errors,
these will lead to retransmissions. And I would assume the scrambler
phase would be different even when re-transmitting the same original
data, avoiding a repeat of the problem. Anyway, your transmission
always has to cope with /some/ DC bias, even if it is small, because
it's all analogue and never entirely precise. Leaking away the bias
through terminators is easy enough - you just don't want to have to
handle too much, because it costs power and slows the signal.

>>
>> I can't say that I really understand why you have the requirements
>> you are asking for here - they don't make sense to me. Obviously I
>> don't have a full view of your system, but I cannot currently imagine
>> a system for which your requirements would be the best way to solve
>> the problem.
>>
>>
>
> I tend to think about coding techniques 'for transparency' as
> intellectual assets. You want to have more of them even in the absence
> of immediate application.

OK, that makes sense, and I fully agree on that principle.

> My absolute favorite is Consistent Overhead Byte Stuffing a.k.a. COBS.
> It featured in few proposals, but never made it to any major standard.
> Despite it, it's a work of beauty and I use variants of COBS at work
> rather intensely, typically for communication between parts of our
> embedded systems.

It is certainly an interesting solution to a common problem. It is a
long time since I heard about COBS - thank you for reminding me of it!

>
> BTW, ideas of COBS are applicable among other approaches to avoiding
> 5-long bit patterns. Not so much to 4-long bit patterns.
> The difference here is that avoidance of 5-long patterns can be
> attacked by splitting the word into 3-bit and 4-bit symbols and as long
> as there are symbols COBS has a chance.
> For 4-long patterns partitioning to 4-bit symbols, even when
> interleaved with 3-bit symbols, does not work for an obvious reasons.
> And partitioning [of 17-bit extended word] to 3-bit symbols does not
> work for even more obvious reason: 7**5 * 3 < 2**16.
>

Re: Bit sequences with bounded number of continuous 1s.

<RplcN.2$xHn7.1@fx14.iad>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35527&group=comp.arch#35527

copy link Newsgroups: comp.arch

Path: i2pn2.org!rocksolid2!news.neodome.net!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx14.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Bit sequences with bounded number of continuous 1s.
Newsgroups: comp.arch
References: <20231206123011.000031e8@yahoo.com> <ukqru8$ueqe$1@dont-email.me>
Lines: 14
Message-ID: <RplcN.2$xHn7.1@fx14.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Thu, 07 Dec 2023 15:21:21 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Thu, 07 Dec 2023 15:21:21 GMT
X-Received-Bytes: 956

by: Scott Lurndal - Thu, 7 Dec 2023 15:21 UTC

BGB <cr88192@gmail.com> writes:
>On 12/6/2023 4:30 AM, Michael S wrote:

>> May be, somebody had seen such methods in patents, preferably old and
>> outdated?
>> Or, better, in books or articles?
>>
>
>For hardware, 8b/10b looks like a more sane option.

Although that has 25% overhead.

Using 64b/66b reduces the overhead to 3.125%.

Re: Bit sequences with bounded number of continuous 1s.

<%rlcN.3$xHn7.1@fx14.iad>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35528&group=comp.arch#35528

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx14.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Bit sequences with bounded number of continuous 1s.
Newsgroups: comp.arch
References: <20231206123011.000031e8@yahoo.com> <ukpo46$ok6v$1@dont-email.me> <ukqnfb$tjh5$1@dont-email.me> <20231207021135.00006e20@yahoo.com>
Lines: 73
Message-ID: <%rlcN.3$xHn7.1@fx14.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Thu, 07 Dec 2023 15:23:39 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Thu, 07 Dec 2023 15:23:39 GMT
X-Received-Bytes: 4020

by: Scott Lurndal - Thu, 7 Dec 2023 15:23 UTC

Michael S <already5chosen@yahoo.com> writes:
>On Wed, 6 Dec 2023 22:01:31 +0100
>Terje Mathisen <terje.mathisen@tmsw.no> wrote:
>
>> David Brown wrote:
>> > On 06/12/2023 11:30, Michael S wrote: =20
>> >> Bit sequences with bounded number of continuous '1' bits are
>> >> useful for serialization of packetized data.
>> >>
>> >> For example, if we have a way of converting arbitrary n-bit words
>> >> to unique (n+1)-bit extended words with at most L continuous
>> >> leading '1' bits, at most T continuous trailing '1' bits and at
>> >> most M=3DL+T continuous inner '1' bits then we can transfer
>> >> multi-word packets starting with a header consisting of M+1 '1'
>> >> bits followed by a single '0' bit and then by desired number of
>> >> extended words. Any gaps between packets are filled with '1' bits.
>> >> Receiver can easily synchronize itself to incoming packets and will
>> >> quickly and robustly recovers after media errors.
>> >> Such conversions have are other application as well. They always
>> >> have.
>> >>
>> >> For me, a case of n=3D16 (16-bit words, 17-bit extended words) is of
>> >> particular practical interest.
>> >>
>> >> Brute force counting proves that both for (L=3D1,T=3D3) and for
>> >> (L=3D2,T=3D2) there exist more than 2**16 extended words with desired
>> >> properties. For those interested, 76735 and 83665 extended words,
>> >> respectively. So, desired 16b-to-17b conversions certainly exist
>> >> for M=3D4. The problem is that I know of no simple "algorithmic" way
>> >> of implementing such conversion short of using big look-up tables.
>> >> Neither in software nor in simple compact hardware.
>> >> That is in strong contrast to case of M=3D5 (both L=3D1,T=3D4 and
>> >> L=3D2,T=3D3) where I know many such algorithms.
>> >>
>> >> Any ideas?
>> >> May be, somebody had seen such methods in patents, preferably old
>> >> and outdated?
>> >> Or, better, in books or articles?
>> >>
>> >> =20
>> >=20
>> > Usually the interest is in limiting length of repeats at any point
>> > in the data, not just the start and end.=A0 After all, it's not going
>> > to help anyone if the start and end of the word are limited to no
>> > more than 2 consecutive ones if you have a run of 10 ones in the
>> > middle of the word.
>> >=20
>> > It is also common to want to combine other encoding features at the
>> > same time.=A0 Limiting the lengths of both zeros and ones means you
>> > have transitions, which are needed for clock recovery and
>> > synchronisation. For many transmissions, approximately 0 DC balance
>> > is highly desirable. And you may also want forward error correction
>> > or at least error detection. =20
>>=20
>> AFAIR an 8->10 encoding is used for all gigabit ethernet links, for=20
>> exactly those purposes.
>>=20
>> Terje
>>=20
>>=20
>
>In 1GbE world 8B10B is used only for one of several available variants
>of MAC-to-PHY interface (SGMII) and by optical media (all sub-variants
>of 1000BaseX).
>
>The dominant 1GbE media standard, 1000BASE-T, does not use 8B10B. It
>applies much more complicated 8bit to 12-bit mapping with multiple
>mappings corresponding to each input octet.

And 10Gbe uses 64b/66b which just requires a two-bit prefix that
can only have the values 0b01 or 0b10.

PCIe 3.0 uses 128b/130b.

Re: Bit sequences with bounded number of continuous 1s.

<20231207173901.000044bd@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35529&group=comp.arch#35529

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Thu, 7 Dec 2023 17:39:01 +0200
Organization: A noiseless patient Spider
Lines: 101
Message-ID: <20231207173901.000044bd@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<ukpo46$ok6v$1@dont-email.me>
<20231206154145.000016a4@yahoo.com>
<ukq553$qnva$1@dont-email.me>
<20231206194307.00003a7a@yahoo.com>
<uksc75$18jkf$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="16c970a4c64ab584198b3a264c6cfa20";
logging-data="1385180"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18tvdnX26l0JT6fJ6QzWJJ5lsl2EXSj/XU="
Cancel-Lock: sha1:mpW5Xa7EqRvPhmNBPMCi4Szfn4o=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Thu, 7 Dec 2023 15:39 UTC

On Thu, 7 Dec 2023 13:01:40 +0100
David Brown <david.brown@hesbynett.no> wrote:

> On 06/12/2023 18:43, Michael S wrote:
> > On Wed, 6 Dec 2023 16:48:51 +0100
> > David Brown <david.brown@hesbynett.no> wrote:
> >
> >> On 06/12/2023 14:41, Michael S wrote:
> >>> On Wed, 6 Dec 2023 13:06:29 +0100
> >>> David Brown <david.brown@hesbynett.no> wrote:
> >>>
>
> >
> > You still refused to read my original post without skipping words.
> > In particular you obviously did not read this part "and at most
> > M=L+T continuous inner '1' bits".
>
> I did not "refuse" to do anything, but now that you have pointed this
> out, I see I misread a couple of things from the first post. Sorry
> for my confusion there, leading to some unnecessary concerns in my
> post as you had already covered the points.
>
> >>
> >>> Anyway, when somewhat higher density of transitions is desirable,
> >>> say, one transition (i.e. logical 1) per 25 or 30 bits, it's
> >>> absolutely trivial to add at no additional coding cost since 76735
> >>> is sufficiently large than 65736.
> >>
> >> Certainly you can make sure you have enough ones in the packet. (I
> >> would not say it is "trivial" until you have found a simple
> >> algorithm for coding and decoding to match your requirements - but
> >> it is unlikely to make the problem any harder.)
> >>
> >
> > In the mean time I found an algorithm that works. Sort of.
> > Modification that causes it to generate no more than 5 leading zeros
> > and no more than 16 trailing zeros per extended word is indeed
> > trivial.
> >
> > I am not fully satisfied with this algorithm for two reasons:
> > A. It's sort of ad hoc. It works for 16->17, but I am not sure that
> > similar algorithms are going to work for, say 24->25 or 28->29.
> > B. The hardware implementation (in FPGA) is not as compact as I
> > would like. There are too many non-trivial constants involved and
> > non-trivial constants consume LEs.
> >
>
> You are trying to implement this in a small number of LE's in an
> FPGA, right? I'm guessing you are looking at a linear shift register
> arrangement of some kind, since they are amenable to compact FPGA
> implementations (unlike a big lookup table - which would give a
> simple solution if you could afford the space).
>

LFSR that does the work would be great. But I don't think that it is
even remotely possible. Or, may be, I am not enough of Galois
algebraist to think about how is it possible.
So far all my approaches were either logical (chained symbol
substitution) or arithmetic (conversion to numeric systems with
non-power-of-two base). In specific case of L+T=4 all algorithms that
I tried so far were of later variety.
FPGAs, except of *very* old ones, are actually quite good at addition
and subtraction as long as your adder is not too wide. 16-20 bits are
always o.k.

> But it does all sound somewhat arbitrary here. Why do you pick these
> specific numbers? It seems you are not concerned about errors other
> than synchronisation (presumably you handle any required error
> checking or correcting at a higher level). Since you have a common
> clock source, are we talking about a pair of FPGAs on the same board
> here?
>

Different boards in the same relatively small rack.
Transmitter size is not important, because there is only one
transmitter per (data acquisition) board.
Receiver size is relatively important, because there are many receivers
per (concentrator) board.

> What sort of packet sizes are you using, and how critical is the
> bandwidth utilisation? Could you not massively simplify it by
> picking M as 16? Each 16 bits of data could be transferred as a
> frame consisting of a zero followed by the original data.
> Inter-packet gaps will be all ones, and you need a minimum gap of 17
> ones before a new packet. Compared to your earlier suggestions this
> would increase the minimum gap time, but would make the
> implementation trivial.
>
>
> Of course if this is all motivated more by interest and the fun and
> challenge of trying to find such algorithms, rather than practical
> necessity, then I'll stop trying to look at the practicalities and
> try to think more about the coding patterns. Fun and interest is
> always a good reason.
>

Most of your above questions/suggestions answered above in my reply to
BGB.

Re: Bit sequences with bounded number of continuous 1s.

<uksq3m$1amse$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35530&group=comp.arch#35530

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.chmurka.net!nntp.terraraq.uk!news.gegeweb.eu!gegeweb.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: robfi680@gmail.com (Robert Finch)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Thu, 7 Dec 2023 10:58:45 -0500
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <uksq3m$1amse$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com> <ukqru8$ueqe$1@dont-email.me>
<RplcN.2$xHn7.1@fx14.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 7 Dec 2023 15:58:46 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f08aebda0808298d41bc4fa38271ff10";
logging-data="1399694"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/9XUmig3Yk4rpfd7rtIZ9Z4PXRXx2obbM="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:w77FjmiFJH7ciSweLsASv00GJLU=
Content-Language: en-US
In-Reply-To: <RplcN.2$xHn7.1@fx14.iad>

by: Robert Finch - Thu, 7 Dec 2023 15:58 UTC

On 2023-12-07 10:21 a.m., Scott Lurndal wrote:
> BGB <cr88192@gmail.com> writes:
>> On 12/6/2023 4:30 AM, Michael S wrote:
>
>>> May be, somebody had seen such methods in patents, preferably old and
>>> outdated?
>>> Or, better, in books or articles?
>>>
>>
>> For hardware, 8b/10b looks like a more sane option.
>
> Although that has 25% overhead.
>
> Using 64b/66b reduces the overhead to 3.125%.
>
I used 14b/12b for my own computer bus as 14-bits is the maximum size of
a shift register in an FPGA DDR output block. The bus used three
simultaneous channels like HDMI. The transceivers are modified HDMI
cores. IIRC it transferred data in 36-bit packets. Got the bus spec’d
out and the cores written, but never put into use. I was going to use it
to interconnect multiple FPGA boards at high speed.

Re: Bit sequences with bounded number of continuous 1s.

<uksuag$1b4l5$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35532&group=comp.arch#35532

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: sfuld@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Thu, 7 Dec 2023 09:10:40 -0800
Organization: A noiseless patient Spider
Lines: 94
Message-ID: <uksuag$1b4l5$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com> <ukpo46$ok6v$1@dont-email.me>
<ukqnfb$tjh5$1@dont-email.me> <20231207021135.00006e20@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 7 Dec 2023 17:10:40 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0103faf3846b22413985cb00f3890093";
logging-data="1413797"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+iJ/3EUW+AKBfM98/v2zj6JBYv9FNK33s="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:OPqI2+FgW2FMFaNvIFeIjP/Ty/4=
Content-Language: en-US
In-Reply-To: <20231207021135.00006e20@yahoo.com>

by: Stephen Fuld - Thu, 7 Dec 2023 17:10 UTC

On 12/6/2023 4:11 PM, Michael S wrote:
> On Wed, 6 Dec 2023 22:01:31 +0100
> Terje Mathisen <terje.mathisen@tmsw.no> wrote:
>
>> David Brown wrote:
>>> On 06/12/2023 11:30, Michael S wrote:
>>>> Bit sequences with bounded number of continuous '1' bits are
>>>> useful for serialization of packetized data.
>>>>
>>>> For example, if we have a way of converting arbitrary n-bit words
>>>> to unique (n+1)-bit extended words with at most L continuous
>>>> leading '1' bits, at most T continuous trailing '1' bits and at
>>>> most M=L+T continuous inner '1' bits then we can transfer
>>>> multi-word packets starting with a header consisting of M+1 '1'
>>>> bits followed by a single '0' bit and then by desired number of
>>>> extended words. Any gaps between packets are filled with '1' bits.
>>>> Receiver can easily synchronize itself to incoming packets and will
>>>> quickly and robustly recovers after media errors.
>>>> Such conversions have are other application as well. They always
>>>> have.
>>>>
>>>> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
>>>> particular practical interest.
>>>>
>>>> Brute force counting proves that both for (L=1,T=3) and for
>>>> (L=2,T=2) there exist more than 2**16 extended words with desired
>>>> properties. For those interested, 76735 and 83665 extended words,
>>>> respectively. So, desired 16b-to-17b conversions certainly exist
>>>> for M=4. The problem is that I know of no simple "algorithmic" way
>>>> of implementing such conversion short of using big look-up tables.
>>>> Neither in software nor in simple compact hardware.
>>>> That is in strong contrast to case of M=5 (both L=1,T=4 and
>>>> L=2,T=3) where I know many such algorithms.
>>>>
>>>> Any ideas?
>>>> May be, somebody had seen such methods in patents, preferably old
>>>> and outdated?
>>>> Or, better, in books or articles?
>>>>
>>>>
>>>
>>> Usually the interest is in limiting length of repeats at any point
>>> in the data, not just the start and end. After all, it's not going
>>> to help anyone if the start and end of the word are limited to no
>>> more than 2 consecutive ones if you have a run of 10 ones in the
>>> middle of the word.
>>>
>>> It is also common to want to combine other encoding features at the
>>> same time. Limiting the lengths of both zeros and ones means you
>>> have transitions, which are needed for clock recovery and
>>> synchronisation. For many transmissions, approximately 0 DC balance
>>> is highly desirable. And you may also want forward error correction
>>> or at least error detection.
>>
>> AFAIR an 8->10 encoding is used for all gigabit ethernet links, for
>> exactly those purposes.
>>
>> Terje
>>
>>
>
> In 1GbE world 8B10B is used only for one of several available variants
> of MAC-to-PHY interface (SGMII) and by optical media (all sub-variants
> of 1000BaseX).
>
> The dominant 1GbE media standard, 1000BASE-T, does not use 8B10B. It
> applies much more complicated 8bit to 12-bit mapping with multiple
> mappings corresponding to each input octet.
> Strictly speaking, classic 8B10B also has multiple (2) mappings for
> some of input octets, but by comparison it is very simple.
>
> If we are going to believe Wikipedia, another extant copper 1GbE
> variant, 1000BASE-T1, uses 80->81 bit code.

8B/10B is/was also used for lots of serial interconnets, including fibre
channel, ESCON, Infiniband, etc.

But also according to Wikipedia, for 10Gb Ethernet, it has been replaced
by 64B/66B, which is obviously much more space efficient.

https://en.wikipedia.org/wiki/64b/66b_encoding

But the part potentially relevant to the OP is that it uses an algorithm
called "scrambling" to address the encoding/decoding problem, as a
look-up table is obviously impractical. Perhaps the scrambling
algorithm could be adapted for a 16B/17B application. The article
includes links to descriptions, etc.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Bit sequences with bounded number of continuous 1s.

<20231207200356.0000490b@yahoo.com>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35533&group=comp.arch#35533

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Thu, 7 Dec 2023 20:03:56 +0200
Organization: A noiseless patient Spider
Lines: 15
Message-ID: <20231207200356.0000490b@yahoo.com>
References: <20231206123011.000031e8@yahoo.com>
<ukpo46$ok6v$1@dont-email.me>
<ukqnfb$tjh5$1@dont-email.me>
<20231207021135.00006e20@yahoo.com>
<uksuag$1b4l5$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="16c970a4c64ab584198b3a264c6cfa20";
logging-data="1385180"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+6LWktOrqCq+bbLPZw04qkk2JlC63DxXw="
Cancel-Lock: sha1:6BYm6ukAB2fHmgKYIvqJOqt37jA=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

by: Michael S - Thu, 7 Dec 2023 18:03 UTC

On Thu, 7 Dec 2023 09:10:40 -0800
Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
>
> But the part potentially relevant to the OP is that it uses an
> algorithm called "scrambling" to address the encoding/decoding
> problem, as a look-up table is obviously impractical. Perhaps the
> scrambling algorithm could be adapted for a 16B/17B application. The
> article includes links to descriptions, etc.
>

I don't believe that scrambling could be useful.
Scrambling is great statistically, much less great when you want to
guarantee any particular property with absolute certainty.

Re: Bit sequences with bounded number of continuous 1s.

<R5ocN.3618$83n7.1932@fx18.iad>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35534&group=comp.arch#35534

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.samoylyk.net!usenet.goja.nl.eu.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx18.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Bit sequences with bounded number of continuous 1s.
Newsgroups: comp.arch
References: <20231206123011.000031e8@yahoo.com> <ukpo46$ok6v$1@dont-email.me> <ukqnfb$tjh5$1@dont-email.me> <20231207021135.00006e20@yahoo.com> <uksuag$1b4l5$1@dont-email.me> <20231207200356.0000490b@yahoo.com>
Lines: 19
Message-ID: <R5ocN.3618$83n7.1932@fx18.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Thu, 07 Dec 2023 18:24:49 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Thu, 07 Dec 2023 18:24:49 GMT
X-Received-Bytes: 1517

by: Scott Lurndal - Thu, 7 Dec 2023 18:24 UTC

Michael S <already5chosen@yahoo.com> writes:
>On Thu, 7 Dec 2023 09:10:40 -0800
>Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
>>
>> But the part potentially relevant to the OP is that it uses an
>> algorithm called "scrambling" to address the encoding/decoding
>> problem, as a look-up table is obviously impractical. Perhaps the
>> scrambling algorithm could be adapted for a 16B/17B application. The
>> article includes links to descriptions, etc.
>>
>
>I don't believe that scrambling could be useful.

10/100G ethernet seems to disagree with that characterization.

Generally scrambled using a LFSR.

100G/200G/400G use PAM-4 (four voltage levels).

Pages:12

server_pubkey.txt

rocksolid light 0.9.8
clearnet tor