Message-ID:

A Linux machine! because a 486 is a terrible thing to waste! (By jjs@wintermute.ucr.edu, Joe Sloan)

devel / comp.arch / Re: Bit sequences with bounded number of continuous 1s.

On Thu, 07 Dec 2023 18:24:49 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:

> Michael S <already5chosen@yahoo.com> writes:
> >On Thu, 7 Dec 2023 09:10:40 -0800
> >Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
> >>
> >> But the part potentially relevant to the OP is that it uses an
> >> algorithm called "scrambling" to address the encoding/decoding
> >> problem, as a look-up table is obviously impractical. Perhaps the
> >> scrambling algorithm could be adapted for a 16B/17B application.
> >> The article includes links to descriptions, etc.
> >>
> >
> >I don't believe that scrambling could be useful.
>
> 10/100G ethernet seems to disagree with that characterization.
>

O.k. Personally for you I'd spell it differently: I don't believe that
scrambling could be useful for generation of 16-bit to 17-bit code words
with strict bound on the number of consecutive '1' bits.

> Generally scrambled using a LFSR.
>
> 100G/200G/400G use PAM-4 (four voltage levels).
>

On 07/12/2023 16:39, Michael S wrote:
> On Thu, 7 Dec 2023 13:01:40 +0100
> David Brown <david.brown@hesbynett.no> wrote:
>
>> On 06/12/2023 18:43, Michael S wrote:
>>> On Wed, 6 Dec 2023 16:48:51 +0100
>>> David Brown <david.brown@hesbynett.no> wrote:
>>>
>>>> On 06/12/2023 14:41, Michael S wrote:
>>>>> On Wed, 6 Dec 2023 13:06:29 +0100
>>>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>>
>>
>>>
>>> You still refused to read my original post without skipping words.
>>> In particular you obviously did not read this part "and at most
>>> M=L+T continuous inner '1' bits".
>>
>> I did not "refuse" to do anything, but now that you have pointed this
>> out, I see I misread a couple of things from the first post. Sorry
>> for my confusion there, leading to some unnecessary concerns in my
>> post as you had already covered the points.
>>
>>>>
>>>>> Anyway, when somewhat higher density of transitions is desirable,
>>>>> say, one transition (i.e. logical 1) per 25 or 30 bits, it's
>>>>> absolutely trivial to add at no additional coding cost since 76735
>>>>> is sufficiently large than 65736.
>>>>
>>>> Certainly you can make sure you have enough ones in the packet. (I
>>>> would not say it is "trivial" until you have found a simple
>>>> algorithm for coding and decoding to match your requirements - but
>>>> it is unlikely to make the problem any harder.)
>>>>
>>>
>>> In the mean time I found an algorithm that works. Sort of.
>>> Modification that causes it to generate no more than 5 leading zeros
>>> and no more than 16 trailing zeros per extended word is indeed
>>> trivial.
>>>
>>> I am not fully satisfied with this algorithm for two reasons:
>>> A. It's sort of ad hoc. It works for 16->17, but I am not sure that
>>> similar algorithms are going to work for, say 24->25 or 28->29.
>>> B. The hardware implementation (in FPGA) is not as compact as I
>>> would like. There are too many non-trivial constants involved and
>>> non-trivial constants consume LEs.
>>>
>>
>> You are trying to implement this in a small number of LE's in an
>> FPGA, right? I'm guessing you are looking at a linear shift register
>> arrangement of some kind, since they are amenable to compact FPGA
>> implementations (unlike a big lookup table - which would give a
>> simple solution if you could afford the space).
>>
>
> LFSR that does the work would be great. But I don't think that it is
> even remotely possible. Or, may be, I am not enough of Galois
> algebraist to think about how is it possible.

Most of the Galois Field stuff I have done is in connection with
redundancy and error correction for RAID, rather than this kind of
thing. My thought is that some kind of scrambler pattern might be able
to generate the patterns you need, or at least something close as a
first step. But I have no idea how to look for such encodings other
than by trial and error.

> So far all my approaches were either logical (chained symbol
> substitution) or arithmetic (conversion to numeric systems with
> non-power-of-two base). In specific case of L+T=4 all algorithms that
> I tried so far were of later variety.

I can see how that could work, but I would expect it to be too big for
what you want in your FPGAs?

> FPGAs, except of *very* old ones, are actually quite good at addition
> and subtraction as long as your adder is not too wide. 16-20 bits are
> always o.k.
>
>
>> But it does all sound somewhat arbitrary here. Why do you pick these
>> specific numbers? It seems you are not concerned about errors other
>> than synchronisation (presumably you handle any required error
>> checking or correcting at a higher level). Since you have a common
>> clock source, are we talking about a pair of FPGAs on the same board
>> here?
>>
>
> Different boards in the same relatively small rack.
> Transmitter size is not important, because there is only one
> transmitter per (data acquisition) board.
> Receiver size is relatively important, because there are many receivers
> per (concentrator) board.
>

Interesting - that might allow for alternative ideas. So you'd be okay
with needing something like lookup tables for the transmission encoding,
as long as reception decoding can be small? (I had been assuming that
you have more symmetrical hardware requirements.)

>
>> What sort of packet sizes are you using, and how critical is the
>> bandwidth utilisation? Could you not massively simplify it by
>> picking M as 16? Each 16 bits of data could be transferred as a
>> frame consisting of a zero followed by the original data.
>> Inter-packet gaps will be all ones, and you need a minimum gap of 17
>> ones before a new packet. Compared to your earlier suggestions this
>> would increase the minimum gap time, but would make the
>> implementation trivial.
>>
>>
>> Of course if this is all motivated more by interest and the fun and
>> challenge of trying to find such algorithms, rather than practical
>> necessity, then I'll stop trying to look at the practicalities and
>> try to think more about the coding patterns. Fun and interest is
>> always a good reason.
>>
>
> Most of your above questions/suggestions answered above in my reply to
> BGB.
>

OK.

Michael S <already5chosen@yahoo.com> writes:

> On Wed, 06 Dec 2023 19:28:15 -0800
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>
>> Michael S <already5chosen@yahoo.com> writes:
>>
>>> On Wed, 06 Dec 2023 09:02:49 -0800
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>>>
>>>> Michael S <already5chosen@yahoo.com> writes:
>>>>
>>>>> Bit sequences with bounded number of continuous '1' bits are
>>>>> useful for serialization of packetized data.
>>>>>
>>>>> For example, if we have a way of converting arbitrary n-bit words
>>>>> to unique (n+1)-bit extended words with at most L continuous
>>>>> leading '1' bits, at most T continuous trailing '1' bits and at
>>>>> most M=L+T continuous inner '1' bits then we can transfer
>>>>> multi-word packets starting with a header consisting of M+1 '1'
>>>>> bits followed by a single '0' bit and then by desired number of
>>>>> extended words. Any gaps between packets are filled with '1'
>>>>> bits. Receiver can easily synchronize itself to incoming packets
>>>>> and will quickly and robustly recovers after media errors.
>>>>> Such conversions have are other application as well. They always
>>>>> have.
>>>>>
>>>>> For me, a case of n=16 (16-bit words, 17-bit extended words) is of
>>>>> particular practical interest.
>>>>>
>>>>> Brute force counting proves that both for (L=1,T=3) and for
>>>>> (L=2,T=2) there exist more than 2**16 extended words with desired
>>>>> properties. For those interested, 76735 and 83665 extended words,
>>>>> respectively. So, desired 16b-to-17b conversions certainly exist
>>>>> for M=4. The problem is that I know of no simple "algorithmic"
>>>>> way of implementing such conversion short of using big look-up
>>>>> tables. Neither in software nor in simple compact hardware.
>>>>> That is in strong contrast to case of M=5 (both L=1,T=4 and
>>>>> L=2,T=3) where I know many such algorithms.
>>>>>
>>>>> Any ideas?
>>>>
>>>> Please excuse my not following.. I don't quite see what you're
>>>> getting at here. I get that you want to transform an arbitrary
>>>> 16-bit value into a 17-bit value that satisfies a certain
>>>> property (such that the 16-bit value can be reconstructed), but I
>>>> don't get what the property is exactly or how a 17-bit value is
>>>> transformed back into a 16-bit value. I do realize that what
>>>> you're looking for is a nice way of turning a 16-bit value into a
>>>> 17-bit value (assuming I'm not completely confused, which is
>>>> always a possibility). Can you spell it out for me in a bit more
>>>> detail?
>>>
>>> I thought it was rather obvious but rather obviously it was not :(
>>> So let's do it again.
>>>
>>> 1. Reversibility.
>>> No two 16-bit input patterns correspond to same 17-bit output
>>> pattern. 2. Leading '1' bits.
>>> Output patterns that have L+1 or more consecutive '1' MS bits are
>>> illegal.
>>> 3. Trailing '1' bits.
>>> Output patterns that have T+1 or more consecutive '1' LS bits are
>>> illegal.
>>> 4. Inner '1' bits.
>>> Output patterns that have L+T+1 or more consecutive '1' bits
>>> starting at any position are illegal.
>>>
>>> (L,T) = either (1,3) or (2,2).
>>
>> Okay, I get it now.
>>
>> Here is an idea, such as it is. Clunky for software, might
>> be okay for hardware.
>>
>> Subdivide the input space by looking at the top six bits. There
>> are three ranges (with (L,T) = (2,2))
>>
>> 000000... to 100011... maps to 0xxxxx...
>> 100100... to 110110... maps to 10xxxx...
>> 110111... to 111111... maps to 110xxx...
>>
>> which are
>>
>> 36864 values of 47338 code points (77.87 percent)
>> 19456 values of 24079 code points (80.80 percent)
>> 9216 values of 12248 code points (75.24 percent)
>> ===== =====
>> 65536 83665
>>
>> Continue subdividing recursively, choosing nice value ranges in
>> binary to correspond to subdivisions in the output space. After
>> the first level of dividing, subdivisions have five output ranges
>> to choose from rather than just three (until we reach the end, of
>> course). For example, values in the range
>>
>> [ 100000.. to 100100.. ) could be mapped to 01110xxx and 011110xxx
>>
>> (4096 values of 3169 + 1612 = 4781 code points, 85.67 percent)
>>
>> All the above just playing around by hand after the initial idea
>> of subdividing based on a small number of leading bits.
>>
>> Not sure if this approach is good enough for you. I admit it's
>> not pretty. My sense is it's plausible, even if rather ugly.
>> And who knows, it may spark a different idea or a better approach.
>
> I don't understand a "recursively" part.
> May be, more detailed description of the second stage will help.

Probably it won't, but here goes. :)

First level decomposition (values on left, codes on the right). In all
cases values are the lower end of a range.

0 000 00x xxx xxx xxx 0................
1 001 00x xxx xxx xxx 10...............
1 101 11x xxx xxx xxx 110..............

We give an example of decomposing the third range. The third range has
14 code bits remaining, which gives 12248 codes available for use.
The 12248 codes are subdivided as follows

0............. 6230
10............ 3169
110........... 1612
1110.......... 820
11110......... 417

Second level decomposition of the third range above. First we use the
last two prefixes to take care of the odd-shaped portion up to a value
point of 1 110 000 000 000 000

1 101 11x xxx xxx xxx 1024 values with 1237 = 820+417 codes
0 000 xxx xxx 110 1110... 704 values of 820 codes
1 011 xxx xxx 110 11110.. 320 values of 417 codes

(1 110 000 000 000 000 boundary)

After nibbling off the odd-shaped portion, we are left with 8192 values:

1 11x xxx xxx xxx xxx

We subdivide these 8192 values among the first three prefixes

1 11x xxx xxx xxx xxx
0 xxx xxx xxx xxx 110 0...... 4096+512 = 4608 : 6230 codes
1 001 xxx xxx xxx 110 10..... 2048+512 = 2560 : 3169
1 11x xxx xxx xxx 110 110.... 1024 = 1024 : 1622

(10 000 000 000 000 000 boundary)

The principle being used here is to subdivide the value range along
lines that are determined by high order bit positions, chosen so that
each subrange maps onto a code point range that is roughly 4/3 the size
of the value range being mapped.

> The key question I don't understand is whether boundaries used at
> the stage n+1 depend on the output bits produced at stage n or
> boundaries applied to residue of the input word at given stage are
> always the same.
> If the former, it seems that we end up with the same 64 Kword table
> size as in direct approach.

There is dependence but it's of a different kind. What happens
at stage n+1 depends on the upstream value bits. The tests
are done at fixed bit positions, and the results combined by
logic so there is a lot of sharing. I think the growth is
more like linear than exponential.

> If the later, it can be nice. But for that to work at each stage you
> probably want to subdivide to sub-spaces with approximately equal
> numbers of code points.

My approach is to divide the value space into ranges whose sizes
are roughly proportional to the set of code points for the code
prefix used for that set of values.

> So, may be, 01x, 01x and 1xx at first step?
> Or just 0x and 1x, but then it degenerates into my current
> "conversion to sub-binary non-constant base" algorithm.
> And for my current algorithm I know for sure that "nice value ranges"
> as you put it, can be used only at 5 initial stages. "Non nice" ranges
> are required at stage 6 and later.

Probably we mean different things by "nice value range". The approach
I've been describing is not iterative, it's cascading logic.

Click here to read the complete article

On Fri, 08 Dec 2023 22:10:06 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

> Michael S <already5chosen@yahoo.com> writes:
>
> > On Wed, 06 Dec 2023 19:28:15 -0800
> > Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> >
> >> Michael S <already5chosen@yahoo.com> writes:
> >>
> >>> On Wed, 06 Dec 2023 09:02:49 -0800
> >>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> >>>
> >>>> Michael S <already5chosen@yahoo.com> writes:
> >>>>
> >>>>> Bit sequences with bounded number of continuous '1' bits are
> >>>>> useful for serialization of packetized data.
> >>>>>
> >>>>> For example, if we have a way of converting arbitrary n-bit
> >>>>> words to unique (n+1)-bit extended words with at most L
> >>>>> continuous leading '1' bits, at most T continuous trailing '1'
> >>>>> bits and at most M=L+T continuous inner '1' bits then we can
> >>>>> transfer multi-word packets starting with a header consisting
> >>>>> of M+1 '1' bits followed by a single '0' bit and then by
> >>>>> desired number of extended words. Any gaps between packets are
> >>>>> filled with '1' bits. Receiver can easily synchronize itself
> >>>>> to incoming packets and will quickly and robustly recovers
> >>>>> after media errors. Such conversions have are other application
> >>>>> as well. They always have.
> >>>>>
> >>>>> For me, a case of n=16 (16-bit words, 17-bit extended words) is
> >>>>> of particular practical interest.
> >>>>>
> >>>>> Brute force counting proves that both for (L=1,T=3) and for
> >>>>> (L=2,T=2) there exist more than 2**16 extended words with
> >>>>> desired properties. For those interested, 76735 and 83665
> >>>>> extended words, respectively. So, desired 16b-to-17b
> >>>>> conversions certainly exist for M=4. The problem is that I
> >>>>> know of no simple "algorithmic" way of implementing such
> >>>>> conversion short of using big look-up tables. Neither in
> >>>>> software nor in simple compact hardware. That is in strong
> >>>>> contrast to case of M=5 (both L=1,T=4 and L=2,T=3) where I know
> >>>>> many such algorithms.
> >>>>>
> >>>>> Any ideas?
> >>>>
> >>>> Please excuse my not following.. I don't quite see what you're
> >>>> getting at here. I get that you want to transform an arbitrary
> >>>> 16-bit value into a 17-bit value that satisfies a certain
> >>>> property (such that the 16-bit value can be reconstructed), but I
> >>>> don't get what the property is exactly or how a 17-bit value is
> >>>> transformed back into a 16-bit value. I do realize that what
> >>>> you're looking for is a nice way of turning a 16-bit value into a
> >>>> 17-bit value (assuming I'm not completely confused, which is
> >>>> always a possibility). Can you spell it out for me in a bit more
> >>>> detail?
> >>>
> >>> I thought it was rather obvious but rather obviously it was not :(
> >>> So let's do it again.
> >>>
> >>> 1. Reversibility.
> >>> No two 16-bit input patterns correspond to same 17-bit output
> >>> pattern. 2. Leading '1' bits.
> >>> Output patterns that have L+1 or more consecutive '1' MS bits are
> >>> illegal.
> >>> 3. Trailing '1' bits.
> >>> Output patterns that have T+1 or more consecutive '1' LS bits are
> >>> illegal.
> >>> 4. Inner '1' bits.
> >>> Output patterns that have L+T+1 or more consecutive '1' bits
> >>> starting at any position are illegal.
> >>>
> >>> (L,T) = either (1,3) or (2,2).
> >>
> >> Okay, I get it now.
> >>
> >> Here is an idea, such as it is. Clunky for software, might
> >> be okay for hardware.
> >>
> >> Subdivide the input space by looking at the top six bits. There
> >> are three ranges (with (L,T) = (2,2))
> >>
> >> 000000... to 100011... maps to 0xxxxx...
> >> 100100... to 110110... maps to 10xxxx...
> >> 110111... to 111111... maps to 110xxx...
> >>
> >> which are
> >>
> >> 36864 values of 47338 code points (77.87 percent)
> >> 19456 values of 24079 code points (80.80 percent)
> >> 9216 values of 12248 code points (75.24 percent)
> >> ===== =====
> >> 65536 83665
> >>
> >> Continue subdividing recursively, choosing nice value ranges in
> >> binary to correspond to subdivisions in the output space. After
> >> the first level of dividing, subdivisions have five output ranges
> >> to choose from rather than just three (until we reach the end, of
> >> course). For example, values in the range
> >>
> >> [ 100000.. to 100100.. ) could be mapped to 01110xxx and
> >> 011110xxx
> >>
> >> (4096 values of 3169 + 1612 = 4781 code points, 85.67 percent)
> >>
> >> All the above just playing around by hand after the initial idea
> >> of subdividing based on a small number of leading bits.
> >>
> >> Not sure if this approach is good enough for you. I admit it's
> >> not pretty. My sense is it's plausible, even if rather ugly.
> >> And who knows, it may spark a different idea or a better approach.
> >>
> >
> > I don't understand a "recursively" part.
> > May be, more detailed description of the second stage will help.
>
> Probably it won't, but here goes. :)
>
> First level decomposition (values on left, codes on the right). In
> all cases values are the lower end of a range.
>
> 0 000 00x xxx xxx xxx 0................
> 1 001 00x xxx xxx xxx 10...............
> 1 101 11x xxx xxx xxx 110..............
>
> We give an example of decomposing the third range. The third range
> has 14 code bits remaining, which gives 12248 codes available for use.
> The 12248 codes are subdivided as follows
>
> 0............. 6230
> 10............ 3169
> 110........... 1612
> 1110.......... 820
> 11110......... 417
>
> Second level decomposition of the third range above. First we use the
> last two prefixes to take care of the odd-shaped portion up to a value
> point of 1 110 000 000 000 000
>
> 1 101 11x xxx xxx xxx 1024 values with 1237 = 820+417 codes
> 0 000 xxx xxx 110 1110... 704 values of 820 codes
> 1 011 xxx xxx 110 11110.. 320 values of 417 codes
>
> (1 110 000 000 000 000 boundary)
>
> After nibbling off the odd-shaped portion, we are left with 8192
> values:
>
> 1 11x xxx xxx xxx xxx
>
> We subdivide these 8192 values among the first three prefixes
>
> 1 11x xxx xxx xxx xxx
> 0 xxx xxx xxx xxx 110 0...... 4096+512 = 4608 : 6230 codes
> 1 001 xxx xxx xxx 110 10..... 2048+512 = 2560 : 3169
> 1 11x xxx xxx xxx 110 110.... 1024 = 1024 : 1622
>
> (10 000 000 000 000 000 boundary)
>
> The principle being used here is to subdivide the value range along
> lines that are determined by high order bit positions, chosen so that
> each subrange maps onto a code point range that is roughly 4/3 the
> size of the value range being mapped.
>
> > The key question I don't understand is whether boundaries used at
> > the stage n+1 depend on the output bits produced at stage n or
> > boundaries applied to residue of the input word at given stage are
> > always the same.
> > If the former, it seems that we end up with the same 64 Kword table
> > size as in direct approach.
>
> There is dependence but it's of a different kind. What happens
> at stage n+1 depends on the upstream value bits. The tests
> are done at fixed bit positions, and the results combined by
> logic so there is a lot of sharing. I think the growth is
> more like linear than exponential.
>
> > If the later, it can be nice. But for that to work at each stage
> > you probably want to subdivide to sub-spaces with approximately
> > equal numbers of code points.
>
> My approach is to divide the value space into ranges whose sizes
> are roughly proportional to the set of code points for the code
> prefix used for that set of values.
>
> > So, may be, 01x, 01x and 1xx at first step?
> > Or just 0x and 1x, but then it degenerates into my current
> > "conversion to sub-binary non-constant base" algorithm.
> > And for my current algorithm I know for sure that "nice value
> > ranges" as you put it, can be used only at 5 initial stages. "Non
> > nice" ranges are required at stage 6 and later.
>
> Probably we mean different things by "nice value range". The approach
> I've been describing is not iterative, it's cascading logic.
>

Click here to read the complete article

On Fri, 8 Dec 2023 14:28:37 +0100
David Brown <david.brown@hesbynett.no> wrote:

> On 07/12/2023 16:39, Michael S wrote:
> > On Thu, 7 Dec 2023 13:01:40 +0100
> > David Brown <david.brown@hesbynett.no> wrote:
> >
> >> On 06/12/2023 18:43, Michael S wrote:
> >>> On Wed, 6 Dec 2023 16:48:51 +0100
> >>> David Brown <david.brown@hesbynett.no> wrote:
> >>>
> >>>> On 06/12/2023 14:41, Michael S wrote:
> >>>>> On Wed, 6 Dec 2023 13:06:29 +0100
> >>>>> David Brown <david.brown@hesbynett.no> wrote:
> >>>>>
> >>
> >>>
> >>> You still refused to read my original post without skipping words.
> >>> In particular you obviously did not read this part "and at most
> >>> M=L+T continuous inner '1' bits".
> >>
> >> I did not "refuse" to do anything, but now that you have pointed
> >> this out, I see I misread a couple of things from the first post.
> >> Sorry for my confusion there, leading to some unnecessary concerns
> >> in my post as you had already covered the points.
> >>
> >>>>
> >>>>> Anyway, when somewhat higher density of transitions is
> >>>>> desirable, say, one transition (i.e. logical 1) per 25 or 30
> >>>>> bits, it's absolutely trivial to add at no additional coding
> >>>>> cost since 76735 is sufficiently large than 65736.
> >>>>
> >>>> Certainly you can make sure you have enough ones in the packet.
> >>>> (I would not say it is "trivial" until you have found a simple
> >>>> algorithm for coding and decoding to match your requirements -
> >>>> but it is unlikely to make the problem any harder.)
> >>>>
> >>>
> >>> In the mean time I found an algorithm that works. Sort of.
> >>> Modification that causes it to generate no more than 5 leading
> >>> zeros and no more than 16 trailing zeros per extended word is
> >>> indeed trivial.
> >>>
> >>> I am not fully satisfied with this algorithm for two reasons:
> >>> A. It's sort of ad hoc. It works for 16->17, but I am not sure
> >>> that similar algorithms are going to work for, say 24->25 or
> >>> 28->29. B. The hardware implementation (in FPGA) is not as
> >>> compact as I would like. There are too many non-trivial constants
> >>> involved and non-trivial constants consume LEs.
> >>>
> >>
> >> You are trying to implement this in a small number of LE's in an
> >> FPGA, right? I'm guessing you are looking at a linear shift
> >> register arrangement of some kind, since they are amenable to
> >> compact FPGA implementations (unlike a big lookup table - which
> >> would give a simple solution if you could afford the space).
> >>
> >
> > LFSR that does the work would be great. But I don't think that it is
> > even remotely possible. Or, may be, I am not enough of Galois
> > algebraist to think about how is it possible.
>
> Most of the Galois Field stuff I have done is in connection with
> redundancy and error correction for RAID, rather than this kind of
> thing. My thought is that some kind of scrambler pattern might be
> able to generate the patterns you need, or at least something close
> as a first step. But I have no idea how to look for such encodings
> other than by trial and error.
>

And suppose that by trial and error I found a polynomial that does a
decent job. Now how I am going to do a reverse transform? I mean,
without having algebraic understanding? It sounds like much harder task
than finding a scrambler :(

> > So far all my approaches were either logical (chained symbol
> > substitution) or arithmetic (conversion to numeric systems with
> > non-power-of-two base). In specific case of L+T=4 all algorithms
> > that I tried so far were of later variety.
>
> I can see how that could work, but I would expect it to be too big
> for what you want in your FPGAs?
>

A logical cores of circuits that do such transforms at rate one bit
per clock are rather small. Converter from binary to sub-binary base
is very similar to integer divider and reverse converter is very
similar to integer multiplier. Both of them belong to class of
shift-and-add algorithms.
The difference from DIV/MUL is that in DIV/MUL one of the arguments of
algorithm remains in register where in Numeric System transforms this
argument is delivered from look-up table. Compactness of both circuits
in FPGA depends mainly on depth and width of look up tables.

> > FPGAs, except of *very* old ones, are actually quite good at
> > addition and subtraction as long as your adder is not too wide.
> > 16-20 bits are always o.k.
> >
> >
> >> But it does all sound somewhat arbitrary here. Why do you pick
> >> these specific numbers? It seems you are not concerned about
> >> errors other than synchronisation (presumably you handle any
> >> required error checking or correcting at a higher level). Since
> >> you have a common clock source, are we talking about a pair of
> >> FPGAs on the same board here?
> >>
> >
> > Different boards in the same relatively small rack.
> > Transmitter size is not important, because there is only one
> > transmitter per (data acquisition) board.
> > Receiver size is relatively important, because there are many
> > receivers per (concentrator) board.
> >
>
> Interesting - that might allow for alternative ideas. So you'd be
> okay with needing something like lookup tables for the transmission
> encoding, as long as reception decoding can be small? (I had been
> assuming that you have more symmetrical hardware requirements.)
>

I guess I would be ok with that.
But by now I found a solution that is sufficiently compact on both
sides.

> >
> >> What sort of packet sizes are you using, and how critical is the
> >> bandwidth utilisation? Could you not massively simplify it by
> >> picking M as 16? Each 16 bits of data could be transferred as a
> >> frame consisting of a zero followed by the original data.
> >> Inter-packet gaps will be all ones, and you need a minimum gap of
> >> 17 ones before a new packet. Compared to your earlier suggestions
> >> this would increase the minimum gap time, but would make the
> >> implementation trivial.
> >>
> >>
> >> Of course if this is all motivated more by interest and the fun and
> >> challenge of trying to find such algorithms, rather than practical
> >> necessity, then I'll stop trying to look at the practicalities and
> >> try to think more about the coding patterns. Fun and interest is
> >> always a good reason.
> >>
> >
> > Most of your above questions/suggestions answered above in my reply
> > to BGB.
> >
>
> OK.
>

Michael S <already5chosen@yahoo.com> writes:

> On Fri, 08 Dec 2023 22:10:06 -0800
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

>> Probably we mean different things by "nice value range". The
>> approach I've been describing is not iterative, it's cascading
>> logic.
>
> May be.
> For me "nice value range" means that all values can be represented
> as M(i)*2**E(i) where min(M(i)) is small.
> At the end I looked for such nice values with brute force search.
> It turned out that there exist multiple sets that are nice enough
> for my purposes i.e. enable implementation of de-serializer that
> together in additional logic fits in 50 LEs.
> And finding it only took ~3 minutes of compute time on a single
> core of old PC.
> That's my current and likely final algorithm.
>
> static const unsigned bv_tab[17] = {
> 46, 47, 48, 49,
> 50, 51, 52, 53,
> 54, 55, 56, 56,
> 56, 64, 64, 64,
> 64, };
>
> // encode 16->17
> unsigned enc(unsigned x)
> {
> if (x < (1<<12))
> x += 0x10000; // avoid long strings of zeros
>
> // convert x to non-binary base
> unsigned y = 0;
> for (int i = 0; i < 17; ++i) {
> unsigned bv = bv_tab[i]*1024;
> y += y;
> if (x >= bv) {
> x -= bv;
> y += 1;
> }
> x += x;
> }
> return y & 0x1FFFF;
> }
>
> // decode 17->16
> unsigned dec(unsigned x)
> {
> // convert x from non-binary base to binary
> unsigned y = 0;
> for (int i = 0; i < 17; ++i) {
> y += y;
> if (x & 0x10000)
> y += bv_tab[i];
> x += x;
> }
> return (y >> 6) & 0xFFFF;
> }

This scheme certainly looks nicer than the idea I was describing.
Excellent!

The regularity of values in bv_tab[] makes it possible to get rid
of the array entirely. Here is a version of the encode function
that produces identical values to enc() above. Please ignore
minor differences in coding style; the important things to focus
on are the variables 'm' and 'd', with successive values of 'm'
taking on values of elements from the bv_tab[] array.

unsigned
encode_without_bvtab( unsigned x0 ){
unsigned x = x0 < 1<<12 ? x0 + 0x10000 : x0;
unsigned r = 0;
unsigned m = 46;
unsigned d = 011777; // note that this value is in octal

for( int i = 0; i < 17; i++, x += x, d >>= 1 ){
r <<= 1;
if( x >= m*1024 ) x -= m*1024, r |= 1;
m += (d&1) + (d==1)*7;
}

return r;
}

A similar transformation can be made to the dec() decoding function.

Re: Bit sequences with bounded number of continuous 1s.

<ul6fhm$339el$1@dont-email.me>

copy mid

https://news.novabbs.org/devel/article-flat.php?id=35674&group=comp.arch#35674

copy link Newsgroups: comp.arch

Path: i2pn2.org!i2pn.org!news.furie.org.uk!usenet.goja.nl.eu.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Bit sequences with bounded number of continuous 1s.
Date: Mon, 11 Dec 2023 08:59:49 +0100
Organization: A noiseless patient Spider
Lines: 115
Message-ID: <ul6fhm$339el$1@dont-email.me>
References: <20231206123011.000031e8@yahoo.com> <ukpo46$ok6v$1@dont-email.me>
<20231206154145.000016a4@yahoo.com> <ukq553$qnva$1@dont-email.me>
<20231206194307.00003a7a@yahoo.com> <uksc75$18jkf$1@dont-email.me>
<20231207173901.000044bd@yahoo.com> <ukv5m5$1ojs1$1@dont-email.me>
<20231210151935.00000541@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 11 Dec 2023 07:59:50 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0053e6c971fc74cbc6c57cd130b61b3a";
logging-data="3253717"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18h1jKP4vTIsAwDt0A6eLxHXgb1ImamGjI="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:XnCqYS47A7OHO/TVeIpQRbsrdCA=
In-Reply-To: <20231210151935.00000541@yahoo.com>
Content-Language: en-GB

by: David Brown - Mon, 11 Dec 2023 07:59 UTC

On 10/12/2023 14:19, Michael S wrote:
> On Fri, 8 Dec 2023 14:28:37 +0100
> David Brown <david.brown@hesbynett.no> wrote:
>
>> On 07/12/2023 16:39, Michael S wrote:
>>> On Thu, 7 Dec 2023 13:01:40 +0100
>>> David Brown <david.brown@hesbynett.no> wrote:
>>>
>>>> On 06/12/2023 18:43, Michael S wrote:
>>>>> On Wed, 6 Dec 2023 16:48:51 +0100
>>>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>>
>>>>>> On 06/12/2023 14:41, Michael S wrote:
>>>>>>> On Wed, 6 Dec 2023 13:06:29 +0100
>>>>>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>>>>
>>>>

>>>
>>> LFSR that does the work would be great. But I don't think that it is
>>> even remotely possible. Or, may be, I am not enough of Galois
>>> algebraist to think about how is it possible.
>>
>> Most of the Galois Field stuff I have done is in connection with
>> redundancy and error correction for RAID, rather than this kind of
>> thing. My thought is that some kind of scrambler pattern might be
>> able to generate the patterns you need, or at least something close
>> as a first step. But I have no idea how to look for such encodings
>> other than by trial and error.
>>
>
> And suppose that by trial and error I found a polynomial that does a
> decent job. Now how I am going to do a reverse transform? I mean,
> without having algebraic understanding? It sounds like much harder task
> than finding a scrambler :(

This is not something I have done myself, so I might be getting things
very wrong. But with a digital scrambler, you are xor'ing the data
sequence with a pseudo-random generated sequence, giving a result that
is statistically noise with very close to 50% one/zero ratio.

To get the data back, you simply scramble again with the same
pseudo-random sequence. As long as your data and pseudo-random
sequences are synchronised, it's all easy - scrambling and unscrambling
are the same thing.

At least, that's the theory as far as I understand it.

(Real data is always biased in its original form - typically far more
zeros than ones. Though if the data is encrypted or compressed, it will
be much more like unbiased noise.)

None of this helps find a scrambling algorithm or polynomial that meets
your needs, but I /think/ it means it would work well if you found such
a scrambler.

>
>>> So far all my approaches were either logical (chained symbol
>>> substitution) or arithmetic (conversion to numeric systems with
>>> non-power-of-two base). In specific case of L+T=4 all algorithms
>>> that I tried so far were of later variety.
>>
>> I can see how that could work, but I would expect it to be too big
>> for what you want in your FPGAs?
>>
>
> A logical cores of circuits that do such transforms at rate one bit
> per clock are rather small. Converter from binary to sub-binary base
> is very similar to integer divider and reverse converter is very
> similar to integer multiplier. Both of them belong to class of
> shift-and-add algorithms.

I keep thinking of division as being big and/or slow (in comparison to
multiplication, which is often supported single-cycle in hardware blocks
of FPGAs) but of course when you don't need more than one bit out per
clock, it's not bad that bad.

> The difference from DIV/MUL is that in DIV/MUL one of the arguments of
> algorithm remains in register where in Numeric System transforms this
> argument is delivered from look-up table. Compactness of both circuits
> in FPGA depends mainly on depth and width of look up tables.
>
>>> FPGAs, except of *very* old ones, are actually quite good at
>>> addition and subtraction as long as your adder is not too wide.
>>> 16-20 bits are always o.k.
>>>
>>>
>>>> But it does all sound somewhat arbitrary here. Why do you pick
>>>> these specific numbers? It seems you are not concerned about
>>>> errors other than synchronisation (presumably you handle any
>>>> required error checking or correcting at a higher level). Since
>>>> you have a common clock source, are we talking about a pair of
>>>> FPGAs on the same board here?
>>>>
>>>
>>> Different boards in the same relatively small rack.
>>> Transmitter size is not important, because there is only one
>>> transmitter per (data acquisition) board.
>>> Receiver size is relatively important, because there are many
>>> receivers per (concentrator) board.
>>>
>>
>> Interesting - that might allow for alternative ideas. So you'd be
>> okay with needing something like lookup tables for the transmission
>> encoding, as long as reception decoding can be small? (I had been
>> assuming that you have more symmetrical hardware requirements.)
>>
>
> I guess I would be ok with that.
> But by now I found a solution that is sufficiently compact on both
> sides.

Great! Everything beyond that is just for fun :-)

On Mon, 11 Dec 2023 08:59:49 +0100
David Brown <david.brown@hesbynett.no> wrote:

> On 10/12/2023 14:19, Michael S wrote:
> > On Fri, 8 Dec 2023 14:28:37 +0100
> > David Brown <david.brown@hesbynett.no> wrote:
> >
> >> On 07/12/2023 16:39, Michael S wrote:
> >>> On Thu, 7 Dec 2023 13:01:40 +0100
> >>> David Brown <david.brown@hesbynett.no> wrote:
> >>>
> >>>> On 06/12/2023 18:43, Michael S wrote:
> >>>>> On Wed, 6 Dec 2023 16:48:51 +0100
> >>>>> David Brown <david.brown@hesbynett.no> wrote:
> >>>>>
> >>>>>> On 06/12/2023 14:41, Michael S wrote:
> >>>>>>> On Wed, 6 Dec 2023 13:06:29 +0100
> >>>>>>> David Brown <david.brown@hesbynett.no> wrote:
> >>>>>>>
> >>>>
>
> >>>
> >>> LFSR that does the work would be great. But I don't think that it
> >>> is even remotely possible. Or, may be, I am not enough of Galois
> >>> algebraist to think about how is it possible.
> >>
> >> Most of the Galois Field stuff I have done is in connection with
> >> redundancy and error correction for RAID, rather than this kind of
> >> thing. My thought is that some kind of scrambler pattern might be
> >> able to generate the patterns you need, or at least something close
> >> as a first step. But I have no idea how to look for such encodings
> >> other than by trial and error.
> >>
> >
> > And suppose that by trial and error I found a polynomial that does a
> > decent job. Now how I am going to do a reverse transform? I mean,
> > without having algebraic understanding? It sounds like much harder
> > task than finding a scrambler :(
>
> This is not something I have done myself, so I might be getting
> things very wrong. But with a digital scrambler, you are xor'ing the
> data sequence with a pseudo-random generated sequence, giving a
> result that is statistically noise with very close to 50% one/zero
> ratio.
>
> To get the data back, you simply scramble again with the same
> pseudo-random sequence. As long as your data and pseudo-random
> sequences are synchronised, it's all easy - scrambling and
> unscrambling are the same thing.
>
> At least, that's the theory as far as I understand it.
>
> (Real data is always biased in its original form - typically far more
> zeros than ones. Though if the data is encrypted or compressed, it
> will be much more like unbiased noise.)
>
> None of this helps find a scrambling algorithm or polynomial that
> meets your needs, but I /think/ it means it would work well if you
> found such a scrambler.
>

Scrambler of this type is guaranteed to not work at all for my purpose.

All it achieves is making output string to look like random sequence.
And random sequence has approximately 35% probability to contains 4
consecutive '1' bits per 16-bit or 17-bit word. Way above 50%
probability for my smallest 64-bit packet.
Scrambler like thhat can not even take advantage of additional bit that
I am willing to add per every 16 bits.
It is believable that scramblers of this sort are great for reducing
probability of *long* bursts of 1s (or zeros) down to insignificance.
Say, bursts of 40 repetitions. Or, at least, of 25 repetitions. But not
of 4 or 5 or 10. Not even of 16.

> >
> >>> So far all my approaches were either logical (chained symbol
> >>> substitution) or arithmetic (conversion to numeric systems with
> >>> non-power-of-two base). In specific case of L+T=4 all algorithms
> >>> that I tried so far were of later variety.
> >>
> >> I can see how that could work, but I would expect it to be too big
> >> for what you want in your FPGAs?
> >>
> >
> > A logical cores of circuits that do such transforms at rate one bit
> > per clock are rather small. Converter from binary to sub-binary base
> > is very similar to integer divider and reverse converter is very
> > similar to integer multiplier. Both of them belong to class of
> > shift-and-add algorithms.
>
> I keep thinking of division as being big and/or slow (in comparison
> to multiplication, which is often supported single-cycle in hardware
> blocks of FPGAs) but of course when you don't need more than one bit
> out per clock, it's not bad that bad.
>

In FPGA, at 1 bit per clock and without using HW multipliers (which are
of little use for 1 bit per clock, anyway) I am not sure at all that
multiplier would be smaller than divider. More likely, divider would be
smaller, but a little (few percents) slower in terms of achievable
clock frequency.

> > The difference from DIV/MUL is that in DIV/MUL one of the arguments
> > of algorithm remains in register where in Numeric System transforms
> > this argument is delivered from look-up table. Compactness of both
> > circuits in FPGA depends mainly on depth and width of look up
> > tables.
> >>> FPGAs, except of *very* old ones, are actually quite good at
> >>> addition and subtraction as long as your adder is not too wide.
> >>> 16-20 bits are always o.k.
> >>>
> >>>
> >>>> But it does all sound somewhat arbitrary here. Why do you pick
> >>>> these specific numbers? It seems you are not concerned about
> >>>> errors other than synchronisation (presumably you handle any
> >>>> required error checking or correcting at a higher level). Since
> >>>> you have a common clock source, are we talking about a pair of
> >>>> FPGAs on the same board here?
> >>>>
> >>>
> >>> Different boards in the same relatively small rack.
> >>> Transmitter size is not important, because there is only one
> >>> transmitter per (data acquisition) board.
> >>> Receiver size is relatively important, because there are many
> >>> receivers per (concentrator) board.
> >>>
> >>
> >> Interesting - that might allow for alternative ideas. So you'd be
> >> okay with needing something like lookup tables for the transmission
> >> encoding, as long as reception decoding can be small? (I had been
> >> assuming that you have more symmetrical hardware requirements.)
> >>
> >
> > I guess I would be ok with that.
> > But by now I found a solution that is sufficiently compact on both
> > sides.
>
> Great! Everything beyond that is just for fun :-)
>
>

Subject	Author
Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	David Brown
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Terje Mathisen
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Scott Lurndal
Re: Bit sequences with bounded number of continuous 1s.	Stephen Fuld
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Scott Lurndal
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	MitchAlsup
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Tim Rentsch
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Tim Rentsch
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Tim Rentsch
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Tim Rentsch
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	BGB
Re: Bit sequences with bounded number of continuous 1s.	Michael S
Re: Bit sequences with bounded number of continuous 1s.	Scott Lurndal
Re: Bit sequences with bounded number of continuous 1s.	Robert Finch
Re: Bit sequences with bounded number of continuous 1s.	Chris M. Thomasson