Find a bit state in a long patter

First post, learning Python, not a pro developer but have done a few small project.

I have a pattern as integer 147056391 bin 1000110000111110011100000111
This are repeating every 28 days, start date 2022-01-01, user work schedule, user works on January 1,2,3, Off on January 4,5,6,7,8 working on January 9,10,11 and so on.
My current approach (working) is below, what i’m looking for is a faster or better way looking for a specific date if a further distance without looping through from 2022-01-01, like what bit position would be in day 688 or July-22-2024 ? without a long loop
Any advice please…

def findInPattern(start, end, pattern, repeat_after)
     pattern = pattern  # 1#147056391  # 2#163701519  # 3#121379064  # 4#104733936
             delta = end - start
             temp = list()
             length = int(delta.days / repeat_after)
             length += length % repeat_after
     for j in range(0, length):
             for i in range(0, repeat_after):
             if pattern & (1 << i):
                    onoff = "Work"
                    className = "blue"
                else:
                    onoff = "Off"
                    className = "green"
             temp.append(className)

      return className

would this work ? or reasonable ?

 pattern << 688
 pattern & (1 << 0)
 

This works

INT_BITS = 32
def leftRotate(n, d):
    if d > INT_BITS:
        b = int(d/INT_BITS)
        r = d % INT_BITS
        d = 32
        for i in range(0, b):
            n = (n << d | (n >> (INT_BITS-d))) & 0xFFFFFFFF
        n = (n << r | (n >> (INT_BITS-r))) & 0xFFFFFFFF
    else:
        n = (n << d | (n >> (INT_BITS-d))) & 0xFFFFFFFF
    return n

number = 145547821
e = leftRotate(number, 688)

Why are you using bit flags? That’s very obfuscated and hard to read and understand, and not very Pythonic.

If the pattern repeats every 28 days, then:

  • work out how many days from 2022-01-01 to the day you want;

  • take the remainder when divided by 28;

  • inspect the bit in that position.

So to work out the bit flag for 17 March 2022:

# Untested.
from datetime import date
days = (date(2022, 3, 17) - date(2022, 01, 01)).days
rem = days % 28
flag = (pattern >> rem) & 1
1 Like

@csaba911, your situation will be easier to work with if you post minimal code rather than a whole function definition with no sample parameters for the arguments. I tried to follow it when you first posted, but I quit because the function code adds so much more complexity beyond the problem you’re asking about. After @steven.daprano stripped it down, I now understand what you would like to do and can offer some advice.

The first part of the design is the data structure (currently all in ‘pattern’). It looks like you would like to use a dictionary of 28-day periods. (The ‘#’ symbol is usually a comment marker, so I can’t tell what it’s supposed to do in your code.)

Let’s set the dictionary aside for the moment and work on how to get a single bit value from the integer in ‘pattern’.

pattern = 147056391
dayInPeriod = 9

bits = "{:b}".format(pattern)    # 'bits' is a string of the bit values in 'pattern'
flag = bits[dayInPeriod-1]       # sets 'flag' to the value of the nth bit in 'bits', where n = 'dayInPeriod'
                                 # Note that the value in 'flag' is a string.
                                 # Note that the string index is zero-based.

If you would like a breakdown of the "{:b}".format(pattern) line, please let me know.

Now that we have code to read the bit value, we just need to index through the periods and process the target period. Since you use a 'start' and 'end' date range, Modulo is definitely the way to go here.

from datetime import date
from random import randrange

periodLen = 28
start = date(2022,1,1)  # to use a loop, move 'end' and 'dayRange' to the top of the looped code
end = date(2022,3,randrange(1,28))
dayRange = (end - start).days
dataset = {1:147056391,
           2:163701519,
           3:121379064,
           4:104733936}

periodNum = int(dayRange/periodLen)
dayNum = dayRange % periodLen
intPeriodVal = dataset[periodNum]
bulPeriodVal = "{:b}".format(intPeriodVal)
dayVal = bulPeriodVal[dayNum-1]     # subtract (1) for zero-based index

print(f"dayNum: {dayNum} is '{dayVal}'")   # 'print()' not necessary here but it lines everything up.
print("....:....|....:....|....:....|")
print(bulPeriodVal)
print((" " * (dayNum-1)) + dayVal)

You can see that the day part of the date is randomly generated. I tested it thoroughly with a loop and the block of print() statements at the end. The period lookup wasn’t directly tested but the random day-of-month rolls back and forth between two periods and the output was fine.
How well does this solution work for you?

Converting the int to a base-2 string just to extract the bits is wasteful. Converting to a string is about five times as costly as bit twiddling.

from timeit import Timer
pattern = 13528694
t1 = Timer('"{:08b}".format(pattern)[7]', 'from __main__ import pattern')
t2 = Timer('(pattern >> 7) & 1', 'from __main__ import pattern')
print(min(t1.repeat()))  # Using string indexing.
print(min(t2.repeat()))  # Using bit twiddling.

On my PC, that prints (approx) 0.318 for the string indexing, and 0.065 for the bit twiddling. So on my PC at least, the string indexing version is way slower.

Also, as you point out, extracting the bits with indexing gives you a string ‘0’ or ‘1’, but it is easy to mistakenly think that they are integer values 0 or 1 and write something like this:

flag = bits[dayInPeriod]
if flag:
    print("go to work!")
else:
    print("enjoy your day off")

but the string ‘0’ is considered a truthy value, not false. To use it as a flag, you have to convert it to an int.

1 Like

Wow. I expected the string conversion to be slower, but am surprised that it’s that much slower. Thanks for running the check.

I posted that solution because directly indexing the bit chain is about as self-evident as it gets (and is “obvious” at least in its transparency, though a fast algorithm would be “obvious” where speed is critical). The bit shift is a great approach, though definitely not self-evident unless you’re used to bit twiddling.

I assume the bit twiddle also wins against an exponential calculation using the target period’s day index (2**dayVal) followed by a BitwiseAND (&). Will check…

[EDIT] An additional time hit for strings: the string indexing is in the opposite direction to bit ascension, so a reversal is needed. Simply negating the index and subtracting ‘1’ will do it, but that’s two additional operations. Also- the binary exponent mask will need a consolidation/evaluation step once we have a single ‘1’ embedded in a set of '0’s.

Running in VS Code, I got these results:

1 1 True
 strIdx: 0.328
bitShft: 0.075
 binExp: 0.059

from this code:

pattern = 13528694
t1 = Timer('"{:08b}".format(pattern)[-5-1]', 'from __main__ import pattern')
t2 = Timer('(pattern >> 5) & 1', 'from __main__ import pattern')
t3 = Timer('(pattern & 2**5)>0', 'from __main__ import pattern')

print("{:b}".format(pattern)[-5-1],(pattern >> 5) & 1,(pattern & 2**5)>0)

print(f"string: {min(t1.repeat())}")
print(f"   bit: {min(t2.repeat())}")
print(f"binExp: {min(t3.repeat())}")

Without the ‘>’ operation, the binary exponent method is quite a bit faster (0.038). There’s probably a faster consolidation than ‘>’. One bonus to the GreaterThan is that it returns a bool, which can be natively used in the IF: clause without a separate evaluation to produce a bool.

I’d say the lesson is that speed comes at a cost of readability, as usual.

Found python handling the ajax request and the data preparation of 10 user toke .800 ms average.
Now I’m just returning the basic user info and the pattern as int and create the dataSet in vanilla JS in the browser, now I can scroll smooth with under 200ms latency.

Thanks for the input from all, many god examples.

Ended up using the simplest approach.

dateDelta % PatternLength

Glad you found a solution you like, Csaba.

[UPDATE: ]
Replacing the…
binExp > 0
with…
not(binExp == 0)
…is slightly faster than bitShift. Over 100 cycles, binExp>0 is still the fastest of these:

 idxStr: 0.4368
bitShft: 0.1482
binExpN: 0.1425
binExpG: 0.1304
# test code for binExpG:
t4 = Timer('not(pattern & 2**7 == 0)', 'from __main__ import pattern')
print("binExpG:",(min(t4.repeat(100))))

It is not very usual in Python to use bit array data structure. Usually it is used in lower-level languages, for some type of large data or for optimization for memory storage and speed.

If you plan to perform more operations on bit arrays, you can check existing libraries. Example:

This one is implemented in C so it should be fast. It implements for example the indexing operator [] including slicing. There are examples in the description.

It’s all good now, I was approaching my problem from the wrong angle.
Thanks for the suggestion though :blush: