math
bit manipulation
]
Leetcode 1680. Concatenation of Consecutive Binary Numbers
https://leetcode.com/problems/concatenation-of-consecutive-binary-numbers
One way to solve this problem in competition is just create very long binary string and then find result of division by 10**9 + 7
and it will work fine. However in interview setup it is not the best idea and I prefer more honest way. Let us look at first several n
to see how it works:
1
110
11011
11011100
Let us try to find answer to n
, using information about n-1
. 110
= 1 * 100 + 10
(all in binary representation), 11011 = 110 * 100 + 11
, 11011100 = 11011 * 1000 + 100
and so on. We can see that on each step we need to multiply number by lenght of new number and add new number (and use %M
) and that is all!
Complexity: time complexity is O(n)
, space is O(1)
.
class Solution:
def concatenatedBinary(self, n):
ans, M = 0, 10**9 + 7
for x in range(n):
ans = (ans * (1 << (len(bin(x+1)) - 2)) + x+1) % M
return ans
Further thoughts When you found O(n)
solution, you should ask a question, can you do better? Here I had very strong intuition that you can do it faster and I came up with O(log^2 n)
solution. Then I checked discussions and there is in fact at least two solutions with this complexity: one uses idea of geometrical progressions: idea behind is the following: we can see that in binary representation of desired number there will be O(log n)
sequences of ones with equal gaps for each of O(log n)
lengths of numbers. You can find this solution in discussion, I will code my own version if I have time. Another solution using matrix multiplications and I was very impressed to see it.
Faster solution with O(log^2 n) complexity
It is easier to chose some n
and go into details about this n
. Let us choose n = 54
and discuss how my algorithm works:
Function bin_pow
will return powers of 2
, which sum to given number, for example bin_pow(37) = [1, 4, 32]
, because 1 + 4 + 32 = 37
.
Now, we have n = 54
, and we have:
1
number with length1
: just1
2
numbers with length2
:10, 11
4
numbers with length3
:100, 101, 110, 111
.8
numbers with length4
:1000, ..., 1111
16
numbers with length5
:10000, ..., 11111
. Now, we have 23 more numbers and we keep to split them into groups:16
numbers with lenght6
, which start with10....
, that is100000, 100001, ... 101111
. Why we choose16
of them? Because it is the biggest group we can take such that this group is full: given start we have all possible endings.4
numbers with length6
, which start with1100
, that is110000, 110001, 110010, 110011
2
numbers with length6
, which start with11010
, that is110100
and110101
.1
number with length6
:110110
, which is equal to54
, our last number.
We will keep lenghts of our groups in list B
, so we have:
B = [1, 2, 4, 8, 16, 16, 4, 2, 1
]
We will keep length of numbers in each group in list C
, so we have:
C = [1, 2, 3, 4, 5, 6, 6, 6, 6]
So far we have the following picture:
[1][10 11][100 101 110 111] ... [110000, 110001, 110010, 110011], [110100, 110101], [110110]
We will keep starting places of each group in list D
, in our case we have:
D = [266, 262, 250, 218, 138, 42, 18, 6, 0]
.
Let us look from the end: last group we do not need to multiply by anything, previous group has 1
number with length 6
, so we need to multiply it by 2^6
. Next we have 2*6 + 1*6
shift, 4*6 + 2*6 + 1*6
, 16*6 + 4*6 + 2*6 + 1*6
, 16*5 + 16*6 + 4*6 + 2*6 + 1*6
and so on: it can be done efficiently, using accumulate
function in python.
Final step is to iterate over our groups and evaluate number modulo 10**9 + 7
in each group:
Let us look at group [110000, 110001, 110010, 110011]
for better underastnding. This group will be somewhere in the middle of number, like ...[110000, 110001, 110010, 110011]...
and we need to understand what impact it has on our number, we need several values:
- Place, where this group is located, for this we use
D
list,d
for given group. - Lenth of elements in each group, for this we use
C
list,c
for given group. - Number of elements in our group, for this we use
B
list,b
for given group. - Also we need to now, from which element we start, in our case it is
110000
, we can evaluate it asa - b + 1
, wherea
is corresponding element inaccumulate(B)
: here we haveaccumulate(B) = [1, 3, 7, 15, 31, 47, 51, 53, 54]
and51-4 + 1 = 48
is current number we start with,a
for given group
Finally, we need to do some mathematics: we need to evaluate sum:
[(a-b+1) * 2^(b*c) + (a-b+2)*2^(b*(c-1)) + ... + (a-b + c-1) * 2^(b*1) + (a-b + c) * 2^(b*0)]*2^d
.
This formula can be written in closed form! It is almost like sum of geometrical progression, but in fact, sum of several of them which lead us to closed form solution. We also need to work using modular arithmetic, so we use powerful pow
python function. In my notation t1 = [2^(bc) - 1] % MOD
and t2 = [1/(2^c - 1)] % MOD
. Here we use Fermat’s little theorem to evaluate inverse number in modular arithmetic.
Complexity: time complexity of evaluating B, C, D
lists is just O(log n)
. However when we work with pow
function, complexity to evaluate sum in each group will be also O(log n)
, which leasd to O(log^2 n)
time complexity. Space complexity is O(log n)
.
class Solution:
def concatenatedBinary(self, n):
def bin_pow(num): return [1<<i for i, b in enumerate(bin(num)[:1:-1]) if b == "1"]
ans, MOD, q = 0, 10**9 + 7, len(bin(n)) - 3
B = bin_pow((1<<q) - 1) + bin_pow(n - (1<<q) + 1)[::-1]
C = list(range(1, q+1)) + [q+1]*(len(B) - q)
D = list(accumulate(i*j for i,j in zip(B[::-1], C[::-1])))[::-1][1:] + [0]
for a, b, c, d in zip(accumulate(B), B, C, D):
t1 = pow(2, b*c, MOD) - 1
t2 = pow(pow(2, c, MOD)-1, MOD - 2, MOD)
ans += t2*((a-b+1+t2)*t1-b)*pow(2, d, MOD)
return ans % MOD
If you like the solution, you can upvote it on leetcode discussion section: Problem 1680