LeetCode 137 is to find the number which apprears exactly once in an array assuming other numbers always appear exactly three times. It's very easy to come up with a linear runtime algorithm by using hash table which takes O(n) space. A better approach is to modulo count the number of occurrences of 1 at every bit and compose the final result based on this information. But to implement it efficiently, some bitwise tricks are used which remind me some good old college time when I was learning Digital Circuit and CPU Architeture. So I think I should take these notes down here. Also, this problem could be generalize to find the number which apprears exactly p times in an array while others always appear k times, assuming (p < k). And for LeetCode 137, there is also a two-line solution based on the special parameters (k, p) = (3, 1). I will talk about it at the end.

First, we can use hash table to count the appearance of every number and then check which number's count is exactly one. This can give a linear runtime algorithm having linear space complexity. Some people argues that this is gonna be practically slow because hashing is an expensive operation. But for this special problem, we can use a simple modulo function to compute the hash number. But one thing to note is that although we can choose a prime large enough to give extremely low collision rate, this algorithm is not strictly guaranteed to be always correct.

Another idea is that we can count the occurence of 1 at every bit of the
numbers. The difference is that whenever the count reaches k we reset it to 0
and carry on. At the end, since all numbers except one appear k times, the final
value of the count is actually p or 0. We can easily construct the number itself
based on this information. The only thing here is how to implement these
counters efficiently. A trivial implementation could use an array and calculate
every bit of every number and count it accordingly. Definitely, the time
complexity will be `O(nlog(MAX_INT))`

and the space complexity
`O(log(MAX_INT))`

. But by using some bitwise tricks, we can change these two
metrics to have a `log(k)`

scale factor. When k is less than MAX_INT, we can
have an algorithm with better asymptotic performance.

For simplicity, we will assume the numbers are 32 bits long. So we need 32 counters in total. The key observation here is that if k is of m bits, we are having a (32, m) matrix of 1-bit value. The above algorithm we mentioned are updating the counters from a row based point of view. But we can also update it in a column based way.

```
m...321 count n1 n2 n3 ... nk ...
1 ....... <- . . . .
2 ....... <- . . . .
3 ....... <- . . . .
. . . . . . .
. . . . . . .
32 ....... <- . . . .
```

If we can found a way to update `counter[i][1]`

..`counter[i][m]`

bitwisely with
1-bit number n. Then we can pack the updates of all 32 counters at an
integer. We can see this technique as a bitwise level parallel computation. But
how can we update `counter[i][1..m]`

bitwisely? For simplicity, we use xm...x1
to represent the counter. There are two things we need to work out

- How to add 1 to it
- How to reset it to 0 when it reaches k

The reset part is relative easy. If you have experience with circuit design, it is easily to see that we can check if k reaches by compute a value out of xm..x1. The fundamental rule of circuit design or digital logic is that you can work out a mapping between any bit pattern once you have written down the truth table. Assume k = 0b101, we are expecting following truth table for reset

```
x 321 -> x 321
000 000
001 001
010 010
011 011
100 100
k 101 000
```

We can see only when `x3x2x1 == 101`

, `x3 & (~x2) & (x1) == 1`

. Then the reset
could be expressed as

```
mask = ~(x3 & (~x2) & x1)
x3 = x3 & mask
x2 = x2 & mask
x1 = x1 & mask
```

To work out the add part, we start with x1. Because for x1, this is no carry. We are expecting following truth table

```
x 0 n -> x 0
0 0 0
1 0 1
0 1 1
1 1 0
```

The mapping could be written out directly as ```
x0 = ((x0) & (~n)) | ((~x0) & (n))
= x0 ^ n
```

. Perhaps, this is why XOR sign is chosen to be a circle plus, just to
remind you of its ability to act as a circle around in-place plus without carry
for circuit design.

To calculate x2..xm, we need to take carry into consideration.

```
x i carry -> x i
0 0 0
1 0 1
0 1 1
1 1 0
```

Actually, it's the same table for x0. So we can write out `xi = xi ^ carry_i`

.

But how to compute carry? Just think when we will have a carry for xi. Only when xi-1, ..., x1, and n are all 1. This leads to

```
carry_i = xi-1 & .. & x1 & n
```

Then we can write out the code to update these counters. Note, we generalize the compuatation of x0 by using n as carry0

```
x = [0 for _ in xrange(m)]
k = bin(k)[2:][::-1]
for n in nums:
carry = n
for i in xrange(m):
x[i], carry = x[i] ^ carry, x[i] & carry
mask = -1
for i in xrange(m):
mi = x[i] if k[i] == '1' else ~x[i]
mask = mask & mi
mask = ~mask
for i in xrange(m):
x[i] = x[i] & mask
the_number = 0
for i in xrange(m):
the_number = the_number | x[i]
return the_number
```

At the end of above code, we add a loop to compute the single number. Since its final occurence count of '1' at any bit is either p or 0, the bit value at ith bit of the single number can be computed the single number by

```
the_number = xm | ... | x1
```

Besides the general solution, there could also be a specialized solution for specific k. If we know any specific k for sure, we can do better. As in LeetCode 137, k = 3. The truth table to update count is

```
x 10 n -> x 10
00 0 00
01 0 01 .
10 0 10
00 1 01 .
01 1 10
10 1 00
```

Here comes the true beauty of digital logic. We can write out the mapping directly as

```
x0 = ((~x1) & x0 & (~n)) | ((~x1) & (~x0) & n)
= (~x1) & ((x0 & (~n)) | ((~x0) & n))
= (~x1) & (x0 ^ n)
```

You can save the original value of x0, and then use it to calculate x1 in the smae way above. But we can also use the computed value of x0 as a new input to calculate x1. Namely, we substitute the left x0 column with the right x0 column in above truth table. Then we have

```
x 10 n -> x 10
00 0 00
01 0 01
10 0 . 10
01 1 01
00 1 . 10
10 1 00
```

Remeber, the true beauty. We have

```
x1 = (x1 & (~x0) & (~n)) | ((~x1) & (~x0) & n)
= (~x0) & ((x1 & (~n)) | (~x1) & n)
= (~x0) & (x1 ^ n)
```

Then we have following lines of magic to solve LeetCode 137. Note p = 1. It means x1 will always be 0 at the end.

```
x = [0, 0]
for n in nums:
x[0] = (~x[1]) & (x[0] ^ n)
x[1] = (~x[0]) & (x[1] ^ n)
return x[0]
```

At the end of this post, take a while to savor the true beauty of digital logic. It could be singular as well as single.