| xie's profile追忆似水流年PhotosBlogLists | Help |
|
August 26 手算开平方
算法分析导论前段时间叫平姐借本书解解闷,谁知道拿了本超级难的数学性质的书,操。这本书基本上看不懂,只有两地方看得懂,圈和整数分解。圈引出两个东西,floyd圈检测和原位排列。floyd检测就是两个步长不同的遍历,如果有圈存在,那么他们肯定会碰撞。原位排列In-place permuration就是利用排列中圈的存在,像多米诺骨牌那样逐个替换,这个算法可以达到O(N+lnN),厉害。整数分解是关于Pollard的高概率分解整数,利用了floyd圈检测。p方法用到函数,f(x)=(x^2 +c)mod N, f(f(x))=(x^2 +c)(x^2 +c) + c mod N,此算法的时间复杂度为O(n^(1/4)),下面贴伪代码 ------------------------------------------------------------------------------ function factor(N: integer): integer; var a, b, c, d: integer; begin a := randominteger(0, N-1); b := a; c := randominteger(0, N-1); repeat a := (a*a +c) mod N; b := (b*b +c)(b*b +c) +c mod N; d := gcd(a-b mod N, N); until (d<>1); factor := d; end; -------------------------------------------------------------------------------- 上面所说的gcd就是最大公约数,用辗转相除法就能很快解出。下面贴出原位排列的伪代码 -------------------------------------------------------------------------------- for i :=1 to N do if p[i] <> i then begin t := a[i]; k :=i; repeat j := k; a[j] := a[p[j]]; k := p[j]; p[j] := j; until k = i; a[j] := t; end; end; August 19 古老的面试题一位老兄在天涯上面提了一个算法题,给人取笑了,呵呵
说是有一大串数字,你又不知道有多少,内存肯定装不下,叫你随机输出其中一个数。
答案:你有一个integer stream,每个元素都是unique的
定义一x 然后开始接受这个stream... 第一个 1/1 几率保存到 x 第二个 1/2 几率保存到 x 第三个 1/3 几率保存到 x 第k格 1/k 几率保存到 x ........ 最后输出x... 可以在不知道n的情况下,保证每个元素几率均等的机会输出。。 August 18 旋转数组Q:给一个n元数组
,和k,使用常熟空间和线性时间,使得数组变成 A:
O(n):
a(1),a(2),,,a(k-1),a(k),,,a(n)-> a(k-1),,,a(2),a(1),a(n),,,a(k)-> a(k),,,a(n),a(1),a(2),,,a(k-1) ---------------------------------------------------------------------------------------
Q:在一个旋转过的有序数组(比如3 4 5 1 2)上实现二分查找。
A:
example: b = 5 19 20 4 5 6 7 8 9 10 a_0 > a_n > a_m > b --> b sits in the left partial 19 20 4 5 6 a_0 > a_n > b > a_m --> b sits in the right partial 4 5 6 a_n == b August 12 Max square martix Imagine there is a square matrix with n x n cells. Each cell is either filled with a black pixel or a white pixel. Design an algorithm to find the maximum subsquare such that all four borders are filled with black pixels;
Can this be done in O(n^2) ?
Here's an O(n^2) solution. It can probably be cleaned up and made more elegant, but as a proof of concept it will do. Although the algorithm isn't very complicated, the explanation may appear so. It's the best I could do. I apologize.
1. Reduction to a 1-dimensional problem.
The upper-left and lower-right corners of any square will both be situated on one of the matrix's diagonals. We therefore consider each of the 2n-1 diagonals, performing the same linear time algorithm each. (Note that the length of each diagonal is at most n.) We then select the maximum result obtained in these 2n-1 invocations. 2. Preprocessing.
In order to be able to run the diagonal algorithm efficiently, we first compute the following. For each matrix entry (i,j), we compute R[i,j] as the (maximum) number of contiguous black entries on the ith row, starting at column j and going right. We similarly compute D[i,j], which counts black squares vertically downwards. We then compute F[i,j]=min{R[i,j],D[i,j]}. For example, (using + for black and = for white --- I'm hoping this will give a monospace effect) if the matrix is: =+++=+= ++===+= =+=++== +++++== ++++=++ ======= +++++++ Then R is:
0321010 2100010 0102100 5432100 4321021 0000000 7654321, D is:
0511020 1400010 0303200 2222100 1111011 0000000 1111111, and F is:
0311010 1100010 0102100 2222100 1111011 0000000 1111111 We are only really interested in F. The meaning of F is: the maximum side-length of a square whose upper-left corner is (i,j) and whose upper and left borders are black is F[i,j].
The computation of R, D, and F is a straightforward dynamic programming exercise, and takes O(n^2) time.
In a symmetrical manner we compute in O(n^2) time a matrix B whose meaning is: the maximum side-length of a square whose lower-right corner is (i,j) and whose lower and right borders are black is B[i,j].
The key point is that a square whose upper-left corner is (i,j) and whose lower right corner is (i+s-1,j+s-1) (where s>=1) is black-bordered iff the F[i,j]>=s and B[i+s-1,j+s-1]>=s.
3. Preprocessing for a given diagonal.
Given a diagonal, we copy the corresponding diagonal out of F and B and obtain two one-dimensional arrays, which I will also denote by F and B. A square whose diagonally opposing corners are at positions i and j (where i <= j) down the diagonal is black-bordered iff F[i] >= j-i+1 and B[j] >= j-i+1. Let us transform F and B by: F[i] <- i+F[i]-1; B[i] <- i-B[i]+1; for all i. With these new F and B, a square whose diagonally opposing corners are at positions i and j down the diagonal (where i<=j) is black bordered iff F[i]>=j and B[j]<=i. (Of course it is not necessary to actually copy out the diagonals from the 2-dimensional arrays. We can access them in place instead. Also, instead of computing F and B as described and then transforming them as above, we can have the dynamic programming compute the transformed values directly.)
4. The diagonal algorithm.
So the one-dimensional problem we wish to tackle is the following. Given two arrays F and B, both of length t (where t <= n) and both containing non-negative integers, find a *good* pair (i,j) that maximizes j-i. A pair (i,j) is said to be good if 1 <= i <= j <= t, F[i] >= j, and B[j] <= i. Here's an O(t) algorithm for the problem.
The algorithm makes use of a Range-Minimum Query (RMQ) data structure with the following properties:
Given an array A[1..m] of numbers, there is a construction stage in which some internal data structures are created. This stage takes O(m) time. Once it is completed, range-min queries RM(i,j) can be answered in O(1) time per query. An RM(i,j) query (for i <= j) asks for the index k that minimizes A[k] in the range i <= k <= j. A simple and elegant implementation of such a data structure is described in a paper by Bender and Farach-Colton (BF-C). We construct the BF-C data structure on the array B.
Consider an index i. Suppose we are given some k>=i and are asked to find the largest j>=k such that F[i]>=j and B[j]<=i, we can do the following: maxJ = k-1;
while(maxJ < F(i)){ j = RM(maxJ+1, F[i]); if(B[j] > i) break; /* no more candidates */ maxJ = j; } If the desired j exists, we get it in maxJ. Otherwise, we exit the loop in the first iteration with maxJ=k-1. Each iteration takes O(1) time. Each complete iteration increases maxJ. The number of iterations is therefore bounded by 1+[the number of times maxJ increases], which is at most 1 + (maxJ+1 - k) (where maxJ in this expression refers to the value of maxJ upon exit from the loop).
The algorithm proceeds as follows. Let's enclose the above piece of code in a function find_maxJ(i,k). We then start by:
bestI = bestJ = 0;
maxJ = 0; i = 1; k = max{i, maxJ+1};
maxJ = find_maxJ(i, k); if(maxJ-i > bestJ-bestI){ bestI = i; bestJ = maxJ; } (Note that because of the initialization, max{i, maxJ+1} = i = 1.) This will find the best good pair of the form (1,j) and set bestI and bestJ to reflect this, or leave maxJ=0 if no such good pair exists.
Now suppose if a good pair exists of the form (i',j) with i'<i then (bestI, bestJ) is the best such pair, and that otherwise bestI=bestJ=0. Also suppose that maxJ contains the maximum j such that a good pair (i',j) exists for some i'<i (and that, if no such good pair exists, maxJ=0). We want to find the best good pair of the form (i',j) with i'<=i (and update bestI, bestJ, maxJ accordingly).
Regardless of the current value of maxJ, the answer can be computed by exactly the same code as above:
k = max{i, maxJ+1};
maxJ = find_maxJ(i, k); if(maxJ >= k && maxJ-i > bestJ-bestI){ bestI = i; bestJ = maxJ; } The correctness of this in the case maxJ<i (before the code is executed) is straightforward. The correctness in the case maxJ>=i follows by the following argument. By definition the situation in this case is that there exists a good pair (i',maxJ) such that i'<i<=maxJ. Therefore it is impossible for any index j, i<=j<=maxJ to satisfy j-i>maxJ-i'. Since maxJ-i' is surely not more than bestJ-bestI, it is also impossible for any such j to satisfy j-i>bestJ-bestI. We can therefore start the search from maxJ+1 (which is the value of k in this case).
The algorithm is therefore the following.
bestI = bestJ = 0;
maxJ = 0; for(i = 1; i <= t; i++){ k = max{i, maxJ+1}; maxJ = find_maxJ(i, k); if(maxJ-i > bestJ-bestI){ bestI = i; bestJ = maxJ; } } All that remains is to analyze the time complexity. Let us denote by maxJ(i) the value of maxJ at the end of the ith iteration (and maxJ(0)=0). As we have already seen, the time complexity of the call to find_maxJ in the ith iteration is bounded by:
1 + (maxJ(i)+1 - max{i, maxJ(i-1)+1}) which is at most 1 + (maxJ(i)+1 - (maxJ(i-1)+1)) which is equal to maxJ(i)-maxJ(i-1) + 1. Summing over the iterations i=1,2,...,t, yields maxJ(t)-maxJ(0) + t <= t - 0 + t = 2t. So the function calls take O(t) time in total. Everything else takes O(t) time too, so we are done.
A.F. Tuesday, July 03, 2007 I forgot to put "maxJ >=k && ..." in some of the if() statements above. A.F. Tuesday, July 03, 2007 Having thought about it some more, I realized that the BF-C data structure is not really needed. I'll write it up when I get some time. A.F. Tuesday, July 03, 2007 Sorry. The idea I had didn't work out. It seems that the BF-C data structure is needed after all. Sunday, July 08, 2007
--------------------------------------------------------------------------------------------------------
all the articles above come from http://discuss.techinterview.org/, all rights belong to original owner. August 11 心情不好最近blog里面总是算法,是不是搞到大家都不想看了呢,呵呵。周末泡在网吧,就算连打几个钟头的实况也不想写东西。来到广州一个多月了,感觉从一个火坑跳到另一个火坑,或者说人生就是一个逃离接着另一个逃离。公司的项目已经开始了,子功能一直做到十一月,整个系统计划做到明年底,操。我没什么经验,从语言特性到系统架构,再到软件工程都毫无所知,操,用老大的话来说就是worrisome。每晚都九点十点才走,回到床上都十一二点了。不说工作了,工作只是工作。说说linyang吧。她出国了。怎么说,她和我聚少离多,但是这么多年都有联系,而且关系都很暧昧。大家都很明白给不了对方什么承诺,只是觉得对方还是十年前那样,很舒服的感觉。究竟是我们真的有缘分,还是有缘无分?算了,不想了,回去洗澡,今天又给网吧贡献了二十块钱,操。 string(pattern) algorithm source偶然发现一个有趣的网址http://www-igm.univ-mlv.fr/~lecroq/string/node14.html ,包括很多字符串匹配的算法,如kmp, bm, ASS等, 但不晓得叫什么名字,估计是叫 The Gaspard of Monge institute of electronic and computer science, 法国的?也许吧。 algorithm rediscuss about shuffleSuppose we have an array
a1,a2....an,b1,b2....bn How to change this array to a1,b1,a2,b2....an,bn in O(n)/O(nlgn) time using O(1) space the following algorithm is O(nlgn), simple and effective
nlgn=4*2=8
1 2 3 4 5 6 7 8 1 2 5 6 3 4 7 8 1 5 2 6 3 7 4 8 nlgn=8*lg8=24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 1 2 3 4 9 10 11 12 5 6 7 8 13 14 15 16 1 2 9 10 3 4 11 12 5 6 13 14 7 8 15 16 1 9 2 10 3 11 4 12 5 13 6 14 7 15 8 16 and the O(n) solution is much more complex, I will post it when understand
(update: the so-called O(n) solution is proofed wrong)
-------------------------------------------
Given Array A[1...2n]. concepts of number theory! (Warning: long post)
For this we consider a slightly different problem:
That of converting the array
a1 a2 ... an b1 b2 ... bn
to
b1 a1 b2 a2 .... bn an
(instead of the one given in the problem statement)
Note if we can get an O(n) time O(1) space algorithm for one, then we have an O(n) time O(1) space time algorithm for the other.
Now assume the elements are c1 c2 ... cn c(n+1) .. c2n In the resulting array (which is a permutation of the original)
The ith element from the original goes to 2i modulo 2n+1. Thus index 1 goes to 2, 2 goes to 4, 4 goes to 8 ... ultimately comes back to 1 (all working modulo 2n+1)
Note that, it is a permutation which is composed of "cycles" like the ones described above (I think Zekaric made the mistake of assuming there is just one cycle)
For instance consider the case when n=4
1 2 3 4 5 6 7 8
It should become
5 1 6 2 7 3 8 4
so 1 -> 5 -> 7 -> 8 -> 4 -> 2 -> 1 (-> means goes to)
and 3 -> 6 -> 3 So the cycles are (1 5 7 8 4 2) and (3 6)
Assume you had an oracle which told you one element from each cycle. Can you come up with an O(n) time and O(1) space algorithm?
If we do not have such an oracle, trying to come up with elements from distinct cycles becomes hard.
This is where number theory comes in:
In case 2n+1 is a power of 3 = 3^m say, using number theory, we can show that the cycles are exactly
(1, 2, 2^2, ... ) (powers of 2) (In fact these are all the numbers relatively prime to 3^m, note, working modulo 2n+1, i.e modulo 3^m)
(3, 3*2, 3*2^2, ...) (numbers of the form 3 * 2^j (3^2, 3^2 *2 ,....) numbers of the form 3^2 * 2^j ... (3^(m-1),....) numbers of the form 3^(m-1) * 2^j (will post a proof of this later if someone is interested)
Thus the "number theory oracle" has told us that the powers of 3 are guaranteed to be in different cycles! That is all we need!
So in the special case of 2n+1 being a power of 3, we can give an O(n) time and O(1) space algorithm.
Given the special case (power of 3) algorithm, we can now make this work for any n:
In the O(nlogn) solution I gave above, instead of k = n/2, pick k to be a power of 3 close to n and use the special knowledge of cycles to shuffle that left partial array in O(n) time. Now for the right partial array, we have a new n' = (n-k) and repeat finding a k' which is a power of 3 etc. Overall, time is O(n) and space is O(1).
copyrights of all the articles above are belonged to the original owner.
more details about the maths are ignored. I prefer the first solution, :) ------------------------------------------------------------------------------------------------------------ Algorithm about gas stationQ:There is a circular track. There are n gas stations of capacity ci. To go from one station to the next station it takes certain amount of gas gi. For any station the gas available may not be enough to get to the next station. Write a linear time algorithm to find the correct station in which we must start in order to complete one round around the track.
A:令a_i=c_i-g_i,那么原问题等价于一个圆周上从哪个数a_i开始,所有的连续和都非负。
反复进行如下2步操作: i)把连续的正数求和变成一个数,连续的负数求和变成一个数。 操作后正负数间隔 ii)按车行进的方向,把相邻的正数和负数相加,形成一个新的数。这个操作后圆周上的数会减少一半 反复进行上面的操作,直到圆周上的数非负。那么从上面任何一个位置开始都可以了。 时间复杂度为n+n/2+n/4+...<2n -----------------------------------------------------------------------------------------------------------------
All the articles above come from yueweitang blog 找最短的cover - Amazon面试题Q:Given two arrays A [1..n] and B[1..m], find the smallest window in A that contains all elements of B. That is, find a pair <l,k> such that A[l..k] contains B[1..m] For example, given A = 3,1,5,7,3,5,2 and B = 5,3 then the smallest window is [3,5].
A: O(nlogm+mlogm)
整个算法只对A数组进行一次扫描 维护m个变量h_1,h_2,...,h_m,h_i表示最靠近当前扫描点的一个序列B_1,B_2,...,B_i的起始位置B_1。 同时保存最小的窗口s信息。 在整个扫描的过程中不断的更新这些变量。 假设在扫描到A_j位置时碰到B_i,这时比较h_{i-1}和h_i,如果h_{i-1}<h_i,那么更新h_i=h_{i-1}。 当碰到B_m时计算出j-h_m,和已保存的窗口信息s比较,如果更小,那么更新s的信息。 这样就能找出最小的窗口了。 然后计算时间复杂度,整个扫描过程O(n),但是每次取出一个数需要判断它是否是B中的数,所以需要logm 的时间来二分查找,因为B开始没有排序过,所以需要格外的mlogm时间排序。 ----------------------------------------------------------------------------------------------
以上问题及回答都来自yueweitang blog
-----------------------------------------------------------------------------------------------
现在细想了一下,原作者说的不是很详细。
举个例子,A=abcabdcdabddcd, B=abcd,容易看出,有三个比较短的cover在A里面,长度为6,5,6。
一开始j=1,h_1=1,这样一直扫描到j=3,h_1=h_2=h_3=1;然后又遇到一个a, 这时h_1=4, 类似的,
接着h_2=4;然后碰到d,这时的h数组为4411,这时B数组已经遍历,所以得到第一个长度是6;然后,遇到c和d,
所以更新h数组为4444,得到第二个长度是5;A数组还剩下abddcd,游标j已经为8。继续扫描,遇到a和b,
h数组更新为9944,遇到dd,不必理会,因为我们需要的是c。最后遇到cd,h数组更新为9999,得到第三个长度是6。
此时A数组已经遍历,算法完成。
如果辅助空间只能是O(1)的话,那么少于O(n^2)的时间就很难了。
|
|
|