hash table
bucket sort
accumulate
]
Leetcode 0274. H-Index
https://leetcode.com/problems/h-index
Explanations
The main trick is to count for each possible number of citations, how many times we see this number in our citations. Note, that if number of citations is more than total number of papers N, we can reduce this numer to N and nothing will change. Let me explain my solutoin given test example [3,0,6,1,5].
- We create array, which I called
buckets = [1, 1, 0, 1, 0, 2], wherebuckets[i]is number of papers withicitations ifi < Nandbucket[N]is number of papers with>=Ncitations. - Now, we create
accumarray, whereaccum[0]is number of papers with>=1citation,accum[1]is number of papers with>=2citations and so on. When we evaluate it for our example we can see, that it is equal toaccum = [4,3,3,2,2]. Note, that we start with 1 citation, not with zero, that is why we need to useaccum[1:]in our code. - Finally, we need to go through this array and find the bigest number
i, for whichaccum[i] >= i + 1and returni + 1, in our example it is equal to3.
Complexity
Complexity is O(n), though we traverse our data 4 times, and it is not the optimal solution. We can evaluate cumulative sums in place, and compare them in place with i + 1 and keep the index of maximum reached so far, and interrupt when inequality accum[i] >= i + 1 does not hold anymore. Howerer I like cumulative sums and code is more clean in my way.
Code
class Solution:
def hIndex(self, citations):
N = len(citations)
buckets = [0] * (N + 1)
for elem in citations:
buckets[min(elem, N)] += 1
accum = list(accumulate(buckets[1:][::-1]))[::-1]
compar = [accum[i] >= i + 1 for i in range(N)]
return (compar + [0]).index(0)
If you like the solution, you can upvote it on leetcode discussion section: Problem 0274