hash table
bucket sort
accumulate
]
Leetcode 0274. H-Index
https://leetcode.com/problems/h-index
Explanations
The main trick is to count for each possible number of citations, how many times we see this number in our citations
. Note, that if number of citations is more than total number of papers N
, we can reduce this numer to N
and nothing will change. Let me explain my solutoin given test example [3,0,6,1,5]
.
- We create array, which I called
buckets = [1, 1, 0, 1, 0, 2]
, wherebuckets[i]
is number of papers withi
citations ifi < N
andbucket[N]
is number of papers with>=N
citations. - Now, we create
accum
array, whereaccum[0]
is number of papers with>=1
citation,accum[1]
is number of papers with>=2
citations and so on. When we evaluate it for our example we can see, that it is equal toaccum = [4,3,3,2,2]
. Note, that we start with 1 citation, not with zero, that is why we need to useaccum[1:]
in our code. - Finally, we need to go through this array and find the bigest number
i
, for whichaccum[i] >= i + 1
and returni + 1
, in our example it is equal to3
.
Complexity
Complexity is O(n), though we traverse our data 4 times, and it is not the optimal solution. We can evaluate cumulative sums in place, and compare them in place with i + 1
and keep the index of maximum reached so far, and interrupt when inequality accum[i] >= i + 1
does not hold anymore. Howerer I like cumulative sums and code is more clean in my way.
Code
class Solution:
def hIndex(self, citations):
N = len(citations)
buckets = [0] * (N + 1)
for elem in citations:
buckets[min(elem, N)] += 1
accum = list(accumulate(buckets[1:][::-1]))[::-1]
compar = [accum[i] >= i + 1 for i in range(N)]
return (compar + [0]).index(0)
If you like the solution, you can upvote it on leetcode discussion section: Problem 0274