Find The Needles: Python Version

Find The Needles: Python Version I took a stab at explaining a search algorithm written in python with some suggested improvements.

def findNeedles(haystack, needlesArr):
 if len(needlesArr) > 5:
 sys.stderr.write('Too many words!')
 else:
 countArray = [0]*len(needlesArr)
 for i in range(len(needlesArr)):
 words = re.split("[ \"\'\t\n\b\f\r]", haystack)
for j in range(len(words)):
 if words[j] == needlesArr[i]:
 countArray[i] += 1
for j in range(len(needlesArr)):
 print needlesArr[j] + ": " + str(countArray[j])
findNeedles("the quick red fox jumped over the lazy brown dog", ['the'])

This method is used to find 'needles' or words in a haystack or array. Tt takes two parameters:

haystack: the haystack or string to search for the needles or words
needlesArr: the words to search for in the haystack The algorithm is as follows:
If the length of needles array (list) is greater than 5, it should do a stdout of too many words.
If not, then initialize an array countArray to keep track of the word counts.
Loop through all numbers from 0 to the length of needlesArr; create the words variable from the regex split function
Split the string haystack on any whitespace
Then loop through from 0 to the length of words, if the item at index j is in the original needlesArr at the same index, increase the value in countArray by 1.
Finally, print that item with its corresponding count.
Improvement Suggestions
The solution above is not very efficient as it would have It would be an O(n) big O time complexity. . It is possible to do this in a single pass with a dictionary as demonstrated in the following solution:

import re
def findNeedles(haystack, needlesArr):
 count = dict.fromkeys(needlesArr, 0)
for word in re.split("[ \"\'\t\n\b\f\r]", haystack):
 if word in count:
 count[word] += 1
print(count)
findNeedles("the quick red fox jumped over the lazy brown dog", ['the'])

The above solution will enable the lookups to have O(1) instead of O(n) time complexity as you don't have to compare each word to every single needle

Command Palette

Improvement Suggestions

Comments