# Find The Needles: Python Version

Find The Needles: Python Version
I took a stab at explaining a search algorithm written in python with some suggested improvements.

``` python
def findNeedles(haystack, needlesArr):
 if len(needlesArr) > 5:
 sys.stderr.write('Too many words!')
 else:
 countArray = [0]*len(needlesArr)
 for i in range(len(needlesArr)):
 words = re.split("[ \"\'\t\n\b\f\r]", haystack)
for j in range(len(words)):
 if words[j] == needlesArr[i]:
 countArray[i] += 1
for j in range(len(needlesArr)):
 print needlesArr[j] + ": " + str(countArray[j])
findNeedles("the quick red fox jumped over the lazy brown dog", ['the'])
```

This method is used to find 'needles' or words in a haystack or array. Tt takes two parameters:
- **haystack**: the haystack or string to search for the needles or words
- **needlesArr**: the words to search for in the haystack
The algorithm is as follows:
- If the length of needles array (list) is greater than 5, it should do a stdout of `too many words`.
- If not, then initialize an array `countArray` to keep track of the word counts. 
- Loop through all numbers from 0 to the length of `needlesArr`; create the `words` variable from the regex split function
- Split the string haystack on any whitespace
- Then loop through from 0 to the length of `words`, if the item at index `j` is in the original `needlesArr` at the same index, increase the value in `countArray` by 1.
- Finally, print that item with its corresponding count.
### Improvement Suggestions
- The solution above is not very efficient as it would have It would be an `O(n)` big O time complexity.
. It is possible to do this in a single pass with a dictionary as demonstrated in the following solution:

``` python
import re
def findNeedles(haystack, needlesArr):
 count = dict.fromkeys(needlesArr, 0)
for word in re.split("[ \"\'\t\n\b\f\r]", haystack):
 if word in count:
 count[word] += 1
print(count)
findNeedles("the quick red fox jumped over the lazy brown dog", ['the'])
```

- The above solution will enable the lookups to have `O(1)` instead of `O(n)` time complexity as you don't have to compare each word to every single needle
