In this example, we have such as matrix containing one million rows, each having six columns. Each row in the matrix contains unique random integer values between 1 and 10.
We'll collect the number of the matching needle elements in each row of the haystack into matchcount. The number of the individual elements matching to the matrix rows is also collected into numcount.
haystack=zeros(1000000,6); for i=1:1000000 haystack(i,:)=randperm(10,6); end needle = [1 2 3 4 5]; matchcount = zeros(5, 1); numcount = zeros(5, 1);
There doesn't seem to be good vectorized method to find the intersections, so we'll use for loops in all the three examples.
In the first code, we'll use intersect() to get the matching indices for each row. Intersect() does not require the inputs to be sorted, so it works out-of-the-box in our case. Intersect returns the common elements to the needle and the rows of the haystack. We'll keep count of these matches.
for i = 1:length(haystack) match_ids = intersect(needle, haystack(i, :)); matches = length(match_ids); if matches > 0 matchcount(matches) = matchcount(matches) + 1; numcount(match_ids) = numcount(match_ids) + 1; end end
Execution of this code takes almost one minute; the average for three runs was 58.08 seconds.The results of a single run look like this:
>> disp([1:5;numcount';matchcount']) 1 2 3 4 5 599684 600619 599554 599802 601223 23761 237749 476269 238289 23932
To get the same results, we can also use ismember() which returns the logical array of the locations of the matching elements instead of the actual elements. This means we have to change the counting of matches from length() to sum():
for i = 1:length(haystack) match_ids = ismember(needle, haystack(i, :)); matches = sum(match_ids); if matches > 0 matchcount(matches) = matchcount(matches) + 1; numcount(match_ids) = numcount(match_ids) + 1; end end
This method is already much faster; average execution time for three runs was now only 15.13 seconds. The results are still looking correct:
>> disp([1:5;numcount';matchcount']) 1 2 3 4 5 600486 600681 599275 599443 599843 23841 238526 475544 238242 23847
To make the computation even faster, we have to identify the bottlenecks and remove as many of them as possible. To find out where ismember() uses the most of the time, we can use profile(). Using MATLAB Profiler will be discussed more in some later post; now we will get straight to the findings.
It can be seen in the Profile report, that most of the time is used to verify the input arguments and branch to the correct parts of the code based on the types of the arguments. As we know the types of the arguments and we can be sure they do not contain NaNs or anything else requiring additional operations, we can skip most of the code.
Actually we can skip all the MATLAB code and go straight to an undocumented builtin function which is called deep in the ismember(). This undocumented function is _ismemberhelper() and it needs to be called with a special function builtin(). This function requires the arguments to be sorted, so we need additional sort() for the haystack:
for i = 1:length(haystack) match_ids = builtin('_ismemberhelper', needle, sort(haystack(i, :))); matches = sum(match_ids); if matches > 0 matchcount(matches) = matchcount(matches) + 1; numcount(match_ids) = numcount(match_ids) + 1; end end
For this code, the execution time reduced to 7.69 seconds.
And the results still look correct:
>> disp([1:5;numcount';matchcount']) 1 2 3 4 5 599938 599039 599923 600895 599592 23879 237804 476875 237935 23507
To summarize the results, here are the execution times for all the three methods:
- intersect() - 58.08 seconds
- ismember() - 15.13 seconds
- _ismemberhelper() - 7.69 seconds
No comments:
Post a Comment