Posted 1030 days ago
As a follow up to yesterday's post about using intermediate datastructures, I wanted to add some concrete numbers to my Big-Oh runtime statements.
I came in to work this morning and got my hands on what I thought was a relatively large sample set of data, and setup an operation that would populate an array with 2000 items, and them "keep" items from an array with 350 elements. The result was that 300 elements were left in my array.
The original alorithm I had to work with was as follows:
keep( source, keepers ) {
for each element S in source
found = false;
for each element K in keepers
if ( S == K ) found = true; break;
if ( not found )
delete S from source
}
So source is size 2000, keepers is size 350, and I
know from observation that I'm going to delete (2000-300) elements from source.
The overall (worst-case) runtime then is 2000 * 350 + 2000 * 1650 = 4,000,000 steps!
My re-implementation using intermediate datastuctures looks roughly like this:
keep( source, keepers ) {
Populate set S(s) with elements of source
Populate set S(k) with elements of keepers
for each element E in source
if ( E is not found in S(k) )
remove E from S(s)
Reset source array to be empty with size length(S(s))
Copy contents of S(s) into source
}
The runtime here then is: 2000log(2000) + 350log(350) + 2000*log(350) + 1650 = 43,625 steps.
Now in my book, 43625 if significantly fewer steps than 4,000,000. But let's face it, when your average user is running well over 2 billion cycles a second, and you're talking about arrays of pointers, four million operations is nothing :-\.
I guess it's time to hunt for a much larger supply of data :-\
Posted 1031 days ago
It never ceases to amaze me the drastic impact our choice of datastructures has on application performance, and how little thought many developers seem to give it. I was charged with the task of optimizing a specific feature of our application at work this week involving set logic -- since set logic is so common and well defined, I was nervous that there'd be little improvement to be had short of optimizing how the data in question was physically stored. Much to my horror, I found several blocks of code that did something like this:
void <method_name>( CObArray &aryObjects1, CObArray &aryObjects2)
{
int i, j;
for ( i = aryObjects1.GetSize() - 1; i >= 0; i-- )
{
CObject * obj1 = aryObjects1.GetAt(i);
for ( j = aryObjects2.GetSize() - 1; j >= 0; j-- )
{
CObject * obj2 = aryObjects2.GetAt(j);
if ( obj1 == obj2 )
{
// do something
}
}
}
}
So right off the bat I have two major complaints:
- Who picks arrays to perform set logic? Especially a CObArray -- use a SET CLASS for SET LOGIC, I BEG OF YOU.
- Why are we looping backwards? There's nothing wrong with it, but it's much more intuitive to start with 0 and work up.
do something
block above then, looked like this:
delete obj2; aryObjects2.RemoveAt( j );
While this seems innocent enough, that CObArray::RemoveAt(int) call just bumped your runtime from a
horrendous O( n^2 ) to an even worse O( n^3 )!! Remember, this is an array, not a list -
therefore deletion time is O( n )...
My immediate though was to just swap out the datastructure for something better, like a set, but as I found out, you can't always come into production code and start swapping out structures. There were thousands of places in the application that relied on this array being a CObArray instance :-\. So I decided to make use of an intermediate datastructure to optimize the task at hand.
The STL provides a number of template classes implementing various datastructures. I'll show you my re-write first, and then we'll discuss it's subtletees:
struct MyComparator
{
bool operator()( CObject* obj1, CObject* obj2 )
{
return obj1->Compare(obj2) < 0;
}
}
typedef std::set< CObject*, MyComparator > ObjectSet
// Removes all objects occurring in aryObjects1 from
// aryObjects2 in O( n * log(n) ) time.
void RemoveAll( CObArray &aryObjects1,
CObArray &aryObjects2 )
{
ObjectSet myset;
for ( int i = 0; i < aryObjects2.GetSize(); ++i )
{
myset.insert( aryObjects2.GetAt(i) );
}
ObjectSet::iterator itr;
for ( int j = 0; j < aryObjects1.GetSize(); ++j )
{
itr = myset.find( aryObjects1.GetAt( j ) );
if ( itr != myset.end() ) {
delete *itr;
myset.erase(itr);
}
}
aryObjects2.RemoveAll();
aryObjects2.SetSize( myset.size() );
int i = 0;
for ( itr = myset.begin();
itr != myset.end();
itr++ )
{
aryObjects2.SetAt( i++, *itr );
}
}
It works by first creating a set with all the elements from aryObjects2. This takes O(n logn) time.
I then iterator aryObjects1, searching for each item in the set I just built. This takes O(logn) per lookup. If I
found the item, then I delete it from the set, which is a O(logn) operation as well. The final step is to copy the
items from the set back into the target array. Note that I allocate the memory space all at once to avoid having to
reallocate and copy the array items each time it grows.
This last point can cause major problems for you as well. I used a technique similar to this one to optimize an
AddAll method as well -- the add all had an O( n^3 ) loop to detect duplicates, and then at the end
did something like this:
aryObjects1.RemoveAll();
for ( int j = 0; j < aryTemp.GetSize(); j++ ) {
aryObjects.Add( aryTemp.GetAt(i) );
}
While this looks like a O(n) on the surface, my handy debugger showed me that it was actually
"growing" the array one insert at a time -- so for each insert, it allocated new memory space, copied the entire
array over, and then inserted the item. That's O(n^2)! By simply realizing that we already knew how
big the array was going to be, and statically allocating it up front, we can eliminate an order of magnitude
from the runtime (that one's for you, Doug ;))!
So by simply introducing an intermediate datastructure to perform a specific task, you can greatly improve your runtime performance, while not having to recode for a new datastructure.
add to
del.icio.us