Wed Jan 26 18:13:40 CST 2005
Contact:
Men-Chow Chiang
chiangmc@us.ibm.com
512-838-8546
Two major bugs found in current version of AIM-7
benchmark.
1. Name conflict of temporary files.
2. The operations for each process are not executed in
truly random order.
Detail:
Bug 1. Name conflict of temporary files.
Extent: Causes failures at least on the current IBM AIX
system. Possibly affects other operating systems too,
and is more likely to affect future operating systems.
Problem: The function "aim_mktemp(char *template)",
as described by the comment
for the function,
"replace(s) the contents of the string pointed to by
template with a unique file name."
Part of the unique file name is constructed from the
process id.
The current code uses only the last 5 digits of the
return value from getpid().
But the current version of IBM AIX (IBM's UNIX)
uses 8-digit pids. Since the last
5 digits of a pid are not unique in the IBM environment,
during an AIM benchmark run the same temporary
file name can be generated by
different processes at the same time.
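The collision can be reproduced with plain arithmetic. The following is a minimal sketch, not code from AIM-7: the helper names pid_suffix() and names_collide() are hypothetical, and the two pids in the usage note are made-up 8-digit values chosen to share their last 5 digits.

```c
/* Hypothetical helper (not in AIM-7): the pid-derived part of the
 * file name, keeping only the low `digits` decimal digits.
 * digits = 5 mirrors the original code, digits = 8 mirrors the fix. */
long pid_suffix(long pid, int digits)
{
    long mod = 1;
    for (int i = 0; i < digits; i++)
        mod *= 10;              /* mod = 10^digits */
    return pid % mod;
}

/* Returns 1 when two pids yield the same pid-derived name part. */
int names_collide(long pid_a, long pid_b, int digits)
{
    return pid_suffix(pid_a, digits) == pid_suffix(pid_b, digits);
}
```

For example, names_collide(12345678, 22345678, 5) returns 1 because both pids reduce to 45678, while the same call with 8 digits returns 0. Two such processes whose counters also happen to match would generate identical temporary file names.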
Symptom:
When the number of processes created by AIM is
large, on the order of thousands,
file name conflicts occur. The AIM benchmark does
not check for this conflict.
Therefore a file just created by a process might not
be found immediately afterward,
because it might have just been deleted by another
process. The resulting failed system call
brings down the whole
benchmark execution.
Fix: Modify "void aim_mktemp(char *template)" of file
"disk1.c".
The string variable "Xs" in the following is used to
store the unique file name.
#ifdef CHIANG3
    /* fix */
    /*
     * changed from 5 pid char to 8 pid char 1/12/2005
     */
    if (counter++ == -1) {              /* initialize counter and pid */
        pid_end = getpid() % 100000000; /* use uniqueness of pid, now 8 digits */
    } else if (counter == 100000000)    /* reset; note the counter is still
                                           printed with %05d below, so values
                                           above 99999 widen the name */
        counter = 0;
    sprintf(Xs, "%08d%05d", pid_end, counter); /* write over XXXXXXXXXX, zero pad counter */
#else /* CHIANG3 */
    /* before fix */
    if (counter++ == -1) {              /* initialize counter and pid */
        pid_end = getpid() % 100000;    /* use uniqueness of pid, only need 5 digits */
    } else if (counter == 100000)       /* reset, only need 5 digits */
        counter = 0;
    sprintf(Xs, "%05d%05d", pid_end, counter); /* write over XXXXXXXXXX, zero pad counter */
#endif /* CHIANG3 */
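One consequence of the wider pid field is that the generated suffix grows from 10 to 13 characters, so whatever buffer Xs points into needs room for it. The following is a sketch only, under the assumption that a bounded formatter is wanted: format_temp_suffix() and the three macros are hypothetical names, not part of disk1.c.

```c
#include <stdio.h>

#define PID_DIGITS     8    /* digits of the pid kept by the fix */
#define COUNTER_DIGITS 5    /* digits of the per-process counter */
#define NAME_LEN       (PID_DIGITS + COUNTER_DIGITS)

/* Hypothetical helper (not in AIM-7): writes the 13-character suffix
 * into buf. Returns 0 on success, -1 if buf is too small or the
 * values do not fit their fields. */
int format_temp_suffix(char *buf, size_t buflen, long pid, long counter)
{
    if (buflen < NAME_LEN + 1)          /* need room for the NUL too */
        return -1;
    if (pid < 0 || counter < 0 || counter >= 100000)
        return -1;
    int n = snprintf(buf, buflen, "%08ld%05ld", pid % 100000000L, counter);
    return (n == NAME_LEN) ? 0 : -1;
}
```

With a 14-byte buffer, pid 12345678 and counter 42 give "1234567800042". A buffer sized for the old 10-character suffix is rejected instead of being overrun.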
Bug 2. The operations for each process are not executed
in truly random order.
Extent: Unlike bug 1, this bug is operating system
independent because the fault is
in the algorithm of AIM-7.
Problem: The AIM benchmark is structured by
creating multiple processes executing the same
workload mix, defined by "workfile" as its input.
Each process executes
operations specified in "workfile" in a random
order, so that the job mix of
the tested machine during benchmark execution at
any time
is about the same, or "time-homogeneous".
Unfortunately the algorithm for randomly selecting
the next operation, which is one
of "add_double", "add_long", "disk_cp",
"jmp_test", ..., is not truly random.
The bias is toward executing the less common types of
operations in "workfile" early in the
life of the process. The bias is exactly the
same for all processes, hence the overall
workload for the test machine behaves rather
non-time-homogeneously.
Frequently, toward the end, all the
processes execute exactly the same operation at
the same time. This phenomenon was verified
by actually tracking the order of operations executed by each
process.
The flaw in the current code is that it randomly
(truly randomly) selects the TYPE of
operation, instead of the operation itself. A simple
example easily illustrates the point:
If the workfile contains two entries of operation:
90 add_long
10 add_short
The current algorithm selects with 50%/50% probability
between the two types, add_long and add_short,
at the beginning. Since there are far fewer
add_short operations in this workload,
the process soon exhausts all the add_short
operations. Toward the end of its life the
process executes only a sequence of add_long
operations. Therefore likely sequences,
for a load of 10 operations per sequence, are:
L S L L L L L L L L
S L L L L L L L L L
L L S L L L L L L L
....
Whereas a truly random sequence of operations
should see "add_short" being interspersed
everywhere, such as
L S L L L L L L L L
S L L L L L L L S L
L L L L S L L L L L
L L L L L L L S L L
L L L S L L L L L L
...
That is, the add_short is equally likely in any one of
the 10 slots.
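The size of the bias can be estimated with a small simulation. This is a simplified model, not the AIM-7 code: it assumes a two-entry workfile, assumes that an exhausted type simply falls back to the other type (the real reduce_list() may differ), and uses a stand-in xorshift generator in place of the benchmark's own random source. All function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in deterministic generator (xorshift32) so runs repeat. */
static uint32_t rng_state = 2463534242u;

uint32_t next_rand(void)
{
    uint32_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return rng_state = x;
}

/* Type-based selection for two types: pick a TYPE with equal
 * probability; if it is used up, take the other type instead.
 * Returns the 1-based slot in which the last add_short runs. */
int last_short_slot_by_type(int n_long, int n_short)
{
    int left[2] = { n_long, n_short };  /* index 1 = add_short */
    int slot = 0, last_short = 0;

    while (left[0] + left[1] > 0) {
        int k = (int)((next_rand() >> 16) % 2);
        if (left[k] == 0)               /* type exhausted: fall back */
            k = 1 - k;
        left[k]--;
        slot++;
        if (k == 1)
            last_short = slot;
    }
    return last_short;
}

/* Average slot of the last add_short over many simulated processes. */
double mean_last_short_slot(int n_long, int n_short, int trials)
{
    double sum = 0.0;
    for (int t = 0; t < trials; t++)
        sum += last_short_slot_by_type(n_long, n_short);
    return sum / trials;
}
```

For the 90/10 workfile, mean_last_short_slot(90, 10, 2000) should come out near 20, since on average it takes 20 fair coin flips to collect 10 picks of one side. If every individual operation were equally likely in every slot, the last of the 10 add_short operations would be expected around slot 10 * 101 / 11, roughly 92 of 100.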
Symptom:
I was quite baffled to see that all my performance
statistics showed a simple,
monotonic change during AIM benchmark execution.
These include, for example,
all the virtual memory statistics generated by
"vmstat", such as frequency of interrupts and
system calls, run queue length, kernel/user time
breakdown, etc. If the operations in
the workload were truly randomly ordered for each
process, the workload should
be time-homogeneous.
I changed the code from randomly selecting the
TYPE of operation to randomly selecting
the OPERATIONS. The benchmark then worked
perfectly and the workload became very much
time-homogeneous.
Time-homogeneity is a highly desirable quality for
workload characterization and performance
tuning.
Fix: in multitask.c
First I declare at the top of multitask.c:
int work_map[MAX_LOADNUM]; /* bitmap of workload */
int current_total_work;
int current_select;
They are initialized in runtap() by
...
/* code to initialize the wlist[] */
...
current_total_work = 0;
for (j = 0; j < work; j++) { /* How many of each to sample? */
    for (i = 0; i < wlist[j]; i++) {
        work_map[current_total_work] = j;
        current_total_work++;
    }
}
Later on in the same function runtap(), when a
process needs to choose the
next operation:
#ifdef CHIANG2
    /* fix */
    current_select = randnum % current_total_work;
    k = work_map[current_select];
    /* Sampling without replacement */
    current_total_work--;
    work_map[current_select] = work_map[current_total_work];
    wlist[k]--;                     /* and take one away */
#else /* CHIANG2 */
    /* before fix */
    k = randnum % work;             /* Sampling without replacement */
    if (wlist[k] > 0)               /* if there is work to do */
        wlist[k]--;                 /* take one away */
    else {                          /* else */
        k = reduce_list(wlist);     /* remove it from list */
        wlist[k]--;                 /* and take one away */
    }
#endif /* CHIANG2 */
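The selection step in the fix can be exercised outside the benchmark. The following is a self-contained sketch of the same idea, one map slot per operation and swap-with-last removal. The function names, the MAX_OPS bound (standing in for MAX_LOADNUM) and the caller-supplied random number are assumptions for illustration, not the multitask.c code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_OPS 1024    /* illustrative bound, stands in for MAX_LOADNUM */

/* Expand per-type counts into one slot per operation, as in the
 * initialization loop in runtap(). Returns the total operation count. */
int build_work_map(const int *counts, int ntypes, int *map)
{
    int total = 0;
    for (int j = 0; j < ntypes; j++)
        for (int i = 0; i < counts[j]; i++) {
            assert(total < MAX_OPS);
            map[total++] = j;
        }
    return total;
}

/* Draw one operation without replacement: pick a random live slot,
 * move the last live slot into the hole, shrink the live region. */
int draw_operation(int *map, int *remaining, uint32_t randnum)
{
    assert(*remaining > 0);
    int pick = (int)(randnum % (uint32_t)*remaining);
    int k = map[pick];
    (*remaining)--;
    map[pick] = map[*remaining];
    return k;               /* index of the operation type to run */
}
```

Because every remaining operation occupies exactly one slot, each draw is uniform over the operations still to be done, and each type is returned exactly as many times as the workfile specifies. Removal costs O(1) per draw, so no list compaction is needed.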