I've written an awk script that shouldn't, in theory, take up much memory, but in practice it crashes after running out of virtual memory on my server (around 800GB). It stores about 3200 strings in an associative array, but that part doesn't seem to take much memory. It's the subsequent accessing of the array elements that causes the memory footprint to soar. The following script illustrates this behaviour:
BEGIN { idx = 1 }
# Store the first 800 words
NR < 801 { word[idx++] = $1 }
# Now test whether accessing a stored array value increases
# the memory load
NR > 800 {
    for (i = 1; i < 800; i++)
        printf("%d\t%d\t%s\n", NR, i, word[i]);
}
If one calls this 'test.awk' and runs it on the built-in dictionary (awk -f test.awk /usr/share/dict/words), all it does is store the first 800 words of the dictionary file in a simple array, and this doesn't take much memory. But the second part just keeps printing the damn things over and over again, and this starts seriously running up the memory requirements. I watch the process grow in memory using Activity Monitor and don't get it. In contrast, if you replace "word[i]" in the printf() statement with "duh" (a constant string), the memory profile is perfectly flat over time. So it is something about accessing the array element that is costing memory (using malloc()s).
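One way to narrow this down might be a variant that reads each stored element but never passes it to printf(). This is only a sketch: the probe function name, the total variable, and the seq-generated input are assumptions made for illustration and are not part of the original script.

```shell
# Hypothetical diagnostic variant of test.awk (names are illustrative only).
# It performs the same array lookups as the original but replaces printf()
# with a length() call, so the lookup is exercised without any formatting.
probe() {
    awk '
    BEGIN { idx = 1 }
    # Store the first 800 words, as in the original script
    NR < 801 { word[idx++] = $1 }
    # Read each stored element for every later record, but do not print it
    NR > 800 {
        for (i = 1; i < 800; i++) {
            w = word[i]            # array lookup only
            total += length(w)     # use the value so the lookup is not a no-op
        }
    }
    END { print total }
    '
}

# Small generated input so the run does not depend on the dictionary file
seq 1 1000 | probe
```

Running `probe < /usr/share/dict/words` while watching the process would be the comparable test. If memory stays flat there, the growth would seem to be tied to printf() with the array element rather than to the lookup itself; if it still grows, the lookup is the more likely suspect.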
Can anyone explain this to me? It's ruining an otherwise reasonable script and doing my head in.
Thanks, in advance.