Big double arrays in C HELP!!!

Fusengammu

Hi, I want to run this program:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
double y[100000000]; // that's a double array of size 10^8

y[0] = 1;
y[99999999] = 1;

printf("%f %f\n", y[0], y[99999999]);
}


I have a dual 2.5GHz G5 w/ 2.5G of RAM. I've tried changing the shared memory settings:

kern.sysv.shmmax: 536870912
kern.sysv.shmmin: 1
kern.sysv.shmmni: 4096
kern.sysv.shmseg: 4096
kern.sysv.shmall: 131072
kern.sysv.semmni: 87381
kern.sysv.semmns: 87381
kern.sysv.semmnu: 87381
kern.sysv.semmsl: 87381
kern.sysv.semume: 10

and I've tried setting ulimit -s to a large number, but it won't go above 65536.


Incidentally, the above program runs fine on a dual Opteron HP box running RedHat w/ 2G of RAM.

This is totally confusing me, help!

Thanks All!
 
System V IPC is not involved here, so you can ignore the shared memory settings.

Try "ulimit -s unlimited". Unfortunately I don't have an OS X system or kernel source in front of me at the moment to verify the actual maximum stack size.

May I ask why you want to use such a large automatic variable?

Regards,

Phil

Hm, well I want such a large range because I want to calculate the value of a Gamma function near 0, and this requires a numerical integration, and I am using an extended trapezoidal rule. Yes, for such kinds of ill-behaved integrands, it is probably better to use a variable interval-size integration routine, but I just wanted to use a quick and ready-to-use routine that I already have implemented.
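
(For context, the routine is essentially the textbook extended trapezoidal rule. A rough sketch, with a placeholder integrand f standing in for the Gamma integrand, looks something like the following; my actual code tabulates the sample values into the big y[] array first, which is where the 10^8 doubles come from.)

Code:
#include <math.h>
#include <stdio.h>

/* placeholder integrand: x^(z-1) * e^(-x) for Gamma(z) with z = 0.5;
   it blows up as x -> 0, which is what makes the integral ill-behaved */
static double f(double x)
{
    return pow(x, 0.5 - 1.0) * exp(-x);
}

/* composite (extended) trapezoidal rule on [a, b] with n intervals;
   the sum can be accumulated on the fly rather than stored */
static double trapezoid(double a, double b, long n)
{
    double h = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    long i;

    for (i = 1; i < n; i++)
        sum += f(a + i * h);

    return h * sum;
}

int main(void)
{
    /* Gamma(0.5) = sqrt(pi) ~ 1.772454; start just above the singularity */
    printf("%f\n", trapezoid(1e-8, 50.0, 10000000L));
    return 0;
}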

And besides, I have 2.5G of RAM; I can't imagine I wouldn't be able to allocate an array of 100 million doubles. Yes, it's pointless, but I should still be able to do it.

And ulimit -s unlimited as a regular user returns

-bash: ulimit: stack size: cannot modify limit: Operation not permitted



as root, the command goes through, but checking afterwards with
ulimit -s
still returns "65536"


so bash just made me a false promise.


I don't necessarily have to use such a knuckle-headed approach for my problem, but I'm also sort of curious why RedHat lets me set the ulimit to unlimited while my G5 limits me to 64MB even as root, and lies to me about it?

Actually, there are other times besides integrating that I do want one program to be able to allocate a large chunk of memory, like at least 100MB. I sometimes simulate large networks of neurons, and unfortunately my G5 refuses to do it, while the HP Linux box happily simulates away.

Thanks philjs
 
The question still stands as to why you are trying to use such a large array as an automatic variable. That puppy really needs to be allocated on the heap; if you had used malloc to allocate your array, it would have worked out fine.

Now, as to why you cannot have an arbitrarily large stack: roughly speaking, the address space for a program is set up with the stack growing in one direction and the heap growing in the opposite direction. If you put one at each end of your free space, they can grow towards each other in the middle. This works pretty well until you try to add a second thread, with a second stack, to your process: where are you going to put it? No matter where you pick, it will eventually run into either the other stack or the heap as they grow.
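
As a quick illustration (just a toy example), printing the address of a local variable next to the address of a malloc'd block shows the two living in different regions of the address space:

Code:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int on_stack = 0;                        /* automatic variable: lives on the stack */
    int *on_heap = malloc(sizeof(*on_heap)); /* dynamic allocation: lives on the heap */

    printf("stack variable at %p\n", (void *)&on_stack);
    printf("heap block at     %p\n", (void *)on_heap);

    free(on_heap);
    return 0;
}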

This cannot be solved in a flat memory space (a segmented model is a different story), so a compromise is needed. The sweet spot appears to be to limit the size of the stack, since stack usage is usually much more bounded. On the Mac this limit is set at 64MB per stack, which is generally more than enough. It is also a hard limit; to change it you have to recompile, so when you ask bash to raise the limit it can only move the soft limit within the range the hard limit allows. (You might very well want to set the stack limit lower than 64MB for many reasons.) So it is not that bash lied to you, but rather that you requested the impossible.

Finally, to your point that Linux lets you set it to unlimited: Linux only lets you have one "unlimited" stack, and as soon as you introduce a second thread, one of them becomes limited by the location of the other thread's stack. This is actually one of the bigger warts in the Linux model of threading: because the OS does not have much insight into the threads inside processes, you cannot ask thread-related questions of the OS. I like that ps on my Mac can tell me how many threads are executing in a given process, for instance; you cannot do that on Linux.

I don't know if that helps or not, but the short of it is don't use huge automatic variables. They are bad for more than just the reasons I mentioned above.
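
For completeness, one other way to keep a big fixed-size array off the stack is to give it static storage duration, so it lands in the program's data/BSS segment instead (just a sketch; whether an 800MB zero-filled array is wise is another question, and malloc remains the cleaner answer):

Code:
#include <stdio.h>

/* static storage duration: the array is not an automatic variable,
   so it does not consume stack space */
static double y[100000000];

int main(void)
{
    y[0] = 1;
    y[99999999] = 1;
    printf("%f %f\n", y[0], y[99999999]);
    return 0;
}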

Good Luck!
 
I think Lurk has covered the main issue with using the stack.

With 2.5GB RAM you should be able to address all available free RAM from a 32 bit user process. In this respect Mac OS X is more flexible than Linux for 32 bit applications, because it employs separate 4GB virtual address spaces for each user process and for the kernel, whilst Linux by default splits a single 4GB address space into 3GB for user space and 1GB for the kernel. There are pros and cons to either approach. The user address space is then further subdivided into areas for dynamically linked shared libraries and for the process itself.

I would recommend dynamically allocating your array on the heap, as Lurk mentioned. If you should hit virtual address space limitations, I would then suggest compiling your application as 64 bit, but this is only likely to become an issue if you fit more memory into your G5. My knowledge of PPC Macs is sketchy here, but you should have sufficient run-time support for a basic C application to execute, e.g. a 64 bit libc. Not all libraries are provided in both 32 and 64 bit forms, at least until the release of Leopard (10.5); however, I know that I have a 64 bit libc on my Intel Mac, and Xcode / gcc compiles 64 bit apps okay. I believe that 64 bit apps are actually better supported on PPC Macs at this point in time, though I have not spent much time investigating this area.

If you are interested and have a copy of the Mac OS X Internals book, then p 910 gives the virtual address map for a 32 bit user space process on 10.4.

Regards,

Phil

Code:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
  double *y;

  y = malloc(sizeof(*y) * 100000000); //that's a double array of size 10^8

  //always check that the allocation actually succeeded
  if (y == NULL) {
    fprintf(stderr, "malloc failed\n");
    return 1;
  }

  y[0] = 1;
  y[99999999] = 1;

  printf("%f %f\n", y[0], y[99999999]);

  //when done with the array, release the memory
  free(y);
  return 0;
}

The above code should work on any machine. As has been stated, you shouldn't try to declare such a large array on the stack. Use the heap instead, via functions like malloc and free.
 
Aha, thanks. I'm not too knowledgeable about stacks and heaps. My general impression was that stack variables are easier to use inside a function because the memory automatically gets cleaned up for you. Being a lazy programmer who just wants programs to work, I only use the heap when I need a multi-dimensional array:

int i;
double **x; // x[10][20];

x = (double**)malloc(sizeof(double*)*10);

for( i = 0; i < 10; i++ )
{
  x[i] = (double*)malloc(sizeof(double)*20);
}

because I can't do

double x[10][20];
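
(I suppose I could also allocate one contiguous block and do the index arithmetic myself; something like this, if I have it right, needs only a single malloc and a single free:)

Code:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* one contiguous block holding a 10x20 "2D" array */
    double *x = malloc(sizeof(double) * 10 * 20);
    if (x == NULL)
        return 1;

    /* element [i][j] lives at x[i*20 + j] */
    x[3*20 + 5] = 1.0;
    printf("%f\n", x[3*20 + 5]);

    free(x);  /* a single free releases the whole thing */
    return 0;
}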


I guess it's time to learn a little bit more about programming. I treat memory like a one-night stand: when I want it, I want it quick and easy. When it's all over, I want her to disappear without a trace, without leaving me with bad memories.

So malloc it is. One question, though: I use C++. I can malloc arrays of class objects, yes?

And yes, my headers are
#include <cstdio>
#include <cstdlib>
 