The current way PDL allocates memory for a new piddle
is via the SvGROW perl API call. The problem with this
is that 'Out of memory!' is a fatal error for the perl interpreter.
This makes it impossible to respond to memory limits
in any smart way since the first failure will exit perl itself
(kind of like the end of the universe as far as program
execution goes). If the constructor used die instead,
then it would be possible to catch a problem and retry
with a different size piddle.
An additional point: a lot of times the memory is
there, just not available as a contiguous memory
region. It would be very nice if it were possible to
use dataflow handling to chain together smaller
allocations to create an array of the needed size.
e.g.:
On strawberry perl, adding new piddles in a loop,
I can get up to 1973MB if I do it a MB at a time.
With 10MB at a time, I get to 1980MB.
With 100MB at a time, I get to 1400MB.
With 200MB at a time, I get to 1000MB.
With 300MB at a time, I get to 900MB.
With 400MB at a time, I get to 800MB...
So you can see the ability to use smaller pieces
of memory to make a new piddle could more
than double the available space. Non-lethality
of allocations would also keep the temporaries
generated along the way from killing your PDL session.
Test script attached to ticket...
Another way to respond to lethally large memory would be to automatically go to memory mapping. This would require memory mapping to work on systems other than Linux/Unix, but that's doable, as I've mentioned before. Then I wonder how easily we would be able to use the OS's fragmented file handling for growing piddles on the disk?
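One sketch of that fallback (POSIX-only, and every name here is illustrative rather than anything in the PDL source): if the ordinary allocation fails, the data block could come from a file-backed mapping instead of the heap.

#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Try an ordinary allocation first; on failure, fall back to a
 * file-backed mmap so the data lives in a temporary file rather than
 * the process heap.  A win32 port would need CreateFileMapping and
 * MapViewOfFile instead of mmap. */
void *alloc_or_mmap(size_t nbytes, int *used_mmap)
{
    void *p = malloc(nbytes);
    if (p != NULL) { *used_mmap = 0; return p; }

    char path[] = "/tmp/pdl_data_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return NULL;
    unlink(path);                            /* file goes away with the mapping */
    if (ftruncate(fd, (off_t)nbytes) != 0) { close(fd); return NULL; }

    p = mmap(NULL, nbytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                               /* the mapping keeps the file alive */
    if (p == MAP_FAILED) return NULL;
    *used_mmap = 1;                          /* caller must munmap(), not free() */
    return p;
}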
I think it is more important to fix the lethal failure problem than the not-having-enough-memory problem. If perl/perldl/pdl2 doesn't exit, I can easily adjust my processing to work in smaller pieces. The annoyance is when I am doing some quick work and just slurp in a big piddle to process, and the OOM kills *everything*.
From pdl-porters discussion starting here:
http://mailman.jach.hawaii.edu/pipermail//pdl-porters/2010-July/003491.html
it appears that using a pure virtual piddle for computation is not
so easy, since PP code calls make_physical, which in turn calls
pdl_make_physical, which can then call pdl_allocdata, which ends up
calling the dreaded pdl_grow and *kaboom*, out-of-memory failure.
Maybe there is a way to do the memory allocation ourselves so we can
catch any NULL returns from malloc and handle them with die rather
than the builtin exit of perl. I'll add this to the known_problems since
it really will bite you if you try to work with large data sets.
I've entered this as a Feature Request but I would almost classify this
as a bug, since the perl interpreter having enough memory to function
and having enough contiguous memory to allocate a huge piddle
are two very different things.
A possible work-around (hack) to fix the failure might be
to add an extra call to malloc() in the pdl memory allocation
routine (e.g., pdl_grow) for the amount of memory you want
to get from perl (maybe rounded up a bit). If the malloc
succeeds, free the memory and allow the perl call to
proceed. If the malloc returns NULL, then return the
appropriate value or die as required (at least that would
be catchable by eval and the interpreter would not exit).
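A minimal sketch of that hack (the helper name and context are made up; only the perl API calls are real):

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <stdlib.h>

/* Probe the C library for the requested number of bytes first, so an
 * impossible request raises a catchable Perl exception (croak) instead
 * of tripping perl's fatal "Out of memory!" handler when SvGROW goes
 * to allocate.  sv is assumed to be the piddle's data SV (already a PV). */
static char *probe_then_grow(pTHX_ SV *sv, STRLEN nbytes)
{
    void *probe = malloc(nbytes ? nbytes : 1);

    if (probe == NULL)
        croak("PDL: unable to allocate %lu bytes for piddle data",
              (unsigned long)nbytes);
    free(probe);

    /* The probe succeeded, so the real allocation will most likely
     * succeed too (not guaranteed: the perl malloc's own overhead can
     * still push it over the edge). */
    return SvGROW(sv, nbytes);
}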
I notice that in cygwin, the 'large request' error is returned rather
than the 'out of memory' one, and it apparently is not fatal to the
pdl2 shell. It is terminal for Strawberry Perl Portable...
CYGWIN PERL:
PDL> $huge = ones(100000000)
Runtime error: PDL: Problem with assignment: Out of memory during "large" request for 1073745920 bytes, total sbrk() is 12972032 bytes.
Caught at file (eval 339), line 5, pkg main
PDL> p pdl(2,3)
[2 3]
STRAWBERRY PERL:
PDL> $huge = ones(100000000)
Out of memory!
Callback called exit at C:\local\strawberry\perl\bin/pdl2 line 29.
BEGIN failed--compilation aborted at C:\local\strawberry\perl\bin/pdl2 line 29.
E:\chm>
We can see the win32 perl has exited and I'm back to the CMD prompt.
The cygwin perl prints a warning and is good for more....
--Chris
Lower priority---this is not on the PDL-2.4.7 critical path.
Additional information on the cygwin out-of-memory (OOM) failures:
(1) It seems to happen a lot sooner than I would expect,
even with the 300MB limit for processes.
(2) I notice that the cygwin perl was configured with usemymalloc=y
while the strawberry perl has usemymalloc=n. Given other reported problems
with using the perl malloc and PDL on other systems (e.g., Solaris), I
would like to try building my own cygwin perl with all settings the
same except for the malloc to see if that helps things...
On further experimentation with cygwin 1.7 perl
and strawberry perl under windows vista, I've
determined that the smaller the largest single
memory allocation, the less overhead from the
perl malloc and the closer you can get to the
platform limits.
For example, on cygwin with a maximum
allowed memory of 1024MB, if I grow by
4MB piddles, I can allocate 1016MB before
the out-of-memory (OOM) failure. If I
allocate by 5MB piddles, then the failure
occurs at 635MB.
On strawberry perl (which is compiled with
usemymalloc=n) the overhead appears
to be less and one can get close to the
maximum allowed memory (within 2-3X
the largest allocation size) for a wider
range of pdls. For large pdl allocations,
you do run out of memory earlier but it
seems to be a smaller fraction of the
total as compared to cygwin perl (and
presumably the perl malloc).
I posed the question to the Perl Monks:
http://perlmonks.org/index.pl?node_id=880280
and the gist of the discussion is that there is
no way to prevent perl from exiting on malloc
failure.
It looks like the only real fix would be for us to
handle the pdl data allocation ourselves although
we could use perl for the other parts of the pdl.
The hooks for clean up for mmap'd pdls seem like
they would work for our own allocations (with
some tweaking, perhaps).
The difficult part will be verifying that the new
code works and making sure that performance
is not adversely affected.
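Concretely, each pdl would need to remember how its data block was obtained so that the cleanup code can release it the right way; roughly along these lines (the flag names and struct layout are purely illustrative, not the actual PDL internals):

#include <stdlib.h>
#include <sys/mman.h>

/* Illustrative only; not the real struct pdl.  The point is that the
 * cleanup hook that mmap'd pdls already need (munmap instead of letting
 * perl free the SV buffer) generalizes to data we malloc'd ourselves. */
enum pdl_alloc_kind { PDL_DATA_PERL, PDL_DATA_MALLOC, PDL_DATA_MMAP };

typedef struct {
    void  *data;
    size_t nbytes;
    enum pdl_alloc_kind kind;
} toy_pdl;

static void toy_pdl_free_data(toy_pdl *p)
{
    switch (p->kind) {
    case PDL_DATA_MALLOC: free(p->data);              break;
    case PDL_DATA_MMAP:   munmap(p->data, p->nbytes); break;
    case PDL_DATA_PERL:   /* owned by the SV; perl frees it */ break;
    }
    p->data = NULL;
}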
Looking up some perlapi docs, I came across the sv_usepvn_flags
routine, which allows an SV's string data to live in a buffer
allocated outside of the SV. If the new string is already NUL terminated, then
no realloc will happen so things should work. This could allow us
to catch failed malloc operations and handle them less fatally than
the perl5 default.
sv_usepvn_flags
Tells an SV to use "ptr" to find its string value. Normally
the string is stored inside the SV but sv_usepvn allows the SV
to use an outside string. The "ptr" should point to memory
that was allocated by "malloc". The string length, "len", must
be supplied. By default this function will realloc (i.e. move)
the memory pointed to by "ptr", so that pointer should not be
freed or used by the programmer after giving it to sv_usepvn,
and neither should any pointers from "behind" that pointer
(e.g. ptr + 1) be used.
If "flags" & SV_SMAGIC is true, will call SvSETMAGIC. If
"flags" & SV_HAS_TRAILING_NUL is true, then "ptr[len]" must be
NUL, and the realloc will be skipped. (i.e. the buffer is
actually at least 1 byte longer than "len", and already meets
the requirements for storing in "SvPVX")
void sv_usepvn_flags(SV* sv, char* ptr, STRLEN len, U32 flags)
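A minimal sketch of how that could be used for piddle data (the helper name pdl_new_datasv is made up; note also that when perl is built with usemymalloc=y the buffer presumably needs to come from perl's own malloc so that the eventual free matches):

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <stdlib.h>
#include <string.h>

/* Allocate the piddle data block ourselves so that a failure is
 * catchable, then hand ownership of the buffer to an SV.  One extra
 * byte keeps the buffer NUL terminated so that SV_HAS_TRAILING_NUL
 * applies and the realloc inside sv_usepvn_flags is skipped. */
static SV *pdl_new_datasv(pTHX_ STRLEN nbytes)
{
    SV *sv;
    char *data = (char *)malloc(nbytes + 1);

    if (data == NULL)
        croak("PDL: unable to allocate %lu bytes for piddle data",
              (unsigned long)nbytes);
    memset(data, 0, nbytes + 1);

    sv = newSV(0);
    sv_usepvn_flags(sv, data, nbytes, SV_HAS_TRAILING_NUL);
    /* From here on the SV owns 'data' and will free it itself. */
    return sv;
}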
In addition to handling the malloc for PDL data ourselves,
we need to check the return values for the mallocs we use
to create the pdl, threads, and transform structures. It would
be useful to have a way to test code against failing mallocs.
Maybe we could have a wrapper that fakes things?
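For the testing part even a trivial wrapper would do, something like the following (the PDL_FAIL_MALLOC_AT environment variable is hypothetical), which lets a test force the Nth allocation to return NULL so the error paths actually get exercised:

#include <stdlib.h>

/* Test-only wrapper: route the core's allocations through pdl_malloc()
 * and let an environment variable force the Nth call to fail,
 * simulating an out-of-memory condition on demand. */
static long pdl_malloc_count = 0;

void *pdl_malloc(size_t nbytes)
{
    const char *fail_at = getenv("PDL_FAIL_MALLOC_AT");

    pdl_malloc_count++;
    if (fail_at != NULL && pdl_malloc_count == atol(fail_at))
        return NULL;
    return malloc(nbytes);
}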
It may be cleaner to implement a pdl_safe_sv_grow version of
SvGROW since it appears that everywhere the PDL data
is actually allocated goes via the SvGROW perl API call.
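A sketch of what such a drop-in replacement could look like, doing the allocation ourselves and attaching the buffer with sv_usepvn_flags as above (the function is a proposal, not existing PDL code):

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <stdlib.h>
#include <string.h>

/* Proposed replacement for the SvGROW calls in the PDL core: grow the
 * data buffer with our own malloc so that an allocation failure croaks
 * (catchable by eval) instead of triggering perl's fatal exit.
 * Caveat: once the SV owns the buffer it will free it with perl's
 * allocator, so with usemymalloc=y the buffer would really need to
 * come from perl's malloc as well. */
char *pdl_safe_sv_grow(pTHX_ SV *sv, STRLEN newlen)
{
    char *oldbuf, *newbuf;
    STRLEN oldcur;

    SvUPGRADE(sv, SVt_PV);
    if (SvLEN(sv) >= newlen)
        return SvPVX(sv);

    oldbuf = SvPOK(sv) ? SvPVX(sv) : NULL;
    oldcur = oldbuf ? SvCUR(sv) : 0;

    newbuf = (char *)malloc(newlen + 1);
    if (newbuf == NULL)
        croak("PDL: cannot grow piddle data to %lu bytes",
              (unsigned long)newlen);
    if (oldbuf)
        memcpy(newbuf, oldbuf, oldcur);      /* preserve existing data   */
    newbuf[newlen] = '\0';                   /* for SV_HAS_TRAILING_NUL  */

    sv_usepvn_flags(sv, newbuf, newlen, SV_HAS_TRAILING_NUL);
    SvCUR_set(sv, oldcur);                   /* SvGROW leaves SvCUR alone */
    return SvPVX(sv);
}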
Increasing the Priority as this is a perl crash issue that I
would like to see addressed now that some understanding
of how to proceed has been laid out.
malloc failures now throw Perl exceptions