The current way PDL allocates memory for a new piddle
is via the SvGROW perl API call. The problem with this
is that 'Out of memory!' is a fatal error for the perl interpreter.
This makes it impossible to respond to memory limits
in any smart way since the first failure will exit perl itself
(kind of like the end of the universe as far as program
execution goes). If the constructor used die instead,
then it would be possible to catch a problem and retry
with a different size piddle.
An additional point: a lot of times the memory is
there, just not available as a contiguous memory
region. It would be very nice if it were possible to
use dataflow handling to chain together smaller
allocations to create an array of the needed size.
e.g.:
On strawberry perl, adding new piddles in a loop,
I can get up to 1973MB if I do it a MB at a time.
With 10MB at a time, I get to 1980MB.
With 100MB at a time, I get to 1400MB.
With 200MB at a time, I get to 1000MB.
With 300MB at a time, I get to 900MB.
With 400MB at a time, I get to 800MB...
So you can see the ability to use smaller pieces
of memory to make a new piddle could more
than double the available space. Non-lethality
of allocations would also keep the temporaries
generated along the way from killing your PDL session.
Test script attached to ticket...
Another way to respond to lethally large memory would be to automatically go to memory mapping. This would require memory mapping to work on systems other than Linux/Unix, but that's doable, as I've mentioned before. Then I wonder how easily we would be able to use the OS's fragmented file handling for growing piddles on the disk?
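One sketch of that fallback (POSIX-only, and every name here is illustrative rather than anything in the PDL source): if the ordinary allocation fails, the data block could come from a file-backed mapping instead of the heap.

#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Try an ordinary allocation first; on failure, fall back to a
 * file-backed mmap so the data lives in a temporary file rather than
 * the process heap.  A win32 port would need CreateFileMapping and
 * MapViewOfFile instead of mmap. */
void *alloc_or_mmap(size_t nbytes, int *used_mmap)
{
    void *p = malloc(nbytes);
    if (p != NULL) { *used_mmap = 0; return p; }

    char path[] = "/tmp/pdl_data_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return NULL;
    unlink(path);                            /* file goes away with the mapping */
    if (ftruncate(fd, (off_t)nbytes) != 0) { close(fd); return NULL; }

    p = mmap(NULL, nbytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                               /* the mapping keeps the file alive */
    if (p == MAP_FAILED) return NULL;
    *used_mmap = 1;                          /* caller must munmap(), not free() */
    return p;
}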
I think it is more important to fix the lethal failure problem than the not-having-enough-memory problem. If perl/perldl/pdl2 doesn't exit, I can easily adjust my processing to work in smaller pieces. The annoyance is when I am doing some quick work and just slurp in a big piddle to process, and the OOM kills *everything*.
From pdl-porters discussion starting here:
http://mailman.jach.hawaii.edu/pipermail//pdl-porters/2010-July/003491.html
it appears that using a pure virtual piddle for computation is not
so easy, since PP code calls make_physical, which in turn calls
pdl_make_physical, which can then call pdl_allocdata, which ends up
calling the dreaded pdl_grow and *kaboom*, out-of-memory failure.
Maybe there is a way to do the memory allocation ourselves so we can
catch any NULL returns from malloc and handle them with die rather
than the builtin exit of perl. I'll add this to the known_problems since
it really will bite you if you try to work with large data sets.
I've entered this as a Feature Request but I would almost classify this
as a bug, since the perl interpreter having enough memory to function
and having enough contiguous memory to allocate a huge piddle
are two very different things.
A possible work-around (hack) to fix the failure might be
to add an extra call to malloc() in the pdl memory allocation
routine (e.g., pdl_grow) for the amount of memory you want
to get from perl (maybe rounded up a bit). If the malloc
succeeds, free the memory and allow the perl call to
proceed. If the malloc returns NULL, then return the
appropriate value or die as required (at least that would
be catchable by eval and the interpreter would not exit).
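A minimal sketch of that hack (the helper name and context are made up; only the perl API calls are real):

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <stdlib.h>

/* Probe the C library for the requested number of bytes first, so an
 * impossible request raises a catchable Perl exception (croak) instead
 * of tripping perl's fatal "Out of memory!" handler when SvGROW goes
 * to allocate.  sv is assumed to be the piddle's data SV (already a PV). */
static char *probe_then_grow(pTHX_ SV *sv, STRLEN nbytes)
{
    void *probe = malloc(nbytes ? nbytes : 1);

    if (probe == NULL)
        croak("PDL: unable to allocate %lu bytes for piddle data",
              (unsigned long)nbytes);
    free(probe);

    /* The probe succeeded, so the real allocation will most likely
     * succeed too (not guaranteed: the perl malloc's own overhead can
     * still push it over the edge). */
    return SvGROW(sv, nbytes);
}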
I notice that in cygwin, the 'large request' error is returned rather
than the 'out of memory' one, and it apparently is not fatal to the
pdl2 shell. It is terminal for Strawberry Perl Portable...
CYGWIN PERL:
PDL> $huge = ones(100000000)
Runtime error: PDL: Problem with assignment: Out of memory during "large" request for 1073745920 bytes, total sbrk() is 12972032 bytes.
Caught at file (eval 339), line 5, pkg main
PDL> p pdl(2,3)
[2 3]
STRAWBERRY PERL:
PDL> $huge = ones(100000000)
Out of memory!
Callback called exit at C:\local\strawberry\perl\bin/pdl2 line 29.
BEGIN failed--compilation aborted at C:\local\strawberry\perl\bin/pdl2 line 29.
E:\chm>
We can see the win32 perl has exited and I'm back to the CMD prompt.
The cygwin perl prints a warning and is good for more....
--Chris
Lower priority---this is not on the PDL-2.4.7 critical path.
Additional information on the cygwin out-of-memory (OOM) failures:
(1) It seems to happen a lot sooner than I would expect,
even with the 300MB limit for processes.
(2) I notice that the cygwin perl was configured with usemymalloc=y
while the strawberry perl has usemymalloc=n. Given other reported problems
with using the perl malloc and PDL on other systems (e.g., Solaris), I
would like to try building my own cygwin perl with all settings the
same except for the malloc to see if that helps things...
On further experimentation with cygwin 1.7 perl
and strawberry perl under windows vista, I've
determined that the smaller the largest single
memory allocation, the less overhead from the
perl malloc and the closer you can get to the
platform limits.
For example, on cygwin with a maximum
allowed memory of 1024MB, if I grow by
4MB piddles, I can allocate 1016MB before
the out-of-memory (OOM) failure. If I
allocate by 5MB piddles, then the failure
occurs at 635MB.
On strawberry perl (which is compiled with
usemymalloc=n) the overhead appears
to be less and one can get close to the
maximum allowed memory (within 2-3X
the largest allocation size) for a wider
range of pdls. For large pdl allocations,
you do run out of memory earlier but it
seems to be a smaller fraction of the
total as compared to cygwin perl (and
presumably the perl malloc).
I posed the question to the Perl Monks:
http://perlmonks.org/index.pl?node_id=880280
and the gist of the discussion is that there is
no way to prevent perl from exiting on malloc
failure.
It looks like the only real fix would be for us to
handle the pdl data allocation ourselves although
we could use perl for the other parts of the pdl.
The hooks for clean up for mmap'd pdls seem like
they would work for our own allocations (with
some tweaking, perhaps).
The difficult part will be verifying that the new
code works and making sure that performance
is not adversely affected.
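Concretely, each pdl would need to remember how its data block was obtained so that the cleanup code can release it the right way; roughly along these lines (the flag names and struct layout are purely illustrative, not the actual PDL internals):

#include <stdlib.h>
#include <sys/mman.h>

/* Illustrative only; not the real struct pdl.  The point is that the
 * cleanup hook that mmap'd pdls already need (munmap instead of letting
 * perl free the SV buffer) generalizes to data we malloc'd ourselves. */
enum pdl_alloc_kind { PDL_DATA_PERL, PDL_DATA_MALLOC, PDL_DATA_MMAP };

typedef struct {
    void  *data;
    size_t nbytes;
    enum pdl_alloc_kind kind;
} toy_pdl;

static void toy_pdl_free_data(toy_pdl *p)
{
    switch (p->kind) {
    case PDL_DATA_MALLOC: free(p->data);              break;
    case PDL_DATA_MMAP:   munmap(p->data, p->nbytes); break;
    case PDL_DATA_PERL:   /* owned by the SV; perl frees it */ break;
    }
    p->data = NULL;
}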
Looking up some perlapi docs, I came across the sv_usepvn_flags
routine, which allows an SV's string data to live in a buffer
allocated outside of the SV. If the new string is already NUL terminated, then
no realloc will happen so things should work. This could allow us
to catch failed malloc operations and handle them less fatally than
the perl5 default.
sv_usepvn_flags
Tells an SV to use "ptr" to find its string value. Normally
the string is stored inside the SV but sv_usepvn allows the SV
to use an outside string. The "ptr" should point to memory
that was allocated by "malloc". The string length, "len", must
be supplied. By default this function will realloc (i.e. move)
the memory pointed to by "ptr", so that pointer should not be
freed or used by the programmer after giving it to sv_usepvn,
and neither should any pointers from "behind" that pointer
(e.g. ptr + 1) be used.
If "flags" & SV_SMAGIC is true, will call SvSETMAGIC. If
"flags" & SV_HAS_TRAILING_NUL is true, then "ptr[len]" must be
NUL, and the realloc will be skipped. (i.e. the buffer is
actually at least 1 byte longer than "len", and already meets
the requirements for storing in "SvPVX")
void sv_usepvn_flags(SV* sv, char* ptr, STRLEN len, U32 flags)
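A minimal sketch of how that could be used for piddle data (the helper name pdl_new_datasv is made up; note also that when perl is built with usemymalloc=y the buffer presumably needs to come from perl's own malloc so that the eventual free matches):

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <stdlib.h>
#include <string.h>

/* Allocate the piddle data block ourselves so that a failure is
 * catchable, then hand ownership of the buffer to an SV.  One extra
 * byte keeps the buffer NUL terminated so that SV_HAS_TRAILING_NUL
 * applies and the realloc inside sv_usepvn_flags is skipped. */
static SV *pdl_new_datasv(pTHX_ STRLEN nbytes)
{
    SV *sv;
    char *data = (char *)malloc(nbytes + 1);

    if (data == NULL)
        croak("PDL: unable to allocate %lu bytes for piddle data",
              (unsigned long)nbytes);
    memset(data, 0, nbytes + 1);

    sv = newSV(0);
    sv_usepvn_flags(sv, data, nbytes, SV_HAS_TRAILING_NUL);
    /* From here on the SV owns 'data' and will free it itself. */
    return sv;
}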
In addition to handling the malloc for PDL data ourselves,
we need to check the return values for the mallocs we use
to create the pdl, threads, and transform structures. It would
be useful to have a way to test code against failing mallocs.
Maybe we could have a wrapper that fakes things?
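For the testing part even a trivial wrapper would do, something like the following (the PDL_FAIL_MALLOC_AT environment variable is hypothetical), which lets a test force the Nth allocation to return NULL so the error paths actually get exercised:

#include <stdlib.h>

/* Test-only wrapper: route the core's allocations through pdl_malloc()
 * and let an environment variable force the Nth call to fail,
 * simulating an out-of-memory condition on demand. */
static long pdl_malloc_count = 0;

void *pdl_malloc(size_t nbytes)
{
    const char *fail_at = getenv("PDL_FAIL_MALLOC_AT");

    pdl_malloc_count++;
    if (fail_at != NULL && pdl_malloc_count == atol(fail_at))
        return NULL;
    return malloc(nbytes);
}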
It may be cleaner to implement a pdl_safe_sv_grow version of
SvGROW since it appears that everywhere the PDL data
is actually allocated goes via the SvGROW perl API call.
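A sketch of what such a drop-in replacement could look like, doing the allocation ourselves and attaching the buffer with sv_usepvn_flags as above (the function is a proposal, not existing PDL code):

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <stdlib.h>
#include <string.h>

/* Proposed replacement for the SvGROW calls in the PDL core: grow the
 * data buffer with our own malloc so that an allocation failure croaks
 * (catchable by eval) instead of triggering perl's fatal exit.
 * Caveat: once the SV owns the buffer it will free it with perl's
 * allocator, so with usemymalloc=y the buffer would really need to
 * come from perl's malloc as well. */
char *pdl_safe_sv_grow(pTHX_ SV *sv, STRLEN newlen)
{
    char *oldbuf, *newbuf;
    STRLEN oldcur;

    SvUPGRADE(sv, SVt_PV);
    if (SvLEN(sv) >= newlen)
        return SvPVX(sv);

    oldbuf = SvPOK(sv) ? SvPVX(sv) : NULL;
    oldcur = oldbuf ? SvCUR(sv) : 0;

    newbuf = (char *)malloc(newlen + 1);
    if (newbuf == NULL)
        croak("PDL: cannot grow piddle data to %lu bytes",
              (unsigned long)newlen);
    if (oldbuf)
        memcpy(newbuf, oldbuf, oldcur);      /* preserve existing data   */
    newbuf[newlen] = '\0';                   /* for SV_HAS_TRAILING_NUL  */

    sv_usepvn_flags(sv, newbuf, newlen, SV_HAS_TRAILING_NUL);
    SvCUR_set(sv, oldcur);                   /* SvGROW leaves SvCUR alone */
    return SvPVX(sv);
}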
Increasing the Priority as this is a perl crash issue that I
would like to see addressed now that some understanding
of how to proceed has been laid out.
malloc failures now throw Perl exceptions