Unable to allocate memory

```
Python 3.9.20 (main, Mar 13 2025, 07:28:52)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-24)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyreadstat as prs
>>> d,m=prs.read_dta("test.dta")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyreadstat/pyreadstat.pyx", line 296, in pyreadstat.pyreadstat.read_dta
  File "pyreadstat/_readstat_parser.pyx", line 1282, in pyreadstat._readstat_parser.run_conversion
  File "pyreadstat/_readstat_parser.pyx", line 955, in pyreadstat._readstat_parser.run_readstat_parser
  File "pyreadstat/_readstat_parser.pyx", line 877, in pyreadstat._readstat_parser.check_exit_status
pyreadstat._readstat_parser.ReadstatError: Unable to allocate memory
>>>
```
From my investigation, the failure originates in the `dta_read_strls()` function in readstat_dta_read.c. At L451, memory is allocated separately for each string inside a `while` loop. Later, the allocation at L445 fails because it needs one large contiguous chunk of memory and the heap is heavily fragmented by then.

With the reproducible example https://www.dropbox.com/scl/fi/sx9cz7vjekvud3ail9ph3/test.dta?rlkey=7e5qmwl9tbuoa0967kq3uq65f&st=g3wxulnc&dl=0,
L451 (one `malloc` per string) was executed approximately 1.6 million times. After that, L445 failed to allocate 26MB of contiguous heap memory.
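
As a rough sanity check on the 26MB figure, here is a small sketch of the arithmetic. It assumes, without having confirmed it against the ReadStat source here, that the allocation at L445 grows an array of pointers by doubling its capacity. The function name `growth_requests`, the initial capacity of 100, the 8-byte pointer size, and the item count of 1,638,401 are all hypothetical values chosen for illustration; the item count was picked only to be consistent with "approximately 1.6 million".

```python
def growth_requests(n_items, initial_capacity=100, item_size=8):
    """Return the byte sizes requested each time a doubling array grows.

    All parameters are illustrative assumptions, not values read from
    ReadStat: the array is assumed to start at `initial_capacity` slots
    and to double whenever it is full and one more item must be stored.
    """
    capacity = initial_capacity
    requests = []
    # Storing n_items forces a growth step as long as capacity < n_items.
    while capacity < n_items:
        capacity *= 2
        requests.append(capacity * item_size)
    return requests


# Hypothetical item count, consistent with "approximately 1.6 million".
requests = growth_requests(1_638_401)
print(len(requests))   # number of growth steps: 15
print(requests[-1])    # size of the last request in bytes: 26214400
```

Under these assumptions the last growth step asks for a single block of about 26MB, which would match the size of the request I saw fail at L445.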





Unable to allocate memory #345

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Unable to allocate memory #345

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions