Making use of available parallelism within pyfive

Other I/O libraries use internal parallelism to speed up access for a given data selection: each chunk can be read and decompressed independently. Unfortunately there is no single right way to do this. We can't impose (for example) asyncio on parent processes we don't know much about, and if we don't use asyncio, we have the GIL to worry about.
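As a rough illustration of per-chunk independence, here is a minimal sketch that decompresses a list of already-read chunks in a thread pool. It is not pyfive code: `decompress_chunks` and `max_workers` are hypothetical names, and zlib stands in for whatever filter pipeline a real dataset uses. Threads can help here only to the extent that the decompressor releases the GIL, which CPython's `zlib` does for the bulk of its work as far as I know.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def decompress_chunks(raw_chunks, max_workers=4):
    """Decompress independent zlib-compressed chunks in a thread pool.

    Hypothetical helper for illustration only. Results come back in
    the same order as the input chunks.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so chunk i maps to result i.
        return list(pool.map(zlib.decompress, raw_chunks))


# Usage: build some fake compressed chunks and round-trip them.
payloads = [bytes([i]) * 1000 for i in range(8)]
raw = [zlib.compress(p) for p in payloads]
restored = decompress_chunks(raw)
```

A pure-Python filter would gain little from this, which is part of why there is no one right answer.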

A possible strategy is to exploit upstream parallelism. For example, if pyfive is reading a file object that is backed by fsspec, there is considerable upstream asyncio available; if not, we might want to use internal threading (which doesn't play badly with a calling process).

To that end, we should try a dispatch strategy that inspects what kind of file object is in play.
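One way such a dispatch might look, as a sketch under stated assumptions rather than a proposal for the actual API: `choose_strategy` and the strategy labels are hypothetical, and the check relies on duck typing (fsspec buffered files expose their filesystem as `.fs`, and async-capable filesystems set `async_impl = True`) so that fsspec does not have to be imported.

```python
import io


def choose_strategy(fh):
    """Pick a parallelism strategy from the kind of file object given.

    Hypothetical helper: returns "upstream_async" when the file looks
    like it sits on an async-capable fsspec filesystem, otherwise
    "threads" for internal threading.
    """
    fs = getattr(fh, "fs", None)
    if fs is not None and getattr(fs, "async_impl", False):
        # Assumption: the filesystem can fetch many byte ranges
        # concurrently on our behalf.
        return "upstream_async"
    # Plain local files, BytesIO, and anything else we don't recognise.
    return "threads"


# Usage: an in-memory file has no fsspec filesystem behind it.
strategy = choose_strategy(io.BytesIO(b""))
```

Keeping the decision in one function means the reading code only needs to branch on the returned label, and new file types can be recognised later without touching the readers.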

