Making use of available parallelism within pyfive #208

@bnlawrence

Other I/O libraries use internal parallelism to speed up access for a given data selection: each chunk can be read and decompressed independently. Unfortunately there is no single right way to do this. We can't impose (for example) asyncio on parent processes we don't know much about, and if we don't use asyncio, we have the GIL to worry about.

A possible strategy is to exploit upstream parallelism. For example, if pyfive is reading a file object which is backed by fsspec, there is considerable upstream asyncio available; if not, we might want to use internal threading (which doesn't play badly with a calling process).
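As a rough illustration of the internal-threading fallback, here is a minimal sketch using only the standard library. It assumes zlib-compressed chunks and a precomputed list of `(offset, size)` pairs. The names `read_chunks_threaded` and `_read_and_decompress` are hypothetical and not part of pyfive. In CPython, blocking file reads and `zlib.decompress` generally release the GIL, which is why threads can help here at all.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def _read_and_decompress(path, offset, size):
    # Each worker opens its own handle, so no seek position is shared
    # between threads.
    with open(path, "rb") as fh:
        fh.seek(offset)
        raw = fh.read(size)
    # Assumption: chunks are zlib (deflate) compressed, with no other filters.
    return zlib.decompress(raw)


def read_chunks_threaded(path, chunk_index, max_workers=4):
    """Read and decompress chunks concurrently.

    chunk_index is a list of (offset, size) pairs; the result list is in
    the same order as chunk_index.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(_read_and_decompress, path, offset, size)
            for offset, size in chunk_index
        ]
        # Collect in submission order so the caller can place chunks by index.
        return [future.result() for future in futures]
```

The pool is created and torn down inside the call, so nothing is imposed on the calling process beyond a few short-lived threads.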

To that end, we should try a dispatch strategy which investigates what kind of file is in play.
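One way such a dispatch could look is sketched below. This is an assumption-laden sketch, not a proposal for the actual pyfive API: the function name `choose_read_strategy` and the strategy labels are hypothetical. It relies on duck typing rather than importing fsspec, assuming that fsspec file objects expose their filesystem as `.fs` and that asynchronous fsspec filesystems set `async_impl = True`.

```python
def choose_read_strategy(fh):
    """Pick a chunk-reading strategy from the kind of file object in play.

    Returns one of two hypothetical labels:
      "upstream_async" - the file sits on an async-capable fsspec filesystem
      "threads"        - anything else; use an internal thread pool
    """
    # Assumption: fsspec file objects carry their filesystem on `.fs`.
    fs = getattr(fh, "fs", None)
    # Assumption: async fsspec filesystems advertise `async_impl = True`.
    if fs is not None and getattr(fs, "async_impl", False):
        return "upstream_async"
    # Plain files, in-memory buffers and synchronous filesystems land here.
    return "threads"
```

Checking attributes instead of types keeps fsspec an optional dependency, at the cost of trusting any object that happens to carry the same attribute names.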
