
Pod Reading #157

@mbauer288


So I'm trying to read a podded dataframe.

fetch_n_pod: Starting STARE Podding
    Reading '/Users/mbauer/tmp/data/POMD/discover/202001/DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.nc4' and '/Users/mbauer/tmp/data/POMD/discover/DYAMONDv2_stare.nc' at level 4 from /Users/mbauer/tmp/data/pods
    
    The call:
        pod_root   = '/Users/mbauer/tmp/data/pods'
        sids_hex   = ['0x0000000000000004', '0x0008000000000004',  ...  '0x3ff0000000000004', '0x3ff8000000000004']
        tcover_tid = 2274394778256809073

        podded_sdf = starepandas.read_pods(pod_root=pod_root, sids=sids_hex, tids=[tcover_tid])

            STAREPandas.read_pods() -> starepandas/io/pod.py:read_pods():    
                path_format                = '{pod_root}{delim1}{sid}'
                pattern                    = '*'
                temporal_pattern           = '{pod_path}(.*)-.*'
                temporal_pattern_tid_index = 0
                tids_cmp                   = array([2274394778256809073])

                for sid in sids:
                    pod_path   = '/Users/mbauer/tmp/data/pods/0x0000000000000004'
                    pickles    = ['/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f90200025001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.pkl.bz2', 
                                  ...
                                  '/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f906377a5001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200213_2330z.pkl.bz2']
                    search     = '.*.*'
                    pods       = ['/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f90200025001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.pkl.bz2', 
                                  ...
                                  '/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f906377a5001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200213_2330z.pkl.bz2']
                    regexp     = '/Users/mbauer/tmp/data/pods/0x0000000000000004(.*)-.*'
                    m.groups() = ('/0x1f90200025001c71-DYAMONDv2_PE3600x1800',)

                    So 'pickles' correctly lists all the pod files by sid and tid; '/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f90200025001c71*'

                    But 'pods = list(filter(re.compile(search).match, pickles))' raises re.error when search = '.**.*'; I assume the repeated quantifier '**' is the cause.

                          File "/Users/mbauer/SpatioTemporal/STAREPandas/starepandas/io/pod.py", line 195, in read_pods
                            pods = list(filter(re.compile(search).match, pickles))
                                               ^^^^^^^^^^^^^^^^^^
                          File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/__init__.py", line 227, in compile
                            return _compile(pattern, flags)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^
                          File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/__init__.py", line 294, in _compile
                            p = _compiler.compile(pattern, flags)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                          File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/_compiler.py", line 743, in compile
                            p = _parser.parse(p, flags)
                                ^^^^^^^^^^^^^^^^^^^^^^^
                          File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/_parser.py", line 982, in parse
                            p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                          File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/_parser.py", line 457, in _parse_sub
                            itemsappend(_parse(source, state, verbose, nested + 1,
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                          File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/_parser.py", line 687, in _parse
                            raise source.error("multiple repeat",
                        re.error: multiple repeat at position 2
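
The failure above can be reproduced outside STAREPandas. The sketch below assumes that `search` is built by wrapping the glob-style `pattern` in `'.*'` (inferred from the line at pod.py:195 and the value `'.**.*'`, not confirmed against the source). It also shows one possible alternative, `fnmatch.translate` from the standard library, which converts a glob into a valid regular expression; this is a suggestion, not what `read_pods()` currently does.

```python
import fnmatch
import re

pattern = "*"  # glob-style default reported in the call above

# Assumption: the search string is the glob wrapped in '.*' on both sides.
search = ".*{pattern}.*".format(pattern=pattern)  # gives '.**.*'

try:
    re.compile(search)
    message = ""
except re.error as err:
    # '*' directly after '.*' is a second quantifier on the same item.
    message = str(err)

# Possible alternative: let fnmatch turn the glob into a regex first.
glob_regex = re.compile(fnmatch.translate("*" + pattern + "*"))

pod_file = (
    "/Users/mbauer/tmp/data/pods/0x0000000000000004/"
    "0x1f90200025001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.pkl.bz2"
)
matched = glob_regex.match(pod_file) is not None
```

With a translated pattern, the default `'*'` and more specific globs go through the same code path, so no special case for `'*'` is needed.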

                    Changing the line to
                        search = '.*{pattern}.*'.format(pattern=pattern) if pattern != "*" else '.*.*'
                    gets past the regex error, but then I hit a second error when the tid is parsed:
 
                        Traceback (most recent call last):
                          File "/Users/mbauer/SpatioTemporal/STAREPandas/starepandas/io/pod.py", line 220, in read_pods
                            tid_ = int(m.groups()[temporal_pattern_tid_index],16)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                        ValueError: invalid literal for int() with base 16: '/0x1f90200025001c71-DYAMONDv2_PE3600x1800'
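
The second error looks like a consequence of the greedy `(.*)` in `temporal_pattern`: the group starts at the path separator and runs to the last `-` in the file name, so it captures more than the hex tid. A minimal sketch, using the paths from this report; the tighter pattern is hypothetical and only illustrates one way to isolate the tid, it is not taken from STAREPandas.

```python
import re

pod_path = "/Users/mbauer/tmp/data/pods/0x0000000000000004"
pod_file = (
    pod_path
    + "/0x1f90200025001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.pkl.bz2"
)

# Pattern as reported: greedy group, no separator between path and tid.
greedy = re.match("{pod_path}(.*)-.*".format(pod_path=pod_path), pod_file)
captured = greedy.groups()[0]  # includes the leading '/' and part of the name

# Hypothetical tighter pattern: skip the separator, capture hex digits only.
tight = re.match(re.escape(pod_path) + r"[/\\](0x[0-9a-fA-F]+)-.*", pod_file)
tid = int(tight.group(1), 16)
```

Passing a `temporal_pattern` of this shape to `read_pods()` may be enough as a workaround, assuming the argument is forwarded unchanged.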
