So I'm trying to read a podded dataframe.
fetch_n_pod: Starting STARE Podding
Reading '/Users/mbauer/tmp/data/POMD/discover/202001/DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.nc4' and '/Users/mbauer/tmp/data/POMD/discover/DYAMONDv2_stare.nc' at level 4 from /Users/mbauer/tmp/data/pods
The call:
pod_root = '/Users/mbauer/tmp/data/pods'
sids_hex = ['0x0000000000000004', '0x0008000000000004', ... '0x3ff0000000000004', '0x3ff8000000000004']
tcover_tid = 2274394778256809073
podded_sdf = starepandas.read_pods(pod_root=pod_root, sids=sids_hex, tids=[tcover_tid])
STAREPandas.read_pods() -> starepandas/io/pod.py:read_pods():
path_format = '{pod_root}{delim1}{sid}'
pattern = '*'
temporal_pattern = '{pod_path}(.*)-.*'
temporal_pattern_tid_index = 0
tids_cmp = array([2274394778256809073])
for sid in sids:
pod_path = '/Users/mbauer/tmp/data/pods/0x0000000000000004'
pickles = ['/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f90200025001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.pkl.bz2',
...
'/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f906377a5001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200213_2330z.pkl.bz2']
search = '.*.*'
pods = ['/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f90200025001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.pkl.bz2',
...
'/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f906377a5001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200213_2330z.pkl.bz2']
regexp = '/Users/mbauer/tmp/data/pods/0x0000000000000004(.*)-.*'
m.groups() = ('/0x1f90200025001c71-DYAMONDv2_PE3600x1800',)
So 'pickles' correctly lists the pod files for the sid and tid, e.g. '/Users/mbauer/tmp/data/pods/0x0000000000000004/0x1f90200025001c71*'.
But 'pods = list(filter(re.compile(search).match, pickles))' raises re.error for search = '.**.*'. With pattern = '*', the template '.*{pattern}.*' expands to the repeated quantifier '**', which is not a valid regular expression:
File "/Users/mbauer/SpatioTemporal/STAREPandas/starepandas/io/pod.py", line 195, in read_pods
pods = list(filter(re.compile(search).match, pickles))
^^^^^^^^^^^^^^^^^^
File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/__init__.py", line 227, in compile
return _compile(pattern, flags)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/__init__.py", line 294, in _compile
p = _compiler.compile(pattern, flags)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/_compiler.py", line 743, in compile
p = _parser.parse(p, flags)
^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/_parser.py", line 982, in parse
p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/_parser.py", line 457, in _parse_sub
itemsappend(_parse(source, state, verbose, nested + 1,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mbauer/miniconda3/envs/stare/lib/python3.11/re/_parser.py", line 687, in _parse
raise source.error("multiple repeat",
re.error: multiple repeat at position 2
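The failure can be reproduced outside STAREPandas. This is a minimal sketch, assuming the search string is built from the '.*{pattern}.*' template with the pattern = '*' shown in the trace. The fnmatch-based variant at the end is only a guess at the intent (that 'pattern' is meant as a shell-style glob), not something the library is known to do.

```python
import fnmatch
import re

pattern = '*'  # value shown in the trace
search = '.*{pattern}.*'.format(pattern=pattern)  # assumed template
print(search)  # the '*' is substituted in verbatim

# Compiling the expanded string fails because '**' is a repeated quantifier.
try:
    re.compile(search)
    error_message = None
except re.error as exc:
    error_message = str(exc)
print(error_message)

# Hypothetical alternative: treat `pattern` as a glob and let fnmatch
# translate it into a valid regular expression.
glob_search = fnmatch.translate('*' + pattern + '*')
pickles = [
    '/Users/mbauer/tmp/data/pods/0x0000000000000004/'
    '0x1f90200025001c71-DYAMONDv2_PE3600x1800-DE.tavg_30mn.prectot.20200115_0000z.pkl.bz2',
]
pods = list(filter(re.compile(glob_search).match, pickles))
print(pods == pickles)
```

With a glob such as '*' this keeps every file, which looks like the intended default, but a caller who passes a real regular expression as 'pattern' would need the original template instead.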
Changing the line to
search = '.*{pattern}.*'.format(pattern=pattern) if pattern != "*" else '.*.*'
gets further, but then I get another error:
Traceback (most recent call last):
File "/Users/mbauer/SpatioTemporal/STAREPandas/starepandas/io/pod.py", line 220, in read_pods
tid_ = int(m.groups()[temporal_pattern_tid_index],16)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid literal for int() with base 16: '/0x1f90200025001c71-DYAMONDv2_PE3600x1800'
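The captured group explains the ValueError. The temporal pattern does not consume the path separator after the pod path, and the greedy '(.*)' runs up to the last '-' in the file name, which is the one in 'PE3600x1800-DE'. The sketch below reproduces that with the values from the trace and then tries a tighter pattern. The tighter pattern is an assumption on my part, not the library's code: it assumes '/' as the delimiter and that the tid is always a '0x...' hex token followed by '-'.

```python
import re

pod_path = '/Users/mbauer/tmp/data/pods/0x0000000000000004'
pod = (pod_path + '/0x1f90200025001c71-DYAMONDv2_PE3600x1800-DE'
       '.tavg_30mn.prectot.20200115_0000z.pkl.bz2')

# Pattern as shown in the trace: no separator, greedy capture group.
current = re.compile('{pod_path}(.*)-.*'.format(pod_path=pod_path))
captured = current.match(pod).groups()[0]
print(captured)

try:
    int(captured, 16)
    failed = False
except ValueError:
    failed = True
print(failed)

# Hypothetical alternative: consume the separator and capture only the
# hex token that precedes the first '-'.
candidate = re.compile(re.escape(pod_path) + r'/(0x[0-9a-fA-F]+)-.*')
tid = int(candidate.match(pod).groups()[0], 16)
print(hex(tid))
```

A non-greedy group alone, '(.*?)', would stop at the first '-' but would still capture the leading '/', so the separator has to be handled as well.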