Consolidate completion check #288

@TimidRobot

Description

Problem

There are multiple completion checks (individual fetch scripts, shared.check_completion_file_exists). Not all of them support --force.

Description

I expect a single shared function with appropriate arguments can satisfy all completion checks:

  • probably two arguments:
    • args - argparse Namespace
    • files - dictionary: {file_path: minimum lines}
  • Shouldn't check without --enable-save
  • No need to parse the file with csv to count lines
  • --force must be supported by all scripts that use the completion check:
        parser.add_argument(
            "--force",
            action="store_true",
            help="Generate new output files even if they already exist",
        )
    • All scripts that support --force should use the same argument stanza, and arguments should be declared in the normal order
  • args.quarter may need to be added to scripts that use it
  • Recommend a minimum of two PRs:
    1. for scripts that use shared.check_completion_file_exists
    2. for scripts that have their own completion check

Additional context

I worked on this briefly before giving up due to the scope of changes, different priorities, etc. The following function should be a good starting place:

import csv
import os

# Assumed import: QuantifyingException lives in the project's shared code
from shared import QuantifyingException


def check_for_completion(args, files):
    """
    Check if files exist and have at least the specified number of lines (if
    specified). Can be overridden on the command line with the --force option.
    If files exist and have the minimum number of specified lines, the script
    exits early by raising a QuantifyingException with an exit status code
    of 0.
    """
    if not args.enable_save or args.force:
        return
    all_files_exist = True
    all_files_complete = True
    for path, minimum_lines in files.items():
        if not os.path.exists(path):
            all_files_exist = False
            print(f"{path} does not exist")
        elif minimum_lines is not None and minimum_lines > 0:
            with open(path, "r", encoding="utf-8") as file_obj:
                # Parsing with csv is not strictly necessary (see above);
                # a plain line count would also work
                reader = csv.DictReader(file_obj, dialect="unix")
                if len(list(reader)) < minimum_lines:
                    all_files_complete = False
                    print(f"{path} has too few lines")
    if all_files_exist and all_files_complete:
        raise QuantifyingException(
            "All output files are already present and appear complete for"
            f" {args.quarter}",
            0,
        )

Implementation

  • I would be interested in implementing this feature.
