add code and documentation to grant schemas to groups#50

Open

lhoesly wants to merge 3 commits intomainfrom

Contributor

lhoesly commented Feb 25, 2026

Python code to manage schema-based / group-based grants for data controls


          add code and documentation to grant schemas to groups

dc95324

lhoesly requested a review from Copilot

February 25, 2026 19:38

Copilot started reviewing on behalf of lhoesly

February 25, 2026 19:39

Copilot AI reviewed

View reviewed changes

Copilot AI left a comment

Pull request overview

This PR introduces a Python automation script and documentation for managing Redshift schema-based access grants to groups. The script automates granting SELECT permissions (and optionally USAGE) on database schemas to specified groups, including both existing and future tables/views. This addresses the complexity of Redshift's hierarchical permission model and the limitation that new tables don't automatically inherit schema-level permissions.

Changes:

New Python script (automate_schema_grants.py) that generates and executes SQL GRANT statements based on a configuration mapping schemas to groups
Comprehensive README documentation explaining the purpose, configuration, and usage of the schema grants automation
Support for configurable options via environment variables (dry run mode, USAGE grants, future table grants)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

File	Description
python/grant_schemas/automate_schema_grants.py	Main automation script that generates SQL GRANT statements for schemas and groups, with support for existing and future tables
python/grant_schemas/README.md	Comprehensive documentation covering overview, configuration, usage examples, and Redshift-specific considerations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

python/grant_schemas/automate_schema_grants.py

+                      if schema and groups:
+                          mapping[schema] = {
+                              "schema": schema,
+                              "groups": tuple(groups),

Copilot AI Feb 25, 2026

Converting groups from a list to a tuple is unnecessary here. The groups are only used for iteration in the SQL generation and converting to a tuple doesn't provide any functional benefit. This could be simplified by keeping groups as a list, which would be more consistent with the input format and the table_creators field which remains a list.

Suggested change

      
                            "groups": tuple(groups),
          
                            "groups": groups,

Copilot uses AI. Check for mistakes.

mayamkay Mar 3, 2026

I feel like 4 lines to say drop the tuple is a bit much

python/grant_schemas/automate_schema_grants.py

+              def main(database, dry_run=True, grant_usage=False, grant_future=True):
+                  grant_commands = []
+                  schema_grants = get_schema_grants_config()

Copilot AI Feb 25, 2026

The script does not validate that SCHEMA_GRANTS_CONFIG contains at least one valid configuration entry before proceeding. If the configuration is empty or all entries are invalid (missing schema or groups), the script will generate an empty query string and potentially execute it, which could be confusing. Consider adding a check after get_schema_grants_config() to ensure there is at least one valid schema configuration, and raise a clear error if not.

Suggested change

      
                schema_grants = get_schema_grants_config()
          
                schema_grants = get_schema_grants_config()
          
                if not schema_grants:
          
                    LOG.error(
          
                        "No valid schema grant configurations found. "
          
                        "SCHEMA_GRANTS_CONFIG must contain at least one entry with a "
          
                        "'schema_name' and a non-empty 'groups' list."
          
                    )
          
                    raise ValueError(
          
                        "Invalid configuration: SCHEMA_GRANTS_CONFIG must contain at least "
          
                        "one schema with associated groups."
          
                    )

Copilot uses AI. Check for mistakes.

mayamkay Mar 3, 2026

Not a bad idea but probably not necessary

python/grant_schemas/automate_schema_grants.py Outdated

+                      groups = schema_grants[schema_name]["groups"]
+                      if grant_usage:
+                          usage_command = f"GRANT USAGE ON SCHEMA {schema} TO GROUP {', GROUP '.join(groups)};"

Copilot AI Feb 25, 2026

SQL injection vulnerability: The schema and group names are directly interpolated into SQL commands without validation or sanitization. Malicious or malformed schema/group names could execute arbitrary SQL. Add input validation to ensure schema and group names only contain alphanumeric characters, underscores, and optionally hyphens. Use a whitelist pattern like ^[a-zA-Z0-9_]+$ to validate these identifiers before constructing SQL statements.

Copilot uses AI. Check for mistakes.

python/grant_schemas/automate_schema_grants.py Outdated

+                      # Grant on existing tables and views
+                      # Note: In Redshift, "ALL TABLES" includes tables, views, and external tables
+                      select_command = f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO GROUP {', GROUP '.join(groups)};"

Copilot AI Feb 25, 2026

SQL injection vulnerability: The schema and group names are directly interpolated into SQL commands without validation or sanitization. Malicious or malformed schema/group names could execute arbitrary SQL. Add input validation to ensure schema and group names only contain alphanumeric characters, underscores, and optionally hyphens.

Copilot uses AI. Check for mistakes.

python/grant_schemas/automate_schema_grants.py Outdated

+                              # Grant for each specified table creator
+                              for creator in table_creators:
+                                  for group in groups:
+                                      future_command = f"ALTER DEFAULT PRIVILEGES FOR USER {creator} IN SCHEMA {schema} GRANT SELECT ON TABLES TO GROUP {group};"

Copilot AI Feb 25, 2026

SQL injection vulnerability: The creator usernames and schema/group names are directly interpolated into SQL commands without validation or sanitization. Malicious usernames could execute arbitrary SQL. Add input validation to ensure usernames, schema names, and group names only contain alphanumeric characters, underscores, and optionally hyphens.

Copilot uses AI. Check for mistakes.

python/grant_schemas/automate_schema_grants.py

Comment on lines +127 to +128

		future = civis.io.query_civis(query, database=database, hidden=False)
		LOG.info(future.result())

Copilot AI Feb 25, 2026

No error handling for the query execution. If the SQL execution fails (due to insufficient permissions, invalid schema names, or database connectivity issues), the script will crash without providing actionable feedback. Consider wrapping the query execution in a try-except block and logging specific error information to help users troubleshoot issues.

Suggested change

      
                    future = civis.io.query_civis(query, database=database, hidden=False)
          
                    LOG.info(future.result())
          
                    try:
          
                        future = civis.io.query_civis(query, database=database, hidden=False)
          
                        result = future.result()
          
                        LOG.info(result)
          
                    except Exception as exc:
          
                        LOG.error(
          
                            "Failed to execute grant commands on database '%s': %s",
          
                            database,
          
                            exc,
          
                            exc_info=True,
          
                        )
          
                        raise

Copilot uses AI. Check for mistakes.

python/grant_schemas/automate_schema_grants.py

Comment on lines +72 to +73

		mapping[schema] = {
		"schema": schema,

Copilot AI Feb 25, 2026

The 'schema' field in the returned dictionary duplicates the 'schema_name' key. Since the dictionary is keyed by schema name, storing 'schema': schema_name is redundant. Consider removing this field and using the dictionary key directly when accessing the schema name.

Copilot uses AI. Check for mistakes.

python/grant_schemas/automate_schema_grants.py Outdated

Comment on lines +131 to +136

+              if __name__ == "__main__":
+                  # Different Platform/cloud environments use slightly different formats for Boolean parameters;
+                  # This provides some assurance that "truthy" values are assigned properly.
+                  DRY_RUN_PARAM = strtobool(str(os.environ.get('DRY_RUN', 'True')))
+                  GRANT_USAGE = strtobool(str(os.environ.get('GRANT_USAGE', 'False')))
+                  GRANT_FUTURE = strtobool(str(os.environ.get('GRANT_FUTURE', 'True')))

Copilot AI Feb 25, 2026

The strtobool function will raise a ValueError if the environment variable contains an invalid boolean string (e.g., 'invalid', 'maybe'). This error will be unhelpful to users who may not know what valid boolean strings are. Consider wrapping these conversions in a try-except block with a clearer error message that indicates which environment variable has an invalid value and what valid values are (e.g., 'true', 'false', '1', '0', 'yes', 'no').

Suggested change

      
            if __name__ == "__main__":
          
                # Different Platform/cloud environments use slightly different formats for Boolean parameters;
          
                # This provides some assurance that "truthy" values are assigned properly.
          
                DRY_RUN_PARAM = strtobool(str(os.environ.get('DRY_RUN', 'True')))
          
                GRANT_USAGE = strtobool(str(os.environ.get('GRANT_USAGE', 'False')))
          
                GRANT_FUTURE = strtobool(str(os.environ.get('GRANT_FUTURE', 'True')))
          
            def _get_bool_env(var_name, default):
          
                """
          
                Read a boolean environment variable using distutils.util.strtobool and provide
          
                a clear error message if the value is invalid.
          
                """
          
                raw_value = os.environ.get(var_name, default)
          
                try:
          
                    # strtobool returns 0 or 1; convert to a proper bool.
          
                    return bool(strtobool(str(raw_value)))
          
                except ValueError as exc:
          
                    raise ValueError(
          
                        f"Invalid value for environment variable {var_name}: {raw_value!r}. "
          
                        "Expected one of 'y', 'n', 'yes', 'no', 't', 'f', 'true', 'false', "
          
                        "'on', 'off', '1', or '0'."
          
                    ) from exc
          
            if __name__ == "__main__":
          
                # Different Platform/cloud environments use slightly different formats for Boolean parameters;
          
                # This provides some assurance that "truthy" values are assigned properly.
          
                DRY_RUN_PARAM = _get_bool_env('DRY_RUN', 'True')
          
                GRANT_USAGE = _get_bool_env('GRANT_USAGE', 'False')
          
                GRANT_FUTURE = _get_bool_env('GRANT_FUTURE', 'True')

Copilot uses AI. Check for mistakes.

python/grant_schemas/automate_schema_grants.py Outdated

+                              # No table_creators specified - grant for the current user running the script
+                              # This will only apply to tables created by this user!
+                              for group in groups:
+                                  future_command = f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT SELECT ON TABLES TO GROUP {group};"

Copilot AI Feb 25, 2026

SQL injection vulnerability: The schema and group names are directly interpolated into SQL commands without validation or sanitization. Add input validation to ensure schema names and group names only contain alphanumeric characters, underscores, and optionally hyphens.

Copilot uses AI. Check for mistakes.

python/grant_schemas/automate_schema_grants.py

Comment on lines +24 to +25

		from distutils.util import strtobool

Copilot AI Feb 25, 2026

The distutils module is deprecated and will be removed in Python 3.12. Consider using a more modern alternative such as checking if the string is in a list of accepted true values (e.g., ['true', '1', 'yes']), or using a library like python-dotenv that handles environment variable parsing.

Suggested change

      
            from distutils.util import strtobool
          
            def strtobool(val):
          
                """
          
                Convert a string representation of truth to True or False.
          
                Accepted true values are: 'y', 'yes', 't', 'true', 'on', '1'.
          
                Accepted false values are: 'n', 'no', 'f', 'false', 'off', '0'.
          
                Raises ValueError if 'val' is anything else.
          
                """
          
                val = str(val).strip().lower()
          
                if val in ("y", "yes", "t", "true", "on", "1"):
          
                    return True
          
                if val in ("n", "no", "f", "false", "off", "0"):
          
                    return False
          
                raise ValueError(f"invalid truth value {val!r}")

Copilot uses AI. Check for mistakes.

lhoesly and others added 2 commits

February 25, 2026 15:06


          flake8

ea4adc1


          linting

0a0ff9a

mayamkay approved these changes

View reviewed changes

mayamkay left a comment

I think some of the copilot suggestions are reasonable but I don't think any of them are totally necessary so if you'd rather just merge as is I'm fine with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet