Google Drive Voice Manager lets users search, read, summarize, expand, and save documents in Google Drive using natural voice commands. It handles OAuth, token refresh, intent classification, routing, and Drive API interaction in a structured, fault-tolerant way.
- Search files by name
- Search inside document content
- Read and summarize a document (PDFs not supported)
- Expand (go deeper into) the current document
- List recently modified files
- Browse folder contents
- Set a default notes folder
- Save a quick note to Drive as a Google Doc
All search results are cached for follow-up selections like “the second one” or partial name matches.
All user input flows through a single classifier (classify_trigger_context) that returns structured JSON.
{ "mode": "...", "search_query": "...", "file_reference": "...", "folder_name": "...", "note_content": "...", "file_type": "doc|sheet|slides|pdf|any" }
The dispatcher routes strictly based on mode. Handlers are isolated and deterministic.
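The mode-based routing described above can be sketched as a plain lookup table. This is a minimal illustration, not the ability's actual code: the mode names (`name_search`, `quick_save`), the handler bodies, and the `"unrecognized"` fallback are assumptions; only the JSON field names come from the schema above.

```python
import json
from typing import Callable, Dict

# Hypothetical handlers; real ones would call the Drive API.
def _run_name_search(intent: dict) -> str:
    return f"searching names for {intent.get('search_query', '')!r}"

def _run_quick_save(intent: dict) -> str:
    return f"saving note: {intent.get('note_content', '')!r}"

# Mode names are assumed for illustration, not taken from the source.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "name_search": _run_name_search,
    "quick_save": _run_quick_save,
}

def dispatch(classifier_output: str) -> str:
    """Route strictly on the 'mode' field of the classifier's JSON."""
    try:
        intent = json.loads(classifier_output)
    except json.JSONDecodeError:
        return "unrecognized"
    if not isinstance(intent, dict):
        return "unrecognized"
    mode = intent.get("mode")
    handler = HANDLERS.get(mode) if isinstance(mode, str) else None
    if handler is None:
        return "unrecognized"
    return handler(intent)
```

Keeping the table as the only routing path means a malformed or unknown classifier result falls through to one well-defined fallback instead of reaching a handler.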
Before invoking the classifier:
- Ordinal / partial matches resolve from recent_results
- “Go deeper” shortcuts expand the active document
- Exit words immediately terminate
This reduces unnecessary LLM calls and keeps behavior predictable.
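As an illustration of the ordinal and partial-name shortcut, the sketch below resolves a follow-up utterance against cached results without calling the classifier. The function name `resolve_from_cache`, the ordinal table, and the matching rule (the whole utterance must be a substring of exactly one cached name) are simplifying assumptions, not the ability's real logic.

```python
from typing import List, Optional

ORDINALS = {"first": 0, "second": 1, "third": 2, "fourth": 3, "fifth": 4}

def resolve_from_cache(utterance: str, recent_results: List[dict]) -> Optional[dict]:
    """Return the cached result an utterance refers to, or None if unclear."""
    text = utterance.lower().strip()
    if not text:
        return None
    words = text.split()
    # Ordinal selection, e.g. "the second one".
    for word, index in ORDINALS.items():
        if word in words:
            return recent_results[index] if index < len(recent_results) else None
    # Partial-name selection: accept only an unambiguous match.
    matches = [r for r in recent_results if text in r["name"].lower()]
    return matches[0] if len(matches) == 1 else None
```

Returning `None` on any ambiguity lets the caller fall back to the classifier rather than guess.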
All Google requests go through drive_request():
- Token refresh before every call
- Retry-on-401 once
- Scope-based invalidation on 403
- timeout=10 enforced
- fields parameter always used
- trashed=false always included
No handler talks directly to requests.
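A rough sketch of such a wrapper is shown below. To keep it runnable without network access, the HTTP call and the token lookup are injected: `send` stands in for a `requests.get`-like callable and `get_token` for the ability's refresh logic. Both names, the `AuthError` exception, and the choice to raise on every 403 (the real wrapper is described as checking for insufficient scopes) are assumptions.

```python
from typing import Any, Callable

class AuthError(Exception):
    """Raised when stored tokens can no longer be used."""

def drive_request(
    url: str,
    params: dict,
    get_token: Callable[[bool], str],
    send: Callable[..., Any],
) -> Any:
    """Send one Drive GET with token refresh and a single retry on 401.

    get_token(force) returns a valid access token, refreshing when needed
    and always refreshing when force is True (hypothetical helper).
    """
    if "fields" not in params:
        raise ValueError("fields parameter is required")
    for attempt in range(2):
        token = get_token(attempt == 1)  # force a refresh on the retry
        response = send(
            url,
            headers={"Authorization": f"Bearer {token}"},
            params=params,
            timeout=10,
        )
        if response.status_code == 401 and attempt == 0:
            continue  # retry exactly once with a fresh token
        if response.status_code in (401, 403):
            # Simplification: any 403 is treated like a scope failure here.
            raise AuthError(f"authorization failed: {response.status_code}")
        return response
```

Routing every call through one function like this is what makes the timeout, the `fields` requirement, and the retry policy enforceable in a single place.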
- Stored in gdrive_manager_prefs.json
- Delete-then-write pattern
- _session_ keys never persisted
- Refresh + access tokens stored locally
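The persistence rules above could look roughly like this. It is a sketch under assumptions: it uses the standard library rather than the SDK's file system, it treats any key starting with `_session_` as session-only, and `save_prefs` is an illustrative name.

```python
import json
import os

def save_prefs(path: str, prefs: dict) -> None:
    """Persist prefs with delete-then-write, dropping session-only keys."""
    persistent = {
        key: value
        for key, value in prefs.items()
        if not key.startswith("_session_")  # assumed prefix for session state
    }
    # Delete first so the new file never contains leftovers from the old one.
    if os.path.exists(path):
        os.remove(path)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(persistent, f)
```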
Before the Google Drive Voice Manager can access a user’s Drive, it must be authorized via OAuth 2.0 using a Google Cloud project.
This ability uses the Google Drive API with offline access (refresh tokens enabled).
The OAuth flow:
- User creates OAuth credentials in Google Cloud Console.
- User pastes Client ID and Client Secret into the assistant.
- Assistant generates a consent URL.
- User authorizes access in browser.
- Assistant exchanges authorization code for:
- access_token
- refresh_token
- Tokens are stored in gdrive_manager_prefs.json.
- All future API calls automatically refresh tokens when needed.
Scope used:
https://www.googleapis.com/auth/drive
This grants full Drive access for search, read, and document creation.
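For reference, a consent URL for this flow can be assembled as sketched below. The endpoint is Google's standard OAuth 2.0 authorization endpoint; the redirect URI, `access_type=offline`, and `prompt=consent` follow the values this document lists. The function name `build_consent_url` is an assumption.

```python
from urllib.parse import urlencode

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
DRIVE_SCOPE = "https://www.googleapis.com/auth/drive"

def build_consent_url(client_id: str, redirect_uri: str = "http://localhost:1") -> str:
    """Build the URL the user opens in a browser to grant Drive access."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": DRIVE_SCOPE,
        "access_type": "offline",  # ask for a refresh token
        "prompt": "consent",       # force the consent screen so one is issued
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"
```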
Go to:
https://console.cloud.google.com
Create a new project or select an existing one.
Note: creating a new project does not automatically switch to it in Google Cloud Console. If you're using a new project, click the current project name at the top of the screen to open the project picker, then select the new project.
Open the navigation menu on the left side of the screen and go to:
APIs & Services → Library
Search for:
Google Drive API
Click Enable.
Navigate to:
APIs & Services → OAuth Consent Screen
- Choose External
- Fill in:
- App name
- User support email
- Save
Then:
- Go to Audience (in the OAuth menu on the left, below the button that opens the main navigation menu)
- Click Add Users
- Add your Google account email as a test user
Navigate to:
APIs & Services → Credentials
Click:
Create Credentials → OAuth client ID
Select:
Application Type: Desktop App
Create it.
Copy:
- Client ID
- Client Secret
You will paste these into the assistant during setup.
Once credentials are provided:
- The assistant generates a consent URL.
- User opens the link and signs in.
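- After approval, Google redirects the browser to the redirect URI (http://localhost:1) with the authorization code in the query string.
The page will fail to load. That is expected.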
- Copy the value after:
code=
Stop at the first &.
- Paste that code back into the assistant.
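The manual copy step above can also be made forgiving in code. The sketch below accepts either the full redirect URL from the address bar or a bare code and returns only the code. `extract_auth_code` is an illustrative name, not a function from the ability.

```python
from urllib.parse import parse_qs, unquote, urlparse

def extract_auth_code(pasted: str) -> str:
    """Return the authorization code from a pasted URL or bare code."""
    pasted = pasted.strip()
    if "code=" in pasted:
        # Full URL: use its query string; otherwise treat the text as a query.
        query = urlparse(pasted).query or pasted
        values = parse_qs(query).get("code", [])
        if values:
            return values[0]  # parse_qs already URL-decodes the value
    # Bare code: stop at the first '&' in case extra parameters were copied.
    return unquote(pasted.split("&", 1)[0])
```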
The assistant exchanges the authorization code at:
https://oauth2.googleapis.com/token
It receives:
- access_token
- refresh_token
- expires_in
The refresh token is required for persistent access.
If Google does not return a refresh token, revoke app access in your Google account settings and retry.
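The exchange step can be sketched as two small helpers: one builds the form body posted to the token endpoint, the other turns the JSON response into the fields stored later. The parameter names in the payload are the standard OAuth 2.0 ones; the helper names and the choice to store `token_expires_at` as epoch seconds are assumptions.

```python
import time
from typing import Optional

TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"

def build_exchange_payload(
    code: str,
    client_id: str,
    client_secret: str,
    redirect_uri: str = "http://localhost:1",
) -> dict:
    """Form fields to POST to TOKEN_ENDPOINT for the code exchange."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
    }

def tokens_from_response(body: dict, now: Optional[float] = None) -> dict:
    """Convert the token response into fields suitable for storage."""
    if "refresh_token" not in body:
        raise ValueError("no refresh_token returned; revoke app access and retry")
    now = time.time() if now is None else now
    return {
        "access_token": body["access_token"],
        "refresh_token": body["refresh_token"],
        # Assumed representation: absolute expiry as epoch seconds.
        "token_expires_at": now + body["expires_in"],
    }
```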
Tokens are stored in:
gdrive_manager_prefs.json
Stored fields:
- client_id
- client_secret
- refresh_token
- access_token
- token_expires_at
- user_email
Session-only fields (e.g., currently opened document) are never persisted.
Before every Drive API request:
- The assistant checks token_expires_at
- If expired (or within 60 seconds of expiry):
- It refreshes the access token automatically
- If refresh fails with invalid_grant or invalid_client:
- Stored tokens are invalidated
- User must re-run OAuth setup
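The expiry check above amounts to a comparison with a safety margin. A minimal sketch, assuming `token_expires_at` is stored as epoch seconds and using illustrative helper names:

```python
import time
from typing import Optional

REFRESH_MARGIN_SECONDS = 60

def needs_refresh(token_expires_at: float, now: Optional[float] = None) -> bool:
    """True if the access token is expired or within 60 seconds of expiry."""
    now = time.time() if now is None else now
    return now >= token_expires_at - REFRESH_MARGIN_SECONDS

def refresh_failure_is_fatal(error_code: str) -> bool:
    """True for refresh errors that require re-running OAuth setup."""
    return error_code in {"invalid_grant", "invalid_client"}
```

The margin avoids sending a request with a token that expires while the request is in flight.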
- Missing refresh token → OAuth setup required
- 401 Unauthorized → Refresh access token + retry once
- 403 insufficient scopes → Tokens invalidated
- Expired token → Auto refresh
- Corrupt prefs file → File deleted + reset
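Two of the cases above, a corrupt prefs file and a missing refresh token, can be handled at load time. The sketch below uses the standard library rather than the SDK's file system; `load_prefs` and `oauth_setup_required` are illustrative names.

```python
import json
import os

def load_prefs(path: str) -> dict:
    """Load prefs; delete the file and start empty if it is corrupt."""
    try:
        with open(path, encoding="utf-8") as f:
            prefs = json.load(f)
        if not isinstance(prefs, dict):
            raise ValueError("prefs root must be a JSON object")
        return prefs
    except FileNotFoundError:
        return {}
    except ValueError:
        # Covers invalid JSON and undecodable bytes: reset to a clean state.
        os.remove(path)
        return {}

def oauth_setup_required(prefs: dict) -> bool:
    """Setup must run when no refresh token is stored."""
    return not prefs.get("refresh_token")
```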
- OAuth is performed locally.
- Redirect URI used: http://localhost:1
- access_type=offline and prompt=consent ensure refresh token issuance.
- Credentials are stored locally via the SDK file system.
- No tokens are logged.
If authorization expires or is revoked:
- Tokens are invalidated.
- On next run, OAuth setup automatically restarts.
No manual cleanup is required.
Only one API must be enabled:
- Google Drive API
No Gmail or additional APIs are required.
https://www.googleapis.com/auth/drive
This scope enables:
- File search
- Content export
- Folder browsing
- Document creation (Quick Save)
If setup completes successfully, the assistant confirms with:
Connected! I can see your Drive.
You are now ready to search, read, and save files using voice.
- Update the classifier schema (classify_trigger_context)
- Add mode handling in dispatch()
- Implement _run_<mode>()
- Ensure:
- No raw requests calls (use drive_request)
- No direct token logic
- resume_normal_flow() is never called inside handlers
- All Drive queries include trashed=false
- fields parameter is minimal
Add logic inside _conversation_loop before classification.
Keep shortcuts short and explicit — never ambiguous.
Always:
- Use MIME filtering when relevant
- Cap pageSize
- Use relative timestamps via _format_relative_time
- Cache recent_results if follow-up selection should work
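A search handler that follows these rules might build its query parameters as sketched here. The MIME types and the `q` operators (`contains`, `fullText`, `trashed = false`) are standard Drive API v3 values; the function name, the cap of 10, and the chosen `fields` list are assumptions.

```python
MIME_TYPES = {
    "doc": "application/vnd.google-apps.document",
    "sheet": "application/vnd.google-apps.spreadsheet",
    "slides": "application/vnd.google-apps.presentation",
    "pdf": "application/pdf",
}
MAX_PAGE_SIZE = 10  # assumed cap; a voice UI rarely needs more

def build_search_params(
    search_query: str,
    file_type: str = "any",
    content_search: bool = False,
    page_size: int = 10,
) -> dict:
    """Build files.list parameters with MIME filter, page cap, and trashed=false."""
    # Escape backslashes and single quotes for the Drive query language.
    escaped = search_query.replace("\\", "\\\\").replace("'", "\\'")
    field = "fullText" if content_search else "name"
    clauses = [f"{field} contains '{escaped}'", "trashed = false"]
    mime = MIME_TYPES.get(file_type)  # "any" or unknown types add no filter
    if mime:
        clauses.append(f"mimeType = '{mime}'")
    return {
        "q": " and ".join(clauses),
        "pageSize": min(page_size, MAX_PAGE_SIZE),
        "fields": "files(id,name,mimeType,modifiedTime)",
    }
```

The returned list of files is what a handler would store in recent_results so that follow-ups like “the second one” can resolve.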
Examples:
- “Drive”
- “Google Drive”
- “Check my Drive”
- “Search my Drive”
- “Open Drive”
Hotwords should clearly indicate Drive intent and avoid generic collisions.
Note: Change the default trigger words of the live web search ability. Otherwise you are likely to invoke it by accident when asking the assistant to navigate your Drive.
- Refresh token likely invalid
- Re-run OAuth setup
- If no refresh token returned, revoke the app in Google Account settings and retry
Check your permissions in Google Cloud Console.
- Network issue
- Expired token that failed refresh
- Invalid client credentials
- Query may be content-based but classified as name search
- Try explicitly: “Search inside documents for …”
PDF export is not supported for voice summarization.
If multiple folders match, clarify by saying the full name or “the first one.”
This ability is designed to be:
- Deterministic where possible
- LLM-driven at the intent layer and summarization layer
- Safe in token lifecycle management
- Easy to extend without breaking architecture