Add OV fallback to CPU mechanism & verified with OV without ROPE support #42
Open
zhaixuejun1993 wants to merge 264 commits into ravi9:dev_backend_openvino from
…f consecutive OPs
…/ADD adjacent op graph conversion
…ted individually. 2. The VIEW op output tensor shape differs from the CONT (non-contiguous) input tensor shape. 3. CPY (non-contiguous) can't be implemented with the original input/output tensor shape and data (the original shape must be changed when creating the input/output tensor). Currently, the VIEW op is executed in the ggml backend and the others are executed in the OpenVINO Frontend.
2. Remove duplicate get node operation function
…ode needs to be integrated into the OV Frontend 2. In the predict latest token stage, the VIEW, CONT, Reshape need to be integrated into the OV Frontend.
6d71ded to
900dd76
Compare
Fix for stateful accuracy issues and cl_out_of_resources error in stateful GPU with larger context sizes.
76e4057 to
e73b4d4
Compare
996b739 to
b6c83aa
Compare
This PR enables a mechanism for the OpenVINO backend to fall back to the llama.cpp CPU backend.
Below is a summary of the main process:
Dynamic Dimension Computation
Function: compute_cgraph_dynamic_dims()
Purpose: Determines the dynamic dimensions for each node in the computation graph. This is essential for handling nodes with variable shapes during runtime.
Process:
Traverses the computation graph.
Assigns dynamic dimension indices to nodes based on their operation type and dependencies.
Handles specific operations like [GGML_OP_VIEW], [GGML_OP_RESHAPE], and others to propagate dynamic dimensions.
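The propagation described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `Op`, `Node`, `axes`, `reshape_dyn_dim`, and `compute_dynamic_dims` are hypothetical stand-ins for the ggml graph types, and the per-op rules (element-wise ops inherit the index, permute remaps it, reshape takes a caller-supplied index) are assumptions chosen to keep the example small.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for graph nodes; not the real ggml types.
enum class Op { Input, Add, Permute, Reshape };

struct Node {
    Op op = Op::Input;
    std::vector<int> srcs;                    // indices of source nodes, always earlier in the list
    std::array<int, 4> axes = {0, 1, 2, 3};   // Permute only: output dim d reads input dim axes[d]
    int reshape_dyn_dim = -1;                 // Reshape only: where the dynamic dim lands (assumed known)
    int dyn_dim = -1;                         // index of the dynamic dim, -1 means fully static
};

// Walks the graph in order and propagates the dynamic-dimension index
// from sources to each node according to the node's operation type.
void compute_dynamic_dims(std::vector<Node>& graph) {
    for (std::size_t i = 0; i < graph.size(); ++i) {
        Node& n = graph[i];
        if (n.op == Op::Input) {
            continue;  // inputs carry a dyn_dim set by the caller
        }
        // Take the dynamic dim of the first source that has one.
        int src_dyn = -1;
        for (int s : n.srcs) {
            assert(s >= 0 && static_cast<std::size_t>(s) < i);
            if (graph[s].dyn_dim != -1) {
                src_dyn = graph[s].dyn_dim;
                break;
            }
        }
        if (src_dyn == -1) {
            continue;  // all sources static, so this node is static too
        }
        switch (n.op) {
            case Op::Add:
                n.dyn_dim = src_dyn;  // element-wise: same position
                break;
            case Op::Permute:
                for (int d = 0; d < 4; ++d) {
                    if (n.axes[d] == src_dyn) n.dyn_dim = d;
                }
                break;
            case Op::Reshape:
                n.dyn_dim = n.reshape_dyn_dim;  // position depends on the new shape
                break;
            case Op::Input:
                break;
        }
    }
}
```

In this sketch a node fed only by static sources stays static, which is why the traversal has to visit nodes after their dependencies.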
Adding Extra Model Outputs
Function: add_extra_model_outputs_for_fallback()
Purpose: Ensures that all relevant nodes in the computation graph are included as model outputs for fallback scenarios.
Process:
Maps tensor data addresses to their corresponding nodes, excluding [GGML_OP_VIEW] nodes.
Adds nodes to the [m_model_outputs] map if they are not already present.
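A rough sketch of the two steps above, assuming simplified types. `Tensor`, `Op`, and `add_extra_outputs` are hypothetical names for illustration only; the real function works on ggml tensors and the decoder's `m_model_outputs` map. The choice that a later node wins when two non-VIEW nodes share a data address is also an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the real ggml types.
enum class Op { View, Other };

struct Tensor {
    std::string name;
    Op op;
    const void* data;  // address of the tensor's buffer
};

// Registers every non-VIEW node as a model output unless an output
// with the same name is already present.
void add_extra_outputs(const std::vector<Tensor>& nodes,
                       std::map<std::string, const Tensor*>& model_outputs) {
    // Step 1: map data address -> node, skipping VIEW nodes, which alias
    // another node's buffer. If several nodes share an address, the later
    // node overwrites the earlier one (an assumption of this sketch).
    std::map<const void*, const Tensor*> addr_to_node;
    for (const Tensor& t : nodes) {
        if (t.op == Op::View) {
            continue;
        }
        addr_to_node[t.data] = &t;
    }
    // Step 2: add each mapped node as an output; emplace leaves
    // existing entries untouched.
    for (const auto& entry : addr_to_node) {
        const Tensor* t = entry.second;
        model_outputs.emplace(t->name, t);
    }
}
```

Exposing these nodes as outputs is what lets a fallback backend read an intermediate result that would otherwise stay internal to the compiled model.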
Adding Extra Model Inputs
Function: add_extra_model_inputs_for_fallback()
Purpose: Ensures that all necessary input nodes are included as model inputs for fallback scenarios.
Process:
Iterates through the source nodes of each computation graph node.
Skips nodes already in [m_model_weights] or [m_model_inputs].
Excludes intermediate nodes from [m_node_info_list].
Creates OpenVINO parameter nodes for eligible source nodes and updates the [m_inputs] and [m_model_inputs] maps.
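The filtering steps above might look like the following, again as a hedged sketch rather than the PR's implementation. `Tensor`, `Parameter`, and `Decoder` are hypothetical stand-ins: `Parameter` takes the place of an OpenVINO parameter node, and the `intermediate` set takes the place of the lookup against `m_node_info_list`. Keying every map by tensor name is an assumption made for brevity.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the real ggml or OpenVINO types.
struct Tensor {
    std::string name;
    std::vector<const Tensor*> srcs;
};

struct Parameter {
    std::string name;  // placeholder for an OpenVINO parameter node
};

struct Decoder {
    std::set<std::string> model_weights;  // names already known as weights
    std::set<std::string> intermediate;   // names of nodes computed inside the graph
    std::map<std::string, const Tensor*> inputs;
    std::map<std::string, std::shared_ptr<Parameter>> model_inputs;

    // Adds a parameter for every source tensor that is neither a weight,
    // an existing input, nor an intermediate result of the graph.
    void add_extra_inputs(const std::vector<const Tensor*>& nodes) {
        for (const Tensor* node : nodes) {
            for (const Tensor* src : node->srcs) {
                if (src == nullptr) {
                    continue;
                }
                if (model_weights.count(src->name) || model_inputs.count(src->name)) {
                    continue;  // already provided to the model
                }
                if (intermediate.count(src->name)) {
                    continue;  // produced by another node in the graph
                }
                inputs[src->name] = src;
                model_inputs[src->name] =
                    std::make_shared<Parameter>(Parameter{src->name});
            }
        }
    }
};
```

Because already-registered inputs are skipped, a source shared by several nodes yields a single parameter in this sketch.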