Question
What factors should I check when reproduced performance falls short of expectations?
Currently, I suspect the following variables may be affecting the results:
- Camera model (using RealSense D435i)
- System latency
- Motion blur in input images
- Camera height (default mounting position)
- Checkpoint ("InternVLA-N1-DualVLN")
- Differences in texture and lighting conditions
Among these (or other possible factors), which ones are known to have the most significant impact on performance?
Additionally, are there any critical setup details (e.g., camera calibration, frame rate, resolution, synchronization) that are easy to overlook but important for successful reproduction?
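To make the comparison concrete, here is a minimal sketch of how two of the suspected factors (frame timing and motion blur) could be quantified on my side before comparing against a reference setup. It is only an illustration under stated assumptions: the function names `frame_timing_stats` and `laplacian_variance` are hypothetical and not part of the InternVLA-N1 codebase, the `drop_factor` threshold is arbitrary, and the blur score is the common variance-of-Laplacian heuristic written in plain Python rather than anything the project is known to use.

```python
from statistics import mean, pstdev


def frame_timing_stats(timestamps_s, drop_factor=1.5):
    """Summarize capture timing from frame timestamps given in seconds."""
    if len(timestamps_s) < 2:
        raise ValueError("need at least two timestamps")
    # Time between consecutive frames.
    intervals = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    avg = mean(intervals)
    # An interval much longer than average suggests a late or dropped frame.
    # drop_factor is an arbitrary assumption, not a documented threshold.
    late = sum(1 for dt in intervals if dt > drop_factor * avg)
    return {
        "mean_fps": 1.0 / avg,
        "jitter_s": pstdev(intervals),
        "late_frames": late,
    }


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over a 2D grayscale image.

    `gray` is a list of equal-length rows of pixel intensities.
    Lower values generally indicate a blurrier image.
    """
    h, w = len(gray), len(gray[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")
    # Laplacian response at every interior pixel (borders are skipped).
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    m = mean(responses)
    return mean((r - m) ** 2 for r in responses)
```

Logging these numbers for the frames actually fed to the model would show whether the effective frame rate or image sharpness differs from what the reference setup expects. The scores are only comparable between images of the same resolution and exposure settings.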