How do I get world coordinates from screen coordinates in VisPy?

Time: 2023-01-05 21:19:47

I am not sure how to get from screen coordinates to world coordinates. I am using VisPy and I would like to implement ray tracing and picking ability in 3D.

I prepared some code based on a cube example. The code below sends a crude ray through the screen by stepping the z value and prints the 3D coordinates (in the on_mouse_press method). However, the results are not correct. If I click the top right corner of the cube, (3, 3, 3) should be printed somewhere along the ray, but it is not. Can anybody help me with this?

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vispy: gallery 50
"""
This example shows how to display 3D objects.
You should see a colored, outlined cube.
"""

import numpy as np
from vispy import app, gloo
from vispy.util.transforms import perspective, translate, rotate

vert = """
// Uniforms
// ------------------------------------
uniform   mat4 u_model;
uniform   mat4 u_view;
uniform   mat4 u_projection;
uniform   vec4 u_color;

// Attributes
// ------------------------------------
attribute vec3 a_position;
attribute vec4 a_color;
attribute vec3 a_normal;

// Varying
// ------------------------------------
varying vec4 v_color;

void main()
{
    v_color = a_color * u_color;
    gl_Position = u_projection * u_view * u_model * vec4(a_position,1.0);
}
"""


frag = """
uniform mat4 u_model;
uniform mat4 u_view;
uniform mat4 u_normal;

uniform vec3 u_light_intensity;
uniform vec3 u_light_position;

varying vec3 v_position;
varying vec3 v_normal;
varying vec4 v_color;

void main()
{
    gl_FragColor = v_color;
}
"""


# -----------------------------------------------------------------------------
def cube(num_of_cubes):
    """
    Build vertices for num_of_cubes colored cubes.

    V  is the vertices
    I1 is the indices for a filled cube (use with GL_TRIANGLES)
    I2 is the indices for an outline cube (use with GL_LINES)
    """

    for i in range(0,num_of_cubes):
        # Vertices positions
        v = np.array([[1, 1, 1], [-1, 1, 1], [-1, -1, 1], [1, -1, 1],
             [1, -1, -1], [1, 1, -1], [-1, 1, -1], [-1, -1, -1]],dtype=np.float32)

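        # shift the unit cube so it spans (1, 1, 1) to (3, 3, 3)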
        v[:,0]=v[:,0]+2.
        v[:,1]=v[:,1]+2.
        v[:,2]=v[:,2]+2.

        # Face Normals
        n = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0],
                      [-1, 0, 0], [0, -1, 0], [0, 0, -1]], dtype=np.float32)
        # Vertex colors
        c = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1],
                      [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]], dtype=np.float32)

        V_aux = np.array([(v[0], n[0], c[0]), (v[1], n[0], c[1]),
                      (v[2], n[0], c[2]), (v[3], n[0], c[3]),
                      (v[0], n[1], c[0]), (v[3], n[1], c[3]),
                      (v[4], n[1], c[4]), (v[5], n[1], c[5]),
                      (v[0], n[2], c[0]), (v[5], n[2], c[5]),
                      (v[6], n[2], c[6]), (v[1], n[2], c[1]),
                      (v[1], n[3], c[1]), (v[6], n[3], c[6]),
                      (v[7], n[3], c[7]), (v[2], n[3], c[2]),
                      (v[7], n[4], c[7]), (v[4], n[4], c[4]),
                      (v[3], n[4], c[3]), (v[2], n[4], c[2]),
                      (v[4], n[5], c[4]), (v[7], n[5], c[7]),
                      (v[6], n[5], c[6]), (v[5], n[5], c[5])]
        )
        I1_aux = np.resize(np.array([0, 1, 2, 0, 2, 3], dtype=np.uint32), 6 * (2 * 3))
        I1_aux += np.repeat(4 * np.arange(2 * 3, dtype=np.uint32), 6)

        I2_aux = np.resize(
            np.array([0, 1, 1, 2, 2, 3, 3, 0], dtype=np.uint32), 6 * (2 * 4))
        I2_aux += np.repeat(4 * np.arange(6, dtype=np.uint32), 8)


        if i==0:
            V=V_aux
            I1=I1_aux
            I2=I2_aux
        else:
            V=np.vstack((V,V_aux))
            I1=np.vstack((I1,I1_aux+i*24))
            I2=np.vstack((I2,I2_aux+i*24))

    return V, I1, I2


# -----------------------------------------------------------------------------
class Canvas(app.Canvas):

    def __init__(self):
        app.Canvas.__init__(self, keys='interactive', size=(800, 600))

        num_of_cubes=1 #number of cubes to draw
        self.V, self.filled, self.outline = cube(num_of_cubes)


        self.store_pos=np.array((0,0)) #for mouse interaction

        self.vert_data=np.vstack(self.V[:,0])
        self.V_buf=np.vstack(self.V[:,0])
        self.V_buf.dtype=[('a_position',np.float32,3)]
        self.vert_buf=gloo.VertexBuffer(self.V_buf)

        self.N_buf=np.vstack(self.V[:,1])
        self.N_buf.dtype=[('a_normal',np.float32,3)]
        self.norm_buf=gloo.VertexBuffer(self.N_buf)

        self.C_buf=np.vstack(self.V[:,2])
        self.C_buf.dtype=[('a_color',np.float32,4)]
        self.colo_buf=gloo.VertexBuffer(self.C_buf)

        self.filled_buf=gloo.IndexBuffer(self.filled.flatten())
        self.outline_buf=gloo.IndexBuffer(self.outline.flatten())

        self.program = gloo.Program(vert, frag)
        self.translate = 1

        #self.vert_buf=gloo.VertexBuffer(self.vertices.flatten())
        self.program.bind(self.vert_buf)
        self.program.bind(self.norm_buf)
        self.program.bind(self.colo_buf)


        self.view = translate((0, 0, -10))
        self.model = np.eye(4, dtype=np.float32)

        gloo.set_viewport(0, 0, self.physical_size[0], self.physical_size[1])
        self.projection = perspective(45.0, self.size[0] /
                                      float(self.size[1]), 2.0, 10.0)

        self.program['u_projection'] = self.projection

        self.program['u_model'] = self.model
        self.program['u_view'] = self.view

        self.theta = 0
        self.phi = 0

        gloo.set_clear_color('white')
        gloo.set_state('opaque')
        gloo.set_polygon_offset(1, 1)

        self._timer = app.Timer('auto', connect=self.on_timer, start=True)

        self.show()
        self.t=0


    # ---------------------------------
    def on_timer(self, event):
        self.update()

    # ---------------------------------
    def print_mouse_event(self, event, what):
        modifiers = ', '.join([key.name for key in event.modifiers])
        print('%s - pos: %r, button: %s, modifiers: %s, delta: %r' %
              (what, event.pos, event.button, modifiers, event.delta))

    def on_mouse_press(self, event):
        self.print_mouse_event(event, 'Mouse press')

        #convert to NDC
        left=event.pos[0]*2/self.size[0]-1
        bottom=(self.size[1]-event.pos[1])*2/self.size[1]-1


        z_clip=np.linspace(-1.,1.,100)
        for val in z_clip:
            aux=np.dot(np.dot(np.linalg.inv(self.view),np.linalg.inv(self.projection)),np.array((left,bottom,val,1.)))
            pos3d=aux/aux[3]
            print(pos3d)

    def on_mouse_wheel(self, event):

        self.translate -= event.delta[1]
        self.translate = max(-1, self.translate)
        self.view[3,2]=-self.translate

        self.program['u_view'] = self.view
        self.update()

    def on_draw(self, event):
        gloo.clear()

        # Filled cube

        gloo.set_state(blend=False, depth_test=True, polygon_offset_fill=True)
        self.program['u_color'] = 1, 0, 1, 1


        self.program.draw('triangles', self.filled_buf)

        # Outline
        gloo.set_state(polygon_offset_fill=False, blend=True, depth_mask=False)
        gloo.set_depth_mask(False)
        self.program['u_color'] = 0, 0, 0, 1

        self.program.draw('lines', self.outline_buf)
        gloo.set_depth_mask(True)


# -----------------------------------------------------------------------------
if __name__ == '__main__':
    c = Canvas()
    app.run()

2 solutions

#1


1  

A clicked point on the screen maps to a line in your scene.

The object in view.scene.transform represents the mapping between scene and screen coordinates: .map(points) transforms points from scene to screen, and .imap(points) maps screen coordinates back to scene (world) coordinates.

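For example (a minimal sketch, assuming view is a vispy scene ViewBox):

tform = view.scene.transform
screen_pt = tform.map([0, 0, 0, 1])   # scene -> screen
scene_pt = tform.imap(screen_pt)      # screen -> scene, round trip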

To get the line your screen point corresponds to, you can imap a point on the screen and another point offset from it in z:

def get_view_axis_in_scene_coordinates(view):
    import numpy
    tform = view.scene.transform
    w, h = view.canvas.size
    screen_center = numpy.array([w / 2, h / 2, 0, 1])  # in homogeneous screen coordinates
    d1 = numpy.array([0, 0, 1, 0])  # direction along screen z, in homogeneous screen coordinates
    point_in_front_of_screen_center = screen_center + d1  # in homogeneous screen coordinates
    p1 = tform.imap(point_in_front_of_screen_center)  # in homogeneous scene coordinates
    p0 = tform.imap(screen_center)  # in homogeneous scene coordinates
    assert abs(p1[3] - 1.0) < 1e-5  # normalization necessary before subtraction
    assert abs(p0[3] - 1.0) < 1e-5
    return p0[0:3], p1[0:3]  # 2-point representation of the view axis in 3D scene coordinates

I adapted it to be a bit closer to what you want; you need to replace screen_center with the clicked point. Note that I did this for an orthographic projection; I think it works for perspective too, but I haven't tested it.

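A minimal sketch of that adaptation, replacing screen_center with the clicked position from a mouse event (assuming event.pos holds the click in canvas pixel coordinates; untested, same caveats as above):

def get_click_ray_in_scene_coordinates(view, event):
    import numpy
    tform = view.scene.transform
    x, y = event.pos[0], event.pos[1]  # clicked pixel in canvas coordinates
    click = numpy.array([x, y, 0, 1])  # in homogeneous screen coordinates
    offset = click + numpy.array([0, 0, 1, 0])  # one step along screen z
    p0 = tform.imap(click)   # in homogeneous scene coordinates
    p1 = tform.imap(offset)  # in homogeneous scene coordinates
    assert abs(p0[3] - 1.0) < 1e-5 and abs(p1[3] - 1.0) < 1e-5
    return p0[0:3], p1[0:3]  # two points defining the picking ray

Any scene point on the ray is then p0 + t * (p1 - p0); for picking, intersect this ray with your geometry (for the question's cube, the box spanning (1, 1, 1) to (3, 3, 3)).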

Related: Get view direction relative to scene in vispy?

#2


0  

I am not sure about the actual code required to do this, but conceptually this is how I would go about solving it.

When clicking a pixel on the screen, you are essentially choosing an X,Y spot as seen from your viewport camera, which means that the remaining transforms and rotations you need can be derived from the camera.

So really: get the positional and rotational data of the camera, add the relative x,y offset of the click within your viewport, then cast a trace along the camera's forward vector through the point you want (see the sketch below). When that trace hits something, pick that object.

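A rough sketch of that recipe in plain numpy (an illustration only, not tested; it assumes the row-vector matrix convention used in the question's code, where the translation sits in the last matrix row):

import numpy as np

def unproject_ray(x_pixel, y_pixel, width, height, model, view, projection):
    # pixel -> normalized device coordinates
    x_ndc = 2.0 * x_pixel / width - 1.0
    y_ndc = 1.0 - 2.0 * y_pixel / height  # pixel y grows downward
    # invert the combined model-view-projection matrix (row-vector convention)
    mvp_inv = np.linalg.inv(np.dot(np.dot(model, view), projection))
    near = np.dot(np.array([x_ndc, y_ndc, -1.0, 1.0]), mvp_inv)
    far = np.dot(np.array([x_ndc, y_ndc, 1.0, 1.0]), mvp_inv)
    near, far = near[:3] / near[3], far[:3] / far[3]  # perspective divide
    return near, far  # two world-space points defining the picking ray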

If you don't add the relative offset, you would be tracing from the center of your viewport; since the rotation is the same for every point in the viewport, you just need to add the x,y difference between where you clicked and the viewport center.

Also, remember that for the viewport, "X" is really a trigonometric function of your yaw, pitch and roll (world) or just yaw and pitch (relative), and the viewport "Y" is your Z axis.

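As a small illustration of that relationship, a forward vector can be built from yaw and pitch with trigonometry (a sketch only; axis conventions vary between engines, and the names below are assumptions):

import math

def forward_vector(yaw, pitch):
    # yaw and pitch in radians, assuming a y-up, -z-forward convention
    x = math.cos(pitch) * math.sin(yaw)
    y = math.sin(pitch)
    z = -math.cos(pitch) * math.cos(yaw)
    return (x, y, z)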

I hope my explanation was clear; I have also added the picture below to illustrate the overview. Hope this helps!

Picture
