

To verify the effectiveness of 3D-SSRecNet, we conducted a series of experiments on the ShapeNet and Pix3D datasets. The experimental results, measured by CD and EMD, show that 3D-SSRecNet outperforms state-of-the-art reconstruction methods.

The technology of reconstructing 3D models from 2D images has many practical applications, and the 2D-to-3D reconstruction techniques suited to different application scenarios differ considerably. For example, a prior method predicts the liquid or solid contents of transparent vessels as XYZ maps. This method can be applied, for instance, to the task of a robot arm picking up containers and pouring liquid. That study therefore proposed a scale-invariant loss so that the scale of the predicted XYZ map conforms to the scale of the original 2D input.

The single-stage network structure reduces the loss of extracted 2D image features. The 2D image feature extraction network uses DetNet as its backbone, because DetNet can extract more detail from 2D images. To generate point clouds with better shape and appearance, the point cloud prediction network uses the exponential linear unit (ELU) as its activation function, and 3D-SSRecNet uses a joint function of chamfer distance (CD) and Earth mover's distance (EMD) as its loss function.
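The activation and loss named above can be illustrated with a minimal sketch. It uses the textbook definitions of ELU, CD, and EMD; the normalisation by point count, the `emd_weight` parameter, and all function names are assumptions made for illustration and are not taken from the paper. The EMD here is computed by brute force over all one-to-one matchings, which is only feasible for a handful of points; practical implementations rely on approximate solvers.

```python
import math
from itertools import permutations


def elu(x, alpha=1.0):
    """ELU: identity for positive inputs, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(pred, gt):
    """Symmetric CD: mean squared nearest-neighbour distance, both directions."""
    forward = sum(min(sq_dist(p, g) for g in gt) for p in pred) / len(pred)
    backward = sum(min(sq_dist(g, p) for p in pred) for g in gt) / len(gt)
    return forward + backward


def earth_movers_distance(pred, gt):
    """Exact EMD for equal-size sets: cheapest one-to-one matching.

    Brute force over all permutations, so suitable for tiny inputs only.
    """
    if len(pred) != len(gt):
        raise ValueError("EMD sketch requires point sets of equal size")
    best = min(
        sum(math.sqrt(sq_dist(p, gt[j])) for p, j in zip(pred, perm))
        for perm in permutations(range(len(gt)))
    )
    return best / len(pred)


def joint_loss(pred, gt, emd_weight=1.0):
    """Hypothetical joint loss: CD plus a weighted EMD term."""
    return chamfer_distance(pred, gt) + emd_weight * earth_movers_distance(pred, gt)
```

Calling `joint_loss(pred, gt)` on two equal-size lists of `(x, y, z)` tuples returns a single non-negative number that is zero when the two point sets coincide. CD penalises points that lie far from the other set, while EMD additionally enforces a one-to-one correspondence, which is one common motivation for combining the two.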

Inferring an object with a reasonable shape and appearance from a single picture is a challenging problem. Existing research often pays more attention to the structure of the point cloud generation network, while neglecting the extraction of 2D image features and the reduction of feature loss during propagation through the network. In this paper, a single-stage, single-view 3D point cloud reconstruction network, 3D-SSRecNet, is proposed. 3D-SSRecNet is a simple single-stage network composed of a 2D image feature extraction network and a point cloud prediction network.
