NVIDIA's open-source 3D project LLaMA-Mesh

Project Overview

LLaMA-Mesh is an open-source 3D project from NVIDIA that brings large language models (LLMs) into the field of 3D mesh generation. By representing 3D meshes as plain text and fine-tuning the model on that representation, it gives an LLM the ability to both understand and generate 3D meshes. This approach unifies the 3D and text modalities in a single model while retaining the model's language capabilities, opening up new possibilities for dialogue-based 3D creation and mesh understanding.

Core Functions

Generates complex 3D meshes directly from natural language descriptions, making 3D modeling more intuitive and efficient.
Analyzes existing 3D meshes using the semantic understanding capabilities of LLMs, supporting intelligent 3D asset analysis and management.

Key Advantages

The LLM draws on prior knowledge acquired from text sources such as 3D tutorials, which gives it a distinct advantage for 3D mesh generation.
Both 3D mesh generation and mesh understanding tasks can be carried out through natural language dialogue.

Method Introduction

LLaMA-Mesh adopts a unified format: the numerical values of vertex coordinates and face definitions are written out as plain text, and the model is trained end-to-end on data that interleaves text and 3D meshes. The fine-tuned model can generate high-quality 3D meshes while still maintaining strong text generation and comprehension abilities.
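
To make the idea of "a mesh as plain text" concrete, here is a minimal Python sketch. It assumes an OBJ-style layout (one "v x y z" line per vertex, one "f a b c" line per triangular face) and integer coordinates quantized into 64 bins; neither detail is stated in this overview, and the function names mesh_to_text and text_to_mesh are hypothetical helpers written for illustration, not part of the project's code.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]  # 0-based vertex indices


def mesh_to_text(vertices: List[Vertex], faces: List[Face], bins: int = 64) -> str:
    """Serialize a triangle mesh as OBJ-style plain text with integer coordinates.

    The OBJ-style layout and the default of 64 bins are assumptions for this sketch.
    """
    # Normalize all coordinates with one shared range so proportions are preserved.
    lo = min(c for v in vertices for c in v)
    hi = max(c for v in vertices for c in v)
    scale = (hi - lo) or 1.0  # avoid division by zero for degenerate input

    lines = []
    for v in vertices:
        # Map each coordinate to an integer in [0, bins - 1].
        q = [round((c - lo) / scale * (bins - 1)) for c in v]
        lines.append(f"v {q[0]} {q[1]} {q[2]}")
    for f in faces:
        # OBJ face indices are 1-based.
        lines.append(f"f {f[0] + 1} {f[1] + 1} {f[2] + 1}")
    return "\n".join(lines)


def text_to_mesh(text: str):
    """Parse OBJ-style text back into vertex and face lists, skipping other lines."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue
        if parts[0] == "v":
            vertices.append(tuple(int(p) for p in parts[1:]))
        elif parts[0] == "f":
            # Convert back to 0-based indices.
            faces.append(tuple(int(p) - 1 for p in parts[1:]))
    return vertices, faces


# A unit tetrahedron as a tiny example mesh.
tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
tetra_text = mesh_to_text(tetra_vertices, tetra_faces)
print(tetra_text)
```

Text of this kind can sit in a prompt or a model response alongside ordinary sentences, which is what allows a single language model to both read and write meshes. Quantizing to small integers is a design choice in this sketch that keeps each coordinate short when tokenized, at the cost of some geometric precision.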

Trial

A demo is currently available to try on Hugging Face: https://huggingface.co/spaces/Zhengyi/LLaMA-Mesh