my workflow for making 3d gaussian splats

the vast majority of open source software for making 3d gaussian splats (3DGS) requires CUDA, making it unusable for those without an NVIDIA GPU. in the interest of helping people find alternatives, i've decided i should document my 3DGS workflow.

i use macOS on an Apple Silicon Mac, but none of the software listed below is macOS-specific.

my workflow

  1. (if applicable) convert inputs into a usable format:
    • HEIC to JPG: for file in *.HEIC; do magick "$file" -quality 100 -auto-orient "${file%.HEIC}.jpg"; done
    • extract frames from video using Sharp Frames (using the manual selection mode usually gives the best results)
  2. collect all input images into one folder, making sure to remove any poor-quality (blurry, noisy, obstructed, etc.) images before proceeding
  3. open COLMAP or use the automated workflow
    1. File -> New project
      1. specify database location & input image folder
      2. click Save
    2. Extras -> Set options for...
      1. set dataset type to either "Individual images" or "Video frames"
      2. set quality to "Extreme"
    3. Processing -> Feature extraction
      1. choose Camera model
      2. if all images are taken with the same lens, zoom, and focal length, enable "Shared for all images"
      3. click Extract
    4. (optional) adjust camera parameters as necessary
    5. Processing -> Feature matching
      1. if image count is <1000, use Exhaustive matching
      2. if dataset type is "Individual images" and image count is >1000, use VocabTree matching
        1. download the appropriate faiss vocabulary tree for your dataset size from COLMAP release 3.11.1
          • 256k words = 1,000 - 10,000 images
          • 1M words = 10,000 - 100,000 images
        2. set vocab_tree_path to the downloaded file
      3. if dataset type is "Video frames" and image count is >1000, use Sequential matching
      4. click Run
      5. if VocabTree or Sequential matching was used, do the following after matching is complete:
        1. use Transitive matching
        2. click Run
    6. Reconstruction -> Start reconstruction
    7. analyze the sparse reconstructed scene (see COLMAP Graphical User Interface)
      • fix misaligned images or incorrectly placed points by removing problematic images
      • aim to have good coverage of the scene and >30% overlap between images
      • if you end up capturing additional images to improve coverage, do the following:
        1. add the newly captured images to the input image folder
        2. repeat COLMAP steps 3-5
        3. Reconstruction -> Reset reconstruction
        4. repeat COLMAP steps 6-7
      • if you end up finding any problematic images, do the following:
        1. remove all problematic images from the input image folder
        2. close COLMAP
        3. delete the project database
        4. repeat the COLMAP workflow from the beginning
      • (optional) if the images are only problematic due to incorrect placement, you can try the following:
        1. move the images outside of the input image folder
        2. close COLMAP
        3. delete the project database
        4. repeat the first 5 steps of the COLMAP workflow from the beginning
        5. add the problematic images back into the input image folder
        6. repeat COLMAP steps 3-4
        7. create a plaintext file containing image pairs to match (see COLMAP Feature Matching and Geometric Verification). each line should contain two filenames separated by a space
          • for every newly added image, aim to match it to all images in the input folder which overlap with it by at least 20%
        8. Processing -> Feature matching
          1. use Custom matching
          2. set type to "Image pairs"
          3. set match_list_path to the file created in the previous step
          4. click Run
        9. proceed with the remaining steps in the COLMAP workflow
          • if the images are still positioned incorrectly, remove any problematic image pairs from the list and then repeat these steps as necessary
          • if you cannot get the images positioned correctly within a reasonable number of attempts, perform the usual steps to exclude problematic images
    8. (optional) Reconstruction -> Reset reconstruction
    9. (optional) Reconstruction -> Start reconstruction
      • running reconstruction twice can improve image undistortion quality
    10. Extras -> Undistortion
      1. specify output folder
      2. click Undistort
  4. open Brush or use the automated workflow (see also: parameter testing workflow)
    1. click Directory and choose output folder from last step
    2. (optional) set Training -> steps to 50000
    3. set Training -> Max Splats Cap to 100000k (arbitrary values can be entered into the textbox next to the slider)
      • keep in mind that this is only a cap; it doesn't determine how many splats are used unless you hit it.
    4. (optional) set Training -> Growth & refinement -> Growth stop iteration to 20000
      • same caveats as Training -> steps apply here
    5. set Model -> Spherical Harmonics Degree based on the scene
      • SH degree >3 is unsupported by almost all splat viewing/editing software
      • if your scene has good coverage, the default (3) usually delivers the best quality
      • if your scene has poor coverage and lacks reflective objects, lowering the SH degree will usually improve visual quality.
    6. enable Model -> Render mode
    7. set Model -> Render mode to Mip
    8. set Dataset -> Max image resolution to 4096
    9. click Start
  5. (optional) run the LOD workflow (recommended if splat count > 1M)
  6. open SuperSplat Editor or use splat-transform to perform bulk edits
    1. click File -> Import and choose output PLY file from last step
    2. edit splat as necessary (see SuperSplat User Guide)
    3. File -> Export -> PLY
    • note: when using splat-transform to apply rotations determined using SuperSplat Editor, -r 0,0,180 must be added at the start of your processing chain
  7. final PLY file can be converted into distribution formats using splat-transform
    • format conversion
      • splat-transform $1.ply --filter-nan --iterations 20 $1.sog
    • streaming LOD generation
      • splat-transform $1.LOD-0.ply -l 0 $1.LOD-1.ply -l 1 $1.LOD-2.ply -l 2 $1.LOD-3.ply -l 3 ${1}_SSOG/lod-meta.json --filter-nan --iterations 20
        
        zip -9rX --suffixes .webp ${1}_SSOG.zip ${1}_SSOG/* # only necessary if uploading to SuperSplat
        
    • collision voxel generation
      • splat-transform $1.ply --filter-cluster $1.voxel.json
      • make sure to use SuperSplat Viewer's collision visualization (requires a WebGPU-compatible browser and the ?webgpu URL parameter) to verify that voxels were generated correctly
    • note: when configuring SuperSplat Viewer, make sure to enable anti-aliased rendering using the ?aa URL parameter
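
the image pair list used for Custom matching in the COLMAP steps above is just a text file with two filenames per line. here is a minimal sketch of a helper that formats it. the function name `pairs_for` and the filenames are hypothetical, and deciding which images overlap is still a manual judgement:

```shell
set -eu

# Print one "new_image other_image" line for each image that was judged
# (by eye) to overlap with new_image.
# $1 = newly added image, remaining arguments = overlapping images
pairs_for() (
	new_image="$1"
	shift
	for other in "$@"; do
		# never pair an image with itself
		if [ "$other" != "$new_image" ]; then
			printf '%s %s\n' "$new_image" "$other"
		fi
	done
)

# example (hypothetical filenames):
# pairs_for IMG_0101.jpg IMG_0042.jpg IMG_0043.jpg >> pairs.txt
```

since the format separates the two names with a space, filenames that contain spaces would need to be renamed first.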

on my hardware (Apple M4 Pro w/ 48GB RAM), i find that it usually takes 2-8 hours to turn input images into a finished splat.
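
the feature matching rules from the COLMAP steps above can be sketched as a small function. this is only an illustration: the name `choose_matchers` and the "individual"/"video" labels are hypothetical shorthand for the GUI's dataset types, and treating exactly 1000 images like the >1000 case is an assumption.

```shell
set -eu

# Print the COLMAP matcher commands to run, one per line, in order.
# $1 = image count
# $2 = dataset type: "individual" or "video" (hypothetical labels)
choose_matchers() (
	count="$1"
	type="$2"
	if [ "$count" -lt 1000 ]; then
		# small datasets: match every image against every other image
		echo exhaustive_matcher
	else
		if [ "$type" = video ]; then
			echo sequential_matcher
		else
			echo vocab_tree_matcher
		fi
		# transitive matching follows VocabTree or Sequential matching
		echo transitive_matcher
	fi
)
```

for example, `choose_matchers 5000 video` prints sequential_matcher followed by transitive_matcher, each of which would then be run as `colmap <name> --project_path colmap.ini`.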

automated COLMAP workflow (replaces step 3)

run the following bash script within the input folder:

set -euo pipefail

ITEM_COUNT="$(ls -1q | wc -l)"
mkdir -p ~/.cache # curl -o does not create missing directories
if [ ! -f ~/.cache/vocab_tree_faiss_flickr100K_words32K.bin ]; then
	curl -L https://github.com/colmap/colmap/releases/download/3.11.1/vocab_tree_faiss_flickr100K_words32K.bin -o ~/.cache/vocab_tree_faiss_flickr100K_words32K.bin
fi
if [ ! -f ~/.cache/vocab_tree_faiss_flickr100K_words256K.bin ]; then
	curl -L https://github.com/colmap/colmap/releases/download/3.11.1/vocab_tree_faiss_flickr100K_words256K.bin -o ~/.cache/vocab_tree_faiss_flickr100K_words256K.bin
fi
if [ ! -f ~/.cache/vocab_tree_faiss_flickr100K_words1M.bin ]; then
	curl -L https://github.com/colmap/colmap/releases/download/3.11.1/vocab_tree_faiss_flickr100K_words1M.bin -o ~/.cache/vocab_tree_faiss_flickr100K_words1M.bin
fi
colmap project_generator --quality extreme --output colmap.ini
set +u
if [ ! -z "$1" ]; then
	sed -i '' -e "s/camera_model=SIMPLE_RADIAL/camera_model=$1/g" colmap.ini
fi
set -u
sed -i '' -e 's/max_extra_param=1$/max_extra_param=1.7976931348623157e+308/g' colmap.ini # https://github.com/colmap/colmap/blob/c03cb50eb0f565e0b0ec34dbbfe37568b31979be/src/colmap/controllers/option_manager.cc#L90
sed -i '' -e 's/database_path=/database_path=project.db/g' colmap.ini
sed -i '' -e 's/vocab_tree_path=/vocab_tree_path=vocab_tree.bin/g' colmap.ini
sed -i '' -e 's/image_path=/image_path=./g' colmap.ini
echo 'output_path=sparse' | cat - colmap.ini > colmap_mapper.ini
echo 'input_path=sparse/0
output_path=colmap_output' | cat - colmap.ini > colmap_undistorter.ini
mkdir sparse colmap_output

colmap feature_extractor --project_path colmap.ini
if [ "$ITEM_COUNT" -lt "1000" ]; then
	colmap exhaustive_matcher --project_path colmap.ini
else
	if [ "$ITEM_COUNT" -lt "10000" ]; then
		cp ~/.cache/vocab_tree_faiss_flickr100K_words256K.bin vocab_tree.bin
	else
		cp ~/.cache/vocab_tree_faiss_flickr100K_words1M.bin vocab_tree.bin
	fi
	colmap vocab_tree_matcher --project_path colmap.ini
	colmap transitive_matcher --project_path colmap.ini
	rm vocab_tree.bin
fi
colmap view_graph_calibrator --project_path colmap.ini # slightly improves image undistortion quality
colmap mapper --project_path colmap_mapper.ini
if [ -d sparse/1 ]; then # the mapper wrote more than one model, so merge them into sparse/0
	echo 'input_path=sparse/merged
output_path=sparse/refined' | cat - colmap.ini > colmap_adjuster.ini
	for item in sparse/*; do
		if [ "$item" != "sparse/0" ]; then
			mkdir sparse/merged sparse/refined
			colmap model_merger --input_path1 sparse/0 --input_path2 $item --output_path sparse/merged
			colmap bundle_adjuster --project_path colmap_adjuster.ini
			rm -r sparse/0 $item sparse/merged
			mv sparse/refined sparse/0
		fi
	done
	rm colmap_adjuster.ini
fi
colmap image_undistorter --project_path colmap_undistorter.ini
rm colmap_mapper.ini colmap_undistorter.ini

colmap gui --import_path sparse/0 --database_path project.db --image_path . # (recommended) analyze the sparse reconstructed scene before continuing on to further steps
rm -r sparse
rm colmap.ini project.db*

output directory is ./colmap_output
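
note that `sed -i ''` in the script above is the BSD (macOS) spelling; GNU sed on Linux does not accept it in that form. a minimal sketch of an in-place edit that should work with both, assuming a temporary backup file is acceptable (the helper name `edit_in_place` is hypothetical):

```shell
set -eu

# Apply a sed expression to a file in place.
# "-i.bak" (suffix attached directly to the flag) is accepted by both
# BSD and GNU sed; the backup it leaves behind is deleted afterwards.
# $1 = sed expression, $2 = file to edit
edit_in_place() (
	sed -i.bak -e "$1" "$2"
	rm -f "$2.bak"
)

# example, mirroring one of the edits in the script above:
# edit_in_place 's/database_path=/database_path=project.db/g' colmap.ini
```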

optional arguments:

  • $1: camera model to use in place of the default, SIMPLE_RADIAL

preview workflow

in some cases, it may be worth running a short "preview" quality COLMAP workflow to rapidly check scene coverage.

to preview scene coverage, run the following bash script within the input folder:

set -euo pipefail

ITEM_COUNT="$(ls -1q | wc -l)"
mkdir -p ~/.cache # curl -o does not create missing directories
if [ ! -f ~/.cache/vocab_tree_faiss_flickr100K_words32K.bin ]; then
	curl -L https://github.com/colmap/colmap/releases/download/3.11.1/vocab_tree_faiss_flickr100K_words32K.bin -o ~/.cache/vocab_tree_faiss_flickr100K_words32K.bin
fi
if [ ! -f ~/.cache/vocab_tree_faiss_flickr100K_words256K.bin ]; then
	curl -L https://github.com/colmap/colmap/releases/download/3.11.1/vocab_tree_faiss_flickr100K_words256K.bin -o ~/.cache/vocab_tree_faiss_flickr100K_words256K.bin
fi
if [ ! -f ~/.cache/vocab_tree_faiss_flickr100K_words1M.bin ]; then
	curl -L https://github.com/colmap/colmap/releases/download/3.11.1/vocab_tree_faiss_flickr100K_words1M.bin -o ~/.cache/vocab_tree_faiss_flickr100K_words1M.bin
fi
colmap project_generator --quality low --output colmap.ini
set +u
if [ ! -z "$1" ]; then
	sed -i '' -e "s/camera_model=SIMPLE_RADIAL/camera_model=$1/g" colmap.ini
fi
set -u
sed -i '' -e 's/database_path=/database_path=project.db/g' colmap.ini
sed -i '' -e 's/vocab_tree_path=/vocab_tree_path=vocab_tree.bin/g' colmap.ini
sed -i '' -e 's/image_path=/image_path=./g' colmap.ini
echo 'output_path=sparse' | cat - colmap.ini > colmap_mapper.ini
mkdir sparse

colmap feature_extractor --project_path colmap.ini
if [ "$ITEM_COUNT" -lt "1000" ]; then
	cp ~/.cache/vocab_tree_faiss_flickr100K_words32K.bin vocab_tree.bin
elif [ "$ITEM_COUNT" -lt "10000" ]; then
	cp ~/.cache/vocab_tree_faiss_flickr100K_words256K.bin vocab_tree.bin
else
	cp ~/.cache/vocab_tree_faiss_flickr100K_words1M.bin vocab_tree.bin
fi
colmap vocab_tree_matcher --project_path colmap.ini
colmap global_mapper --project_path colmap_mapper.ini
rm colmap_mapper.ini vocab_tree.bin
mv -f colmap.ini sparse/0/project.ini

colmap gui --import_path sparse/0 --database_path project.db --image_path .
rm -r sparse
rm project.db*
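
the three near-identical download blocks and the size-based `cp` branches in the scripts above follow one rule, which can be sketched as two small functions. the function names are hypothetical; the filenames, thresholds and release URL are the ones the scripts already use.

```shell
set -eu

# Print the vocabulary tree filename for a given image count, using the
# same thresholds as the preview script above.
# $1 = image count
vocab_tree_for() (
	count="$1"
	if [ "$count" -lt 1000 ]; then
		words=32K
	elif [ "$count" -lt 10000 ]; then
		words=256K
	else
		words=1M
	fi
	echo "vocab_tree_faiss_flickr100K_words${words}.bin"
)

# Print the download URL for a vocabulary tree filename.
# $1 = filename as printed by vocab_tree_for
vocab_tree_url() (
	echo "https://github.com/colmap/colmap/releases/download/3.11.1/$1"
)

# example:
# tree="$(vocab_tree_for "$ITEM_COUNT")"
# [ -f ~/.cache/"$tree" ] || curl -L "$(vocab_tree_url "$tree")" -o ~/.cache/"$tree"
```

this downloads only the tree that the current dataset needs instead of all three.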

automated Brush workflow (replaces step 4)

run the following bash script within the COLMAP output folder:

set -euo pipefail

DIR="$(pwd)"
cd ..
brush --total-train-iters 50000 --max-splats 100000000 --growth-stop-iter 20000 --render-mode mip --max-resolution 4096 --with-viewer "$DIR"
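
because the script above has to be run from inside the COLMAP output folder, a quick sanity check before launching a multi-hour training run can save time. a minimal sketch, assuming the undistorter wrote its usual `images` and `sparse` subfolders (the function name is hypothetical, and this does not verify the contents of those folders):

```shell
set -eu

# Succeed if a folder looks like COLMAP undistorter output.
# Assumption: the output contains "images" and "sparse" subfolders.
# $1 = folder to check
looks_like_colmap_output() (
	dir="$1"
	[ -d "$dir/images" ] && [ -d "$dir/sparse" ]
)

# example:
# looks_like_colmap_output "$(pwd)" || { echo "run this inside colmap_output" >&2; exit 1; }
```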

Brush workflow for testing parameter changes

  1. open Rerun Viewer
    1. click rerun -> Settings...
    2. set General -> Memory budget to 4 GiB
    3. exit the Settings menu
  2. open Brush
    1. click Directory and choose the COLMAP output folder
    2. adjust parameters as desired
    3. enable Dataset -> Split dataset for evaluation
      • the effectiveness of the train/eval split at measuring generalization (the visual quality of your splat when viewed from novel perspectives) depends heavily on your dataset
    4. enable Rerun.io -> Enable rerun
    5. click Start
  3. analyze logged values in Rerun Viewer and/or output checkpoints (written every 5k steps by default) as desired
    • SSIM and PSNR are the two logged measures of visual quality (for both, higher is better)
    • once you have decided on your parameters, repeating the splat training run without a train/eval split will usually improve visual quality (at the cost of far fewer available metrics)
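
for reading the PSNR values: PSNR is defined as 10 * log10(peak^2 / MSE), so it rises by 10 dB every time the mean squared error drops by a factor of 10. a minimal sketch of that relationship, assuming pixel values normalized to the range 0-1 (how Brush computes and logs the value internally is not confirmed here, and the function name is hypothetical):

```shell
set -eu

# Convert a mean squared error into PSNR in dB.
# $1 = mean squared error (must be greater than 0)
# $2 = peak pixel value (optional, defaults to 1)
psnr_from_mse() (
	mse="$1"
	peak="${2:-1}"
	# awk's log() is the natural logarithm, so divide by log(10)
	LC_ALL=C awk -v mse="$mse" -v peak="$peak" 'BEGIN {
		printf "%.2f\n", 10 * log(peak * peak / mse) / log(10)
	}'
)

# example:
# psnr_from_mse 0.001
```

with these assumptions, an MSE of 0.001 corresponds to 30 dB and an MSE of 0.0001 to 40 dB.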

LOD workflow

high-quality LOD PLYs can be generated using the following bash script (requires splat-transform and Brush):

set -euo pipefail

cp "$1" "$2/init.ply"
cp "$1" "${2}.LOD-0.ply"
mv "${2}_exports" "${2}_train_exports"

n=0
while [ "$n" -lt 3 ]; do
	n=$(( n + 1 ))

	splat-transform "$2/init.ply" -F 50% "$2/decimated.ply" # splat-transform has much higher quality decimation than brush
	rm "$2/init.ply"
	mv "$2/decimated.ply" "$2/init.ply"
	brush --total-train-iters 5000 --render-mode mip --growth-stop-iter 0 --max-resolution 4096 "$2"
	mv "${2}_exports/export_5000.ply" "${2}.LOD-${n}.ply"
done

rm -r "$2/init.ply" "${2}_exports"
mv "${2}_train_exports" "${2}_exports"
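
the loop above halves the splat count three times, so each LOD level should end up with roughly half the splats of the previous one. a minimal sketch of the expected counts, assuming `-F 50%` keeps half of the splats and that retraining with growth disabled does not add any (the function name is hypothetical, and real counts may come out slightly lower):

```shell
set -eu

# Print the approximate splat count of each LOD level.
# $1 = splat count of the trained PLY (LOD 0)
# $2 = number of additional LOD levels (the script above uses 3)
lod_counts() (
	count="$1"
	levels="$2"
	n=0
	echo "LOD-0 $count"
	while [ "$n" -lt "$levels" ]; do
		n=$(( n + 1 ))
		# each level keeps half of the previous level's splats
		count=$(( count / 2 ))
		echo "LOD-$n $count"
	done
)

# example:
# lod_counts 8000000 3
```

for an 8M splat scene this gives 8M, 4M, 2M and 1M splats for LOD 0 through LOD 3.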

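required arguments:

  • $1: path to the trained PLY file (becomes LOD 0)
  • $2: path to the dataset folder that Brush trains on (its exports are expected in ${2}_exports)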

if necessary, LOD PLYs can be rotated and scaled using the following bash script:

set -euo pipefail

mkdir "${1}_adjusted"
for file in "$1".LOD-*.ply; do
	splat-transform "$file" --rotate 0,0,180 --rotate "$2" --scale "$3" "${1}_adjusted/$file"
done

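required arguments:

  • $1: filename prefix of the LOD PLYs (the part before .LOD-N.ply)
  • $2: rotation to apply, in the form accepted by --rotate
  • $3: scale factor, in the form accepted by --scale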

how gaussian splats work:

example datasets for testing workflows:

capturing techniques:

software documentation and source code: