From 3D to 2D: Mathematics and Implementation of Perspective Projection

Keywords: 3D Graphics | Perspective Projection | Java Programming | Homogeneous Coordinates | Matrix Transformation

Abstract: This article explores how to convert 3D points to 2D perspective projection coordinates, based on homogeneous coordinates and matrix transformations. Starting from basic principles, it explains the construction of perspective projection matrices, field of view calculation, and screen projection steps, with rewritten Java code examples. Suitable for computer graphics learners and developers to implement depth effects for models like the Utah teapot.

Introduction

In computer graphics, projecting a 3D scene onto a 2D screen is a core task. Drawing on a Q&A discussion, this article examines the mathematical principles and implementation of perspective projection, using Java as the programming language. The original question was about adding depth to a Utah teapot drawn with Bezier curves, which meant moving from orthographic to perspective projection. We start with homogeneous coordinates, build the transformation matrices step by step, and finally obtain screen coordinates.

Principles of Perspective Projection

Perspective projection simulates human vision: distant objects appear smaller, which creates a sense of depth. The key tool is homogeneous coordinates, which extend a 3D point [x, y, z] to [x, y, z, w], where w is 1 for positions. Using 4x4 matrices then allows translation, rotation, scaling, and projection to be handled uniformly. The workflow from world coordinates to screen coordinates consists of camera transformation, perspective projection, clipping, and screen mapping.
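
To make the role of w concrete, here is a minimal sketch (not taken from the Q&A) that applies a translation matrix to a position with w = 1 and to a direction with w = 0. The class name HomogeneousDemo and its helper methods are hypothetical; the sketch assumes the same row-vector, row-major convention as the Matrix class later in this article.

```java
// Sketch: why w matters in homogeneous coordinates (row-vector convention, v * M).
class HomogeneousDemo {
    // Multiplies the row vector v = [x, y, z, w] by a row-major 4x4 matrix m.
    static double[] transform(double[] v, double[] m) {
        double[] out = new double[4];
        for (int col = 0; col < 4; col++) {
            for (int row = 0; row < 4; row++) {
                out[col] += v[row] * m[row * 4 + col];
            }
        }
        return out;
    }

    // Builds a translation matrix; the offsets sit in the last row (indices 12..14).
    static double[] translation(double tx, double ty, double tz) {
        return new double[] {
            1,  0,  0,  0,
            0,  1,  0,  0,
            0,  0,  1,  0,
            tx, ty, tz, 1
        };
    }

    public static void main(String[] args) {
        double[] move = translation(5, 0, -2);
        double[] point = transform(new double[] {1, 2, 3, 1}, move);     // w = 1: a position
        double[] direction = transform(new double[] {1, 2, 3, 0}, move); // w = 0: a direction
        System.out.println("point:     " + java.util.Arrays.toString(point));
        System.out.println("direction: " + java.util.Arrays.toString(direction));
    }
}
```

The position is shifted by the translation while the direction is left unchanged, because the last row of the matrix is multiplied by w.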

Perspective Projection Matrix Calculation

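Constructing the perspective projection matrix requires four parameters: the field of view (FOV) angle, the aspect ratio, and the near and far clipping plane distances. The matrix layout is shown below, where fov = 1.0 / tan(angle/2.0). The angle must be in the unit the tangent function expects, which is radians for Java's Math.tan, so a value in degrees has to be converted first. The layout assumes row vectors multiplied from the left (v * M), which is the convention the code in this article uses: the first two diagonal elements scale x and y according to the FOV, the third column produces the depth value used for depth buffering, and the 1 in the fourth column copies the original z into w.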

[fov * aspectRatio][        0        ][        0              ][        0       ]
[        0        ][       fov       ][        0              ][        0       ]
[        0        ][        0        ][(far+near)/(far-near)  ][        1       ]
[        0        ][        0        ][(2*near*far)/(near-far)][        0       ]

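After the matrix is built, each point is transformed by matrix multiplication, which places it in clip space. Perspective division follows: dividing x, y, and z by w converts the point from clip space to normalized device coordinates (NDC), where visible points have x, y, and z between -1 and 1.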
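
As a quick check of the depth terms in the matrix above, the following sketch evaluates only the z and w outputs and then performs the division. The class and method names are hypothetical; the arithmetic is taken directly from the third and fourth rows of the matrix, assuming a point with w = 1 and a camera looking along +z.

```java
// Sketch: the depth value produced by the matrix above, after perspective division.
class DepthRangeCheck {
    // Returns the depth after division for a camera-space depth z.
    static double ndcDepth(double z, double near, double far) {
        double clipZ = z * (far + near) / (far - near) + (2.0 * near * far) / (near - far);
        double clipW = z; // the 1 in the third row copies z into w
        return clipZ / clipW;
    }

    public static void main(String[] args) {
        double near = 0.1, far = 100.0;
        System.out.println("at near plane: " + ndcDepth(near, near, far));
        System.out.println("at far plane:  " + ndcDepth(far, near, far));
        System.out.println("at z = 1.0:    " + ndcDepth(1.0, near, far));
    }
}
```

The near plane lands at approximately -1 and the far plane at approximately +1. The mapping is not linear: with these example values, a depth of 1.0 out of 100 already sits above 0.8, which is why the choice of near plane strongly affects depth-buffer precision.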

Implementation Steps

World to Camera Transformation: Use the inverse camera matrix to transform 3D points from world coordinates to camera coordinates. If normals are involved, special handling is needed to avoid translation effects.
Perspective Projection: Apply the above perspective projection matrix, then divide by w for perspective division. This step is crucial, mapping 3D points to a 2D plane while preserving depth information.
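Clipping: Use an algorithm such as Sutherland-Hodgman polygon clipping to discard geometry outside the view volume. A full pipeline normally clips in clip space, before the perspective division, so that points with w <= 0 never reach the divide. This article skips clipping for brevity, but it is important in practice.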
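Screen Projection: Map the remaining points to screen coordinates using the formula: new_x = (x * width) / (2.0 * w) + halfWidth, new_y = (y * height) / (2.0 * w) + halfHeight. The division by w here is the perspective division itself; if the point has already been divided by w, then w is 1 and the formula reduces to a scale and offset from the range [-1, 1] to pixels.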
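
The last two steps can be sketched in isolation as follows. This is an illustration under stated assumptions, not the Q&A's code: ViewportSketch and its methods are hypothetical names, and the inside test is a simple per-point rejection check rather than Sutherland-Hodgman polygon clipping, which would also have to cut edges that cross the boundary.

```java
// Sketch: a per-point view-volume test and the screen mapping formula from the list above.
class ViewportSketch {
    // True when the clip-space point lies inside the view volume:
    // -w <= x, y, z <= w with w > 0.
    static boolean insideClipVolume(double x, double y, double z, double w) {
        return w > 0
            && Math.abs(x) <= w
            && Math.abs(y) <= w
            && Math.abs(z) <= w;
    }

    // Maps a clip-space point to pixel coordinates; the division by w
    // inside the formula performs the perspective divide.
    static double[] toScreen(double x, double y, double w, int width, int height) {
        double screenX = (x * width) / (2.0 * w) + width * 0.5;
        double screenY = (y * height) / (2.0 * w) + height * 0.5;
        return new double[] {screenX, screenY};
    }

    public static void main(String[] args) {
        double x = 0.5, y = -0.5, z = 0.0, w = 1.0; // example clip-space point
        if (insideClipVolume(x, y, z, w)) {
            double[] pixel = toScreen(x, y, w, 800, 600);
            System.out.println("pixel: " + pixel[0] + ", " + pixel[1]);
        }
    }
}
```

Note that the formula does not flip the y axis, so on displays whose y coordinate grows downward the image appears upside down unless y is negated first.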

Java Code Example

The following Java code is a rewrite of the C++ example from the Q&A and demonstrates the projection workflow for vertices that are already in camera space. It defines Vector and Matrix classes to handle the matrix multiplication and the projection.

import java.util.ArrayList;
import java.util.List;

class Vector {
    float x, y, z, w;
    Vector() { this(0, 0, 0, 1); }
    Vector(float x, float y, float z) { this(x, y, z, 1); }
    Vector(float x, float y, float z, float w) {
        this.x = x; this.y = y; this.z = z; this.w = w;
    }
    float length() {
        return (float)Math.sqrt(x*x + y*y + z*z);
    }
    Vector unit() {
        float mag = length();
        if (mag < 1e-6) throw new ArithmeticException("Division by near-zero");
        return new Vector(x/mag, y/mag, z/mag, w);
    }
    Vector divide(float scalar) {
        return new Vector(x/scalar, y/scalar, z/scalar, w/scalar);
    }
}

class Matrix {
    // 4x4 matrix stored row-major: data[row * 4 + col]
    float[] data = new float[16];
    Matrix() { identity(); }
    void identity() {
        for (int i = 0; i < 16; i++) data[i] = 0;
        data[0] = data[5] = data[10] = data[15] = 1.0f;
    }
    // fov is the field-of-view angle in radians (Math.tan expects radians);
    // aspectRatio is the factor applied to the x scale
    void setupClipMatrix(float fov, float aspectRatio, float near, float far) {
        identity();
        float f = 1.0f / (float)Math.tan(fov * 0.5f);
        data[0] = f * aspectRatio;
        data[5] = f;
        // data[10] and data[14] together map z in [near, far] to [-1, 1] after the divide
        data[10] = (far + near) / (far - near);
        data[11] = 1.0f; // plugs old z into w
        data[14] = (2.0f * near * far) / (near - far);
        data[15] = 0.0f;
    }
    // Row-vector product v * M: each result component uses one column of the matrix
    Vector multiply(Vector v) {
        return new Vector(
            v.x*data[0] + v.y*data[4] + v.z*data[8] + v.w*data[12],
            v.x*data[1] + v.y*data[5] + v.z*data[9] + v.w*data[13],
            v.x*data[2] + v.y*data[6] + v.z*data[10] + v.w*data[14],
            v.x*data[3] + v.y*data[7] + v.z*data[11] + v.w*data[15]
        );
    }
}

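public class PerspectiveProjection {
    // Projects camera-space vertices to pixel coordinates. Despite the name, no real clipping is done.
    static List<Vector> projectAndClip(int width, int height, float near, float far, List<Vector> vertices) {
        float halfWidth = width * 0.5f;
        float halfHeight = height * 0.5f;
        // setupClipMatrix multiplies the x scale by this factor, so it has to be
        // height / width for a square to stay square on a non-square screen
        float aspect = (float) height / width;
        Matrix clipMatrix = new Matrix();
        clipMatrix.setupClipMatrix((float) Math.toRadians(60.0), aspect, near, far);
        List<Vector> result = new ArrayList<>();
        for (Vector v : vertices) {
            Vector transformed = clipMatrix.multiply(v);
            // clipping step omitted for brevity, should be added in practice; here we only
            // skip points at or behind the camera plane, where dividing by w is meaningless
            if (transformed.w <= 0.0f) continue;
            transformed = transformed.divide(transformed.w); // perspective divide; w is now 1
            // screen mapping; w is 1 here, so this is a scale and offset from [-1, 1] to pixels
            transformed.x = (transformed.x * width) / (2.0f * transformed.w) + halfWidth;
            transformed.y = (transformed.y * height) / (2.0f * transformed.w) + halfHeight;
            result.add(transformed);
        }
        return result;
    }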
    public static void main(String[] args) {
        // example usage: assume a list of vertices
        List<Vector> vertices = new ArrayList<>();
        vertices.add(new Vector(1, 2, 3));
        List<Vector> projected = projectAndClip(800, 600, 0.1f, 100.0f, vertices);
        System.out.println("Projected coordinates: " + projected.get(0).x + ", " + projected.get(0).y);
    }
}
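
The listing above starts from camera-space vertices, so the world-to-camera step is not shown. A minimal sketch of that step, for a deliberately restricted camera, might look like the following. The class CameraSketch, the yaw-only camera model, and the convention that a camera with yaw angle t looks along (sin t, 0, cos t) are all assumptions made for this illustration; a general camera would use the inverse of a full 4x4 camera matrix.

```java
// Sketch: world-to-camera transform for a camera with a position and a yaw angle only.
class CameraSketch {
    // Transforms a homogeneous world-space vector [x, y, z, w] into camera space.
    // Assumes the camera sits at camPos and looks along (sin yaw, 0, cos yaw).
    static double[] worldToCamera(double[] v, double[] camPos, double yaw) {
        // Undo the camera translation; scaled by w so directions (w = 0) are not shifted.
        double x = v[0] - camPos[0] * v[3];
        double y = v[1] - camPos[1] * v[3];
        double z = v[2] - camPos[2] * v[3];
        // Undo the camera rotation (the transpose of the camera's yaw rotation).
        double c = Math.cos(yaw), s = Math.sin(yaw);
        return new double[] { x * c - z * s, y, x * s + z * c, v[3] };
    }

    public static void main(String[] args) {
        double[] camPos = {0, 0, -5};
        double[] p = worldToCamera(new double[] {1, 2, 3, 1}, camPos, 0.0);
        System.out.println("camera-space point: " + java.util.Arrays.toString(p));
    }
}
```

Multiplying the translation by w is what keeps direction vectors free of translation effects, as mentioned in the implementation steps.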

Discussion and Supplement

Answer 2 in the Q&A provides a simplified formula: if the camera is at the origin and looks along the z axis, the projection is X' = X * (F/Z), Y' = Y * (F/Z), where F is the focal length. This works for simple cases but lacks the flexibility of matrix transformations and gives no help with clipping. In practice, graphics APIs such as OpenGL use homogeneous coordinates and matrices, which handle complex transformations and depth buffering more cleanly.
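
The simplified formula is short enough to sketch directly. The class name PinholeSketch and the guard against non-positive Z are additions for this illustration and are not part of the Q&A answer.

```java
// Sketch: the similar-triangles projection X' = X * F / Z, Y' = Y * F / Z.
class PinholeSketch {
    // Projects a camera-space point; the camera is at the origin looking along +z.
    static double[] project(double x, double y, double z, double focalLength) {
        if (z <= 0) {
            throw new IllegalArgumentException("point must be in front of the camera (z > 0)");
        }
        return new double[] { x * focalLength / z, y * focalLength / z };
    }

    public static void main(String[] args) {
        // Choosing F = 1 / tan(angle / 2) ties the focal length to a field-of-view angle.
        double focalLength = 1.0 / Math.tan(Math.toRadians(60.0) / 2.0);
        double[] nearPoint = project(1, 1, 2, focalLength);
        double[] farPoint = project(1, 1, 4, focalLength);
        System.out.println("z = 2: " + nearPoint[0] + ", " + nearPoint[1]);
        System.out.println("z = 4: " + farPoint[0] + ", " + farPoint[1]);
    }
}
```

Doubling Z halves the projected coordinates, which is the depth effect in its simplest form. With F = 1 / tan(angle/2), the Y' value equals the y coordinate that the matrix path produces after perspective division, so the matrix can be read as this formula plus depth handling.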

Conclusion

Perspective projection is a fundamental technique in 3D graphics, implemented via homogeneous coordinates and matrix transformations. This article details the complete workflow from 3D points to 2D screens, including mathematical principles and Java code examples. Understanding these concepts aids in developing more realistic 3D applications, such as games or simulators. Future work can extend to advanced graphics features like shading and lighting.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.