Keywords: MySQL | latitude longitude calculation | spherical distance | ST_Distance_Sphere | geographic information systems
Abstract: This article provides an in-depth exploration of various methods for calculating the spherical distance between two geographic coordinate points in MySQL databases. It begins with the traditional spherical law of cosines formula and its implementation details, including techniques for handling floating-point errors using the LEAST function. The discussion then shifts to the ST_Distance_Sphere() built-in function available in MySQL 5.7 and later versions, presenting it as a more modern and efficient solution. Performance optimization strategies such as avoiding full table scans and utilizing bounding box calculations are examined, along with comparisons of different methods' applicability. Through practical code examples and theoretical analysis, the article offers comprehensive technical guidance for developers.
Introduction and Problem Context
In geographic information systems (GIS) and location-based services, calculating the spherical distance between two geographic coordinate points is a common requirement. MySQL, as a widely used relational database management system, offers multiple approaches to implement this functionality. This article is based on a typical scenario: suppose there is a data table containing city information with their latitude and longitude coordinates, and users need to query the precise distance between two specific cities.
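To make the scenario concrete, the table might look like the sketch below. The column names (id, city, Latitude, Longitude) follow the queries used later in this article; the data types, the unique key on city, and the two sample rows are assumptions chosen for illustration only.

```sql
-- Hypothetical schema: column names match the article's queries,
-- but types, keys, and sample data are assumptions.
CREATE TABLE locations (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    city      INT UNSIGNED NOT NULL,   -- city identifier, e.g. 3 or 7
    Latitude  DOUBLE NOT NULL,         -- decimal degrees, -90 to 90
    Longitude DOUBLE NOT NULL,         -- decimal degrees, -180 to 180
    UNIQUE KEY uq_locations_city (city)
);

-- Sample rows (illustrative coordinates near London and Paris)
INSERT INTO locations (city, Latitude, Longitude) VALUES
    (3, 51.5074, -0.1278),
    (7, 48.8566,  2.3522);
```

Storing coordinates as signed decimal degrees keeps them directly usable with RADIANS() and POINT() in the queries that follow.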
Traditional Approach: Spherical Law of Cosines Formula
Before MySQL 5.7, calculating spherical distances typically relied on direct implementation of mathematical formulas. The most commonly used is the Spherical Law of Cosines, with the basic formula:
distance = R * acos( sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) * cos(lon2 - lon1) )
where R is the Earth's radius (approximately 6371 km on average), and latitude and longitude need to be converted to radians. The specific implementation in MySQL is as follows:
SELECT a.city AS from_city, b.city AS to_city,
       111.111 *
       DEGREES(ACOS(LEAST(1.0, COS(RADIANS(a.Latitude))
                             * COS(RADIANS(b.Latitude))
                             * COS(RADIANS(a.Longitude - b.Longitude))
                             + SIN(RADIANS(a.Latitude))
                             * SIN(RADIANS(b.Latitude))))) AS distance_in_km
FROM locations AS a
JOIN locations AS b ON a.id <> b.id
WHERE a.city = 3 AND b.city = 7;
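This query obtains the coordinates of the two cities through a self-join and then applies the formula. Because ACOS returns an angle in radians, DEGREES() converts the central angle to degrees, and the constant 111.111 is the approximate number of kilometers per degree of arc (Earth's circumference of about 40,000 km divided by 360 degrees). This is equivalent to using R ≈ 6366 km in the formula above, which is within 0.1% of the 6371 km mean radius. To calculate the distance in statute miles, replace the constant with 69.0.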
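The relationship between the radius form and the per-degree form can be checked with literal coordinates. The sketch below uses illustrative points (roughly London and Paris) held in user variables; the variable names are assumptions, not part of the original query.

```sql
-- Illustrative coordinates in decimal degrees (assumed sample values)
SET @lat1 = 51.5074, @lon1 = -0.1278;
SET @lat2 = 48.8566, @lon2 = 2.3522;

-- Cosine of the central angle, capped at 1 to keep ACOS in its domain
SET @cos_angle = LEAST(1.0,
      SIN(RADIANS(@lat1)) * SIN(RADIANS(@lat2))
    + COS(RADIANS(@lat1)) * COS(RADIANS(@lat2)) * COS(RADIANS(@lon2 - @lon1)));

SELECT
    6371    * ACOS(@cos_angle)          AS km_mean_radius,      -- R * angle in radians
    111.111 * DEGREES(ACOS(@cos_angle)) AS km_per_degree_form,  -- km per degree * angle in degrees
    69.0    * DEGREES(ACOS(@cos_angle)) AS statute_miles;
```

The two kilometer columns should differ by less than 0.1%, since the only difference is the implied Earth radius.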
Key Implementation Details and Optimization
In practical applications, several important details require attention:
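Floating-Point Error Handling: ACOS is only defined for arguments in the range [-1, 1], and MySQL returns NULL for anything outside it. Because of floating-point rounding, the computed cosine can come out slightly above 1 when the two points are identical or very close, which would turn the distance into NULL. Wrapping the expression in LEAST(1.0, expression) caps the argument at 1. The mirror case, a value slightly below -1 for nearly antipodal points, can be guarded in the same way with GREATEST(-1.0, expression).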
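The effect of the guard can be seen in isolation. The following minimal sketch feeds ACOS a value just outside its domain, standing in for the rounding error described above; the literal 1.0000001 is an exaggerated stand-in chosen for readability.

```sql
SELECT
    ACOS(1.0000001)                              AS unguarded,      -- NULL: argument is outside [-1, 1]
    ACOS(LEAST(1.0, 1.0000001))                  AS guarded_upper,  -- 0: argument capped at 1
    ACOS(GREATEST(-1.0, LEAST(1.0, -1.0000001))) AS guarded_both;   -- pi: argument raised to -1
```

Clamping on both sides costs almost nothing and also covers the nearly antipodal case.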
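Performance Considerations: When finding all points within a given radius, filtering only on the computed distance (for example in a HAVING clause) forces MySQL to evaluate the formula for every row, which means a full table scan. A better approach is to pre-filter with a bounding box. Since the radius is a distance while the columns hold degrees, the half-widths of the box must be expressed in degrees:
WHERE Latitude BETWEEN lat - lat_range AND lat + lat_range
AND Longitude BETWEEN lon - lon_range AND lon + lon_range
Here lat_range = radius_km / 111.111 and lon_range = radius_km / (111.111 * COS(RADIANS(lat))), because a degree of longitude shrinks with the cosine of the latitude. The precise distance calculation then runs only on the rows inside the box, which significantly reduces the computational load.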
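A bounding box only avoids a full scan if the coordinate columns are indexed. The sketch below adds an ordinary composite index and inspects the plan; the index name is hypothetical, and the box values are illustrative (roughly 50 km around latitude 51.5).

```sql
-- Hypothetical index name; assumes the locations table used in this article
CREATE INDEX idx_locations_lat_lon ON locations (Latitude, Longitude);

-- Inspect the plan for a box of about 0.45 degrees of latitude
-- and 0.72 degrees of longitude around an example point
EXPLAIN
SELECT city
FROM locations
WHERE Latitude  BETWEEN 51.5074 - 0.45 AND 51.5074 + 0.45
  AND Longitude BETWEEN -0.1278 - 0.72 AND -0.1278 + 0.72;
```

Whether the optimizer actually chooses the index depends on table size and data distribution, so the EXPLAIN output is worth checking against real data.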
Modern Solution: ST_Distance_Sphere Function
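MySQL 5.7 extended its spatial features and added the ST_Distance_Sphere() function (available from 5.7.6), which is designed specifically for spherical distances. The function takes two point geometries, with longitude as the first coordinate and latitude as the second, and returns the distance in meters on a sphere. By default it uses a radius of 6,370,986 meters, and an optional third argument overrides that radius: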
SELECT ST_Distance_Sphere(
           POINT(a.Longitude, a.Latitude),
           POINT(b.Longitude, b.Latitude)
       ) AS distance_in_meters
FROM locations a, locations b
WHERE a.city = 3 AND b.city = 7;
Advantages of this function include:
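- Numerical robustness for identical or very close points, with no manual LEAST() guard around ACOS (both methods still model the Earth as a sphere, so neither matches an ellipsoidal calculation exactly)
- Native implementation inside the server (it is still evaluated per row, so bounding box pre-filtering remains worthwhile)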
- Direct return of standard units (meters) without manual conversion
- More concise and readable code
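To see how the built-in function relates to the formula approach, both can be run on the same literal coordinates. This is a sketch with assumed sample points (roughly London and Paris); the third argument to ST_Distance_Sphere() is the optional radius in meters.

```sql
-- Illustrative coordinates in decimal degrees (assumed sample values)
SET @lat1 = 51.5074, @lon1 = -0.1278;
SET @lat2 = 48.8566, @lon2 = 2.3522;

SELECT
    -- Built-in function with its default radius
    ST_Distance_Sphere(POINT(@lon1, @lat1), POINT(@lon2, @lat2)) / 1000
        AS km_default_radius,
    -- Built-in function with an explicit radius of 6371 km
    ST_Distance_Sphere(POINT(@lon1, @lat1), POINT(@lon2, @lat2), 6371000) / 1000
        AS km_radius_6371,
    -- Spherical law of cosines with the same 6371 km radius
    6371 * ACOS(LEAST(1.0,
          SIN(RADIANS(@lat1)) * SIN(RADIANS(@lat2))
        + COS(RADIANS(@lat1)) * COS(RADIANS(@lat2)) * COS(RADIANS(@lon2 - @lon1))))
        AS km_law_of_cosines;
```

With the same radius, the last two columns should agree closely; the remaining difference between methods comes mostly from the radius each one assumes.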
Method Comparison and Selection Recommendations
The two main methods have different applicable scenarios:
<table border="1">
<tr><th>Method</th><th>Advantages</th><th>Disadvantages</th><th>Applicable Scenarios</th></tr>
<tr><td>Spherical Law of Cosines</td><td>Compatible with older MySQL versions, transparent and customizable principles</td><td>Requires manual floating-point error handling, relatively complex code</td><td>MySQL versions below 5.7, highly customized calculation needs</td></tr>
<tr><td>ST_Distance_Sphere</td><td>Accurate calculation, performance optimized, concise code</td><td>Requires MySQL 5.7+, depends on spatial extensions</td><td>Modern MySQL environments, standard distance calculation requirements</td></tr>
</table>
Practical Application Example
Suppose we need a search feature that finds all cities within a given distance of a chosen city. Combining the bounding box optimization with ST_Distance_Sphere(), a complete query can be implemented as follows:
-- First create a stored function to simplify distance calculation
DELIMITER //
CREATE FUNCTION CalculateDistance(
    lat1 DOUBLE, lon1 DOUBLE,
    lat2 DOUBLE, lon2 DOUBLE
) RETURNS DOUBLE DETERMINISTIC
BEGIN
    RETURN ST_Distance_Sphere(
        POINT(lon1, lat1),
        POINT(lon2, lat2)
    ) / 1000; -- Convert to kilometers
END //
DELIMITER ;
-- Actual query: Find all cities within 50 km of city 3
SET @search_lat = (SELECT Latitude FROM locations WHERE city = 3);
SET @search_lon = (SELECT Longitude FROM locations WHERE city = 3);
SET @radius_km = 50;
-- Half-widths of the bounding box in degrees
SET @lat_range = @radius_km / 111.111;
SET @lon_range = @radius_km / (111.111 * COS(RADIANS(@search_lat)));
SELECT city,
       CalculateDistance(@search_lat, @search_lon, Latitude, Longitude) AS distance_km
FROM locations
WHERE city <> 3
  -- Cheap pre-filter: keep only rows inside the bounding box
  AND Latitude BETWEEN @search_lat - @lat_range AND @search_lat + @lat_range
  AND Longitude BETWEEN @search_lon - @lon_range AND @search_lon + @lon_range
-- Exact filter: discard the corners of the box that lie outside the radius
HAVING distance_km <= @radius_km
ORDER BY distance_km;
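On servers that lack ST_Distance_Sphere(), the same search could rely on a law-of-cosines variant of the helper. The sketch below is one possible version; the function name CalculateDistanceLegacy is hypothetical, and the 6371 km radius is an assumption taken from the mean radius quoted earlier.

```sql
-- Hypothetical pre-5.7 helper based on the spherical law of cosines.
-- DELIMITER is a mysql command-line client directive, not SQL.
DELIMITER //
CREATE FUNCTION CalculateDistanceLegacy(
    lat1 DOUBLE, lon1 DOUBLE,
    lat2 DOUBLE, lon2 DOUBLE
) RETURNS DOUBLE DETERMINISTIC
BEGIN
    -- Clamp the cosine to [-1, 1] so ACOS never returns NULL
    RETURN 6371 * ACOS(GREATEST(-1.0, LEAST(1.0,
          SIN(RADIANS(lat1)) * SIN(RADIANS(lat2))
        + COS(RADIANS(lat1)) * COS(RADIANS(lat2)) * COS(RADIANS(lon2 - lon1))
    )));
END //
DELIMITER ;

-- Usage with illustrative coordinates (returns kilometers)
SELECT CalculateDistanceLegacy(51.5074, -0.1278, 48.8566, 2.3522) AS distance_km;
```

Because it takes the same arguments in the same order as CalculateDistance, it can be substituted into the search query without other changes.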
Conclusion and Best Practices
When calculating geographic distances in MySQL, prefer the ST_Distance_Sphere() function whenever the environment runs MySQL 5.7 or later. For older systems, the spherical law of cosines remains a viable alternative, provided the ACOS argument is clamped to guard against floating-point error. Regardless of the method chosen, bounding box pre-filtering should be used to improve query performance, especially on large datasets. In practical applications, consider encapsulating frequently used distance calculations as stored functions to improve code reusability and maintainability.