Keywords: MySQL | Date Query | URL Statistics | UNIX Timestamp | Conditional Aggregation
Abstract: This article provides an in-depth exploration of efficiently querying yesterday's data and performing URL access statistics in MySQL. Through analysis of core technologies including UNIX timestamp processing, date function applications, and conditional aggregation, it details the complete solution using SUBDATE to obtain yesterday's date, utilizing UNIX_TIMESTAMP for time range filtering, and implementing conditional counting via the SUM function. The article includes comprehensive SQL code examples and performance optimization recommendations to help developers master the implementation of complex data statistical queries.
Introduction
In practical web application development and data analysis, there is often a need to statistically analyze data within specific time ranges. Based on a typical URL access statistics scenario, this article provides a detailed explanation of how to query yesterday's data and perform multi-dimensional URL access statistics in MySQL.
Problem Background and Data Model
Assume we have a URL access records table containing two core fields: URL (storing the visited website addresses) and DateVisited (storing the access time as UNIX timestamps). Sample data is as follows:
URL: google.com youtube.com google.com youtube.com test.com youtube.com
DateVisited: 1313668492 1313668540 1313668571 13154314The business requirement is to count the total occurrences of each URL in the table and the number of times it was visited yesterday, with the expected output format being:
LINK | timesExisted | timesVisitedYesterday
google.com | 2 | 2
youtube.com| 3 | 3Core Technology Analysis
Methods for Obtaining Yesterday's Date
There are multiple methods to obtain yesterday's date in MySQL, with the most concise and effective being the use of the SUBDATE function:
SELECT SUBDATE(CURRENT_DATE, 1);Alternatively, the equivalent using the DATE_SUB function:
SELECT DATE_SUB(CURDATE(), INTERVAL 1 DAY);Both methods accurately return yesterday's date value without the time component. CURRENT_DATE and CURDATE() are functionally identical, both returning the current date.
UNIX Timestamp Range Processing
Since the DateVisited field in the data table stores UNIX timestamps, we need to convert date ranges into timestamp ranges for querying. The timestamp range for yesterday can be calculated as follows:
UNIX_TIMESTAMP(SUBDATE(CURRENT_DATE, 1)) -- Yesterday's start timestamp
UNIX_TIMESTAMP(CURRENT_DATE) -- Today's start timestamp (i.e., yesterday's end timestamp)This gives us the complete 24-hour timestamp range for yesterday.
Conditional Aggregation and Counting Techniques
A useful feature of MySQL is that boolean values are converted to 1 (TRUE) or 0 (FALSE) in numerical contexts. Leveraging this characteristic, we can use the SUM function to directly sum conditional expressions, thereby achieving counting of records that meet the condition:
SUM(DateVisited BETWEEN start_timestamp AND end_timestamp)This method is more concise and efficient than traditional CASE WHEN statements.
Complete Solution
Based on the above technical analysis, the complete SQL query statement is as follows:
SELECT
url AS LINK,
COUNT(*) AS timesExisted,
SUM(DateVisited BETWEEN UNIX_TIMESTAMP(SUBDATE(CURRENT_DATE, 1))
AND UNIX_TIMESTAMP(CURRENT_DATE)) AS timesVisitedYesterday
FROM mytable
GROUP BY url;Let's analyze this query section by section:
SELECT url AS LINK: Selects the URL field and renames it as LINKCOUNT(*) AS timesExisted: Counts the total occurrences of each URL in the tableSUM(DateVisited BETWEEN...): Counts yesterday's visits; the conditional expression returns 1 or 0, and SUM aggregates these valuesFROM mytable: Specifies the source data tableGROUP BY url: Groups statistics by URL
Performance Optimization Considerations
For tables with large data volumes, it is recommended to create a composite index on the url and DateVisited fields:
CREATE INDEX idx_url_date ON mytable(url, DateVisited);This can significantly improve the performance of grouping queries and range queries.
Extended Applications
The same technical pattern can be extended to statistics for other time ranges, such as:
-- Count visits in the last 7 days
SUM(DateVisited >= UNIX_TIMESTAMP(SUBDATE(CURRENT_DATE, 7)))
-- Count visits this month
SUM(DateVisited BETWEEN UNIX_TIMESTAMP(DATE_FORMAT(CURRENT_DATE, '%Y-%m-01'))
AND UNIX_TIMESTAMP(LAST_DAY(CURRENT_DATE)))Conclusion
By appropriately utilizing MySQL's date functions, UNIX timestamp conversions, and conditional aggregation techniques, we can efficiently implement complex time-range data statistics. The solution provided in this article not only addresses specific business problems but, more importantly, demonstrates a universal SQL programming pattern that can be widely applied to various time-series data statistical analysis scenarios.