Research on HTML Element Retrieval Methods Based on innerText

Keywords: JavaScript | DOM Manipulation | innerText | textContent | Element Retrieval

Abstract: This paper explores several methods for retrieving HTML elements by their text content in JavaScript, with a focus on a core DOM traversal implementation and a comparison of XPath queries with modern ES6 syntax. Through code examples and performance considerations, it offers practical guidance to front-end developers on choosing a solution.

Introduction

In modern web development, dynamically locating HTML elements based on text content is a common requirement. Whether building automated test scripts, implementing content search functionality, or developing dynamic interactive interfaces, efficient and accurate retrieval of DOM elements containing specific text is essential. This paper systematically analyzes several mainstream implementation methods based on practical development experience.

Core Method: Manual DOM Traversal

The most direct and most widely compatible approach is to traverse the DOM tree manually in JavaScript. The core idea is to obtain all elements with the target tag and then check each one in turn to see whether its text content matches the search criteria.

The complete implementation code is as follows:

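var aTags = document.getElementsByTagName("a");
var searchText = "SearchingText";
var found; // stays undefined if no link matches

for (var i = 0; i < aTags.length; i++) {
  // Strict equality: both operands are strings, so no coercion is needed
  if (aTags[i].textContent === searchText) {
    found = aTags[i];
    break; // stop at the first match
  }
}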

// Use the found element

This method offers the following advantages: it works in all major browsers, including Internet Explorer 9 and later (IE 8 and earlier do not support textContent); the logic is clear and easy to follow; and it performs well on small to medium-sized DOM trees. Note that the textContent property is used here instead of innerText because it returns the raw text of the node tree regardless of CSS, whereas innerText depends on how the element is rendered.
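
The loop above can be wrapped in a reusable function. The following is a minimal sketch rather than part of the original code: the name findFirstByExactText and its parameters are hypothetical, and trimming surrounding whitespace before comparing is an added assumption about what should count as a match.

```javascript
// Hypothetical helper: return the first element under `root` with the given
// tag name whose trimmed textContent equals `searchText`, or null if none does.
function findFirstByExactText(root, tagName, searchText) {
  var candidates = root.getElementsByTagName(tagName);
  for (var i = 0; i < candidates.length; i++) {
    // Assumption: leading/trailing whitespace should not affect the match.
    if (candidates[i].textContent.trim() === searchText) {
      return candidates[i];
    }
  }
  return null;
}

// Usage in a browser:
// var link = findFirstByExactText(document, "a", "SearchingText");
// if (link) { link.focus(); }
```

Returning null instead of leaving a variable undefined makes the "no match" case explicit for the caller.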

Alternative Approaches Comparison

XPath Query Method

XPath provides a more declarative query approach, particularly suitable for complex document structures:

var xpath = "//a[text()='SearchingText']";
var matchingElement = document.evaluate(xpath, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;

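For partial matching scenarios, the contains function can be used:

var xpath = "//a[contains(text(),'Searching')]";

In XPath 1.0, contains(text(), ...) tests only the first text node child of each element. To test the full text of the element, including text nested in child elements, use contains(., 'Searching') instead.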

The advantage of XPath lies in its expressive query language. Its drawbacks are narrower browser support (document.evaluate is not available in Internet Explorer) and performance that can be worse than direct traversal in some scenarios.
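
The XPath strings above embed the search text directly, which breaks as soon as the text contains a quote character. The following is a minimal sketch of one way to handle this; the helper names xpathLiteral, buildContainsXPath, and findAllByXPath are hypothetical, and the choice of contains(., ...) over text() is an assumption that matching against the element's full text is what is wanted.

```javascript
// Hypothetical helper: quote an arbitrary string as an XPath 1.0 string literal.
// XPath 1.0 has no escape character, so text containing both quote kinds
// has to be assembled with concat().
function xpathLiteral(text) {
  if (text.indexOf("'") === -1) return "'" + text + "'";
  if (text.indexOf('"') === -1) return '"' + text + '"';
  var parts = text.split("'").map(function (part) {
    return "'" + part + "'";
  });
  return "concat(" + parts.join(", \"'\", ") + ")";
}

// Build a relative XPath matching `tagName` elements whose full string value
// (the text of all descendants) contains `text`.
function buildContainsXPath(tagName, text) {
  return ".//" + tagName + "[contains(., " + xpathLiteral(text) + ")]";
}

// Browser-only: collect every node matching `xpath` under `root` into an array.
function findAllByXPath(xpath, root) {
  var doc = root.ownerDocument || root; // `root` may be the document itself
  var snapshot = doc.evaluate(
    xpath, root, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null
  );
  var results = [];
  for (var i = 0; i < snapshot.snapshotLength; i++) {
    results.push(snapshot.snapshotItem(i));
  }
  return results;
}

// Usage in a browser:
// var links = findAllByXPath(buildContainsXPath("a", "Searching"), document);
```

A snapshot result is used here instead of FIRST_ORDERED_NODE_TYPE so that all matches are returned rather than only the first.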

Modern ES6 Syntax

With the widespread adoption of ECMAScript 6 (ES2015), the same search can be written more concisely using for...of iteration:

for (const a of document.querySelectorAll("a")) {
  if (a.textContent.includes("your search term")) {
    console.log(a.textContent)
  }
}

Alternatively, using array method chaining:

[...document.querySelectorAll("a")]
   .filter(a => a.textContent.includes("your search term"))
   .forEach(a => console.log(a.textContent))

This approach offers concise code but requires modern browser support. For older browsers, the syntax must be transpiled with a tool such as Babel, and methods such as String.prototype.includes additionally require polyfills.
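
The two snippets above only log their matches. As a minimal sketch of a reusable variant, the following function returns the matching elements as an array; the name findAllContainingText and its parameters are hypothetical, and ignoring letter case is an added assumption rather than something the snippets above do.

```javascript
// Hypothetical helper: return every element under `root` matching `selector`
// whose text contains `term`, ignoring letter case (an added assumption).
const findAllContainingText = (root, selector, term) => {
  const needle = term.toLowerCase();
  return Array.from(root.querySelectorAll(selector))
    .filter(el => el.textContent.toLowerCase().includes(needle));
};

// Usage in a browser:
// findAllContainingText(document, "nav a", "search")
//   .forEach(a => console.log(a.href));
```

Passing the root and selector as parameters lets the caller narrow the search to part of the page instead of scanning every link in the document.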

Difference Between textContent and innerText

When selecting text comparison properties, understanding the distinction between textContent and innerText is crucial. textContent retrieves the plain text content of an element and all its descendant nodes, ignoring any styling and layout information. In contrast, innerText considers the rendered style of the element, omits content of hidden elements, and accounts for text transformations influenced by CSS.

For example, given an HTML structure containing <br> elements and hidden text:

<span id="text">
Take a look at<br />
how this text<br />
is interpreted below.
</span>
<span style="display:none">HIDDEN TEXT</span>

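Read from an element that contains both spans, textContent returns the raw text of every descendant, including the hidden text and the line breaks and indentation present in the source markup; the <br> elements themselves contribute nothing. innerText returns approximately what is rendered: the hidden span is omitted and each <br> becomes a line break.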
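
Because textContent preserves whitespace from the source markup, an exact comparison against markup like the span above can fail even when the visible words match. The following is a minimal sketch of one way to compensate by collapsing whitespace before comparing; the helper names normalizeWhitespace and textMatches are hypothetical, and the sample string in the usage comment is an assumption about how the markup was written.

```javascript
// Hypothetical helper: collapse every run of whitespace (spaces, tabs,
// newlines) into a single space and trim both ends.
function normalizeWhitespace(text) {
  return text.replace(/\s+/g, " ").trim();
}

// Compare an element's textContent with the search text after normalizing both.
function textMatches(element, searchText) {
  return normalizeWhitespace(element.textContent) ===
    normalizeWhitespace(searchText);
}

// Usage in a browser, assuming the markup from the example above:
// var span = document.getElementById("text");
// textMatches(span, "Take a look at how this text is interpreted below.");
```

This keeps the comparison independent of CSS while tolerating differences in how the markup is indented or wrapped.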

Performance Optimization Recommendations

In practical applications, performance is a critical consideration:

For large documents, use more specific selectors to narrow the search scope
If text searches are performed frequently, consider building text indexes
In performance-sensitive scenarios, manual traversal is typically faster than XPath
Prefer === strict equality for text comparison: it avoids implicit type coercion, although when both operands are strings the speed difference from == is negligible
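
The text index mentioned above can be sketched as a Map from text to elements. This is a minimal illustration under stated assumptions: the name buildTextIndex is hypothetical, keys are trimmed textContent values, and the index is assumed to be rebuilt whenever the relevant part of the DOM changes.

```javascript
// Hypothetical helper: map each distinct trimmed text to the list of
// elements that carry it, so repeated lookups avoid rescanning the DOM.
function buildTextIndex(elements) {
  const index = new Map();
  for (const el of Array.from(elements)) {
    const key = el.textContent.trim();
    if (!index.has(key)) {
      index.set(key, []);
    }
    index.get(key).push(el);
  }
  return index;
}

// Usage in a browser: scope the scan with a specific selector, build the
// index once, then look up texts as often as needed.
// const index = buildTextIndex(document.querySelectorAll("#menu a"));
// const matches = index.get("SearchingText") || [];
```

An index like this only supports exact lookups and goes stale when elements are added, removed, or edited, so it pays off mainly when many searches run against a mostly static page.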

Conclusion

Several approaches exist for retrieving HTML elements by their text content, each suited to different scenarios. Manual DOM traversal offers the broadest compatibility and the most predictable behavior, which makes it suitable for most production environments. XPath provides powerful query capabilities that suit complex document structures. Modern ES6 syntax yields more concise code. Developers should choose among them based on the project's requirements for browser compatibility, performance, and code maintainability.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.