Free de bike 2022-05-14 14:40:53
Truncating a string is a common operation in development. In PHP it is easy to do: the mb_substr function handles it.
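As a minimal sketch of plain-text truncation, assuming the mbstring extension is loaded: the helper name truncatePlain and its parameters are hypothetical and only illustrate the mb_substr call mentioned above.

```php
<?php
// Hypothetical helper for plain (tag-free) text; not part of the article's class.
function truncatePlain(string $text, int $limit, string $ellipsis = '...'): string
{
    // mb_strlen/mb_substr count characters rather than bytes,
    // so multi-byte characters are never split in half.
    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }
    return mb_substr($text, 0, $limit, 'UTF-8') . $ellipsis;
}

echo truncatePlain('Hello, 世界! Nice to meet you.', 9), "\n"; // Hello, 世界...
```

Using substr here instead would count bytes and could cut a multi-byte character in the middle.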
But this only works for plain strings. If you need to truncate a rich-text string that contains HTML tags, you cannot simply use this function.
Most HTML tags come in pairs. A cut that falls between an opening tag and its closing tag leaves the element unclosed, and a cut that falls inside a tag corrupts the tag itself. Either way the markup breaks.
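To make the problem concrete, here is a small sketch of what happens when mb_substr is applied directly to markup. The sample HTML and the cut positions are made up for illustration.

```php
<?php
$html = '<p><a href="/about">About us</a> and more</p>';

// Cutting at 10 characters lands inside the <a ...> tag itself.
$cutInsideTag = mb_substr($html, 0, 10, 'UTF-8');

// Cutting at 25 characters lands between <a> and </a>, so neither
// the <a> nor the enclosing <p> is ever closed.
$cutInsidePair = mb_substr($html, 0, 25, 'UTF-8');

echo $cutInsideTag, "\n";  // <p><a href
echo $cutInsidePair, "\n"; // <p><a href="/about">About
```

Both results are invalid fragments, which is why the truncation has to work on the parsed node tree rather than on the raw string.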
To solve this, I use the DOMDocument class (which requires the libxml extension) to truncate HTML strings. The code is as follows:
<?php
class HtmlText
{
    /**
     * Call $callable on every descendant of $domNode, depth-first in document order.
     */
    private static function iterateDOMNodes(DOMNode $domNode, callable $callable)
    {
        foreach ($domNode->childNodes as $node) {
            $callable($node);
            if ($node->hasChildNodes()) {
                self::iterateDOMNodes($node, $callable);
            }
        }
    }

    /**
     * Truncate a string that contains HTML tags.
     *
     * @param string $string String containing HTML tags
     * @param int    $limit  Number of characters to keep
     * @param array  $option Options
     *  - ellipsis: string appended at the cut, default '...'
     *  - strip_attr: whether to drop tag attributes, default false
     *
     * @return string The truncated string
     *
     * @throws DOMException
     */
    public static function truncate(string $string, int $limit, array $option = []): string
    {
        $default = [
            'ellipsis' => '...',
            'strip_attr' => false,
        ];
        $option = $option + $default;

        $oriDoc = new DOMDocument();
        // Convert the string to HTML-ENTITIES so that Chinese text is not garbled.
        // (mbstring's 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2.)
        $convertedString = mb_convert_encoding($string, 'HTML-ENTITIES', 'UTF-8');
        @$oriDoc->loadHTML($convertedString, LIBXML_HTML_NODEFDTD);

        $newDoc = new DOMDocument();
        $newDocXPath = new DOMXPath($newDoc);

        self::iterateDOMNodes($oriDoc, function ($oriNode) use ($newDoc, $newDocXPath, $option, &$limit) {
            // Once the character budget is used up, skip all remaining nodes.
            if ($limit <= 0) {
                return;
            }
            switch ($oriNode->nodeType) {
                case XML_TEXT_NODE:
                    $oriNodeVal = $oriNode->nodeValue;
                    // Skip whitespace-only text nodes.
                    if (preg_match('/^[\s\xa0]+$/u', $oriNodeVal)) {
                        return;
                    }
                    $oriNodeVal = str_replace(["\r\n", "\n"], '', $oriNodeVal);
                    // Leading and trailing whitespace does not count toward the limit.
                    if (preg_match('/^([\s\xa0]*)(\S.*\S)([\s\xa0]*)$/u', $oriNodeVal, $matched)) {
                        $preBlank = $matched[1];
                        $strNeedCut = $matched[2];
                        $sufBlank = $matched[3];
                    } else {
                        $preBlank = '';
                        $strNeedCut = $oriNodeVal;
                        $sufBlank = '';
                    }
                    $strLength = mb_strlen($strNeedCut);
                    if ($strLength >= $limit) {
                        // This node uses up the budget: cut it and append the ellipsis.
                        $strNeedCut = mb_substr($strNeedCut, 0, $limit);
                        $strNeedCut = "{$strNeedCut}{$option['ellipsis']}";
                        $sufBlank = '';
                    }
                    $limit -= $strLength;
                    $tmp = "{$preBlank}{$strNeedCut}{$sufBlank}";
                    $newNode = $newDoc->createTextNode($tmp);
                    break;
                case XML_ELEMENT_NODE:
                    // Copy the element and, unless strip_attr is set, its attributes.
                    $newNode = $newDoc->createElement($oriNode->nodeName);
                    if (!$option['strip_attr'] && $oriNode->hasAttributes()) {
                        foreach ($oriNode->attributes as $attr) {
                            $newAttr = new DOMAttr($attr->nodeName, $attr->nodeValue);
                            $newNode->setAttributeNode($newAttr);
                        }
                    }
                    break;
                default:
                    return;
            }
            if (!$newDoc->hasChildNodes()) {
                $newDoc->appendChild($newNode);
            } else {
                // Find the copy of the parent in the new document by its XPath and append there.
                $oriParentNodePath = pathinfo($oriNode->getNodePath(), PATHINFO_DIRNAME);
                $parentNode = $newDocXPath->query($oriParentNodePath)->item(0);
                $parentNode->appendChild($newNode);
            }
        });

        $newString = html_entity_decode($newDoc->saveHTML());
        // Remove tags that the parser added automatically.
        $checkTags = ['html', 'body', 'head', 'p'];
        foreach ($checkTags as $tag) {
            if (stripos($string, "<$tag>") === false
                && stripos($newString, "<$tag>") !== false) {
                $newString = str_replace(["<$tag>", "</$tag>"], '', $newString);
            }
        }
        return $newString;
    }
}
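The parent lookup in truncate() relies on getNodePath() returning an XPath-style location for each node, so dropping the last path segment gives the location of the parent. Here is a minimal sketch of that idea on a made-up two-paragraph fragment. The sample markup and variable names are only for illustration.

```php
<?php
$doc = new DOMDocument();
@$doc->loadHTML('<div><p>one</p><p>two</p></div>', LIBXML_HTML_NODEFDTD);

$xpath = new DOMXPath($doc);
$secondP = $xpath->query('//p')->item(1); // the second <p>
$textNode = $secondP->firstChild;         // its text node "two"

$textPath = $textNode->getNodePath();
// Dropping the last path segment yields the parent's path, which is what
// pathinfo(..., PATHINFO_DIRNAME) does in truncate().
$parentPath = pathinfo($textPath, PATHINFO_DIRNAME);

echo $textPath, "\n";
echo $parentPath, "\n";

// The same path can also be read from the parent node directly.
var_dump($parentPath === $textNode->parentNode->getNodePath());
```

Because the new document is built in the same order as the original, querying it with the parent's path finds the copy of the parent. Reading `$oriNode->parentNode->getNodePath()` would presumably be an equivalent and slightly more direct way to obtain that path.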
Use case 1:
<?php
$text = <<<EOT
<div>
    <script src="jquery-2.1.1.min.js"></script>
    <p style="color: red;">
        <a href="#">床前明月光</a>
    </p>
    <p>疑是地上霜</p>
    <img src="jquery-2.1.1.min.js" alt=""/>
    <h2>举头望明月</h2>
</div>
EOT;
// Truncate to 8 characters
echo HtmlText::truncate($text, 8);
Output:
<div>
<script src="jquery-2.1.1.min.js"></script>
<p style="color: red;">
<a href="#">床前明月光</a>
</p>
<p>疑是地...</p>
</div>
Exactly 8 characters are kept (5 from the first line of the poem plus 3 from the second), no HTML tag is cut, and an ellipsis (...) is appended automatically at the end, as expected.
Use case 2:
$text = '这是一段没有标签的文本'; // "This is a piece of text without tags"
echo HtmlText::truncate($text, 8); // Output: 这是一段没有标签...
It also works on text that contains no HTML tags.
Use case 3:
$text = '这是一段只有一个<p>标签的文本'; // "This is a piece of text with only one <p> tag"
echo HtmlText::truncate($text, 8); // Output: <p>这是一段只有一个...</p>
Hmm… this case does not come out quite right: the <p> tag in the middle is lost, and a pair of <p> tags is added around the text instead. It is not a serious problem, though.
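One possible direction for this case, which is not part of the original code, is to wrap the fragment in a container element before parsing, so that bare text already has a parent, and then serialize only the container's children. The sketch below assumes that approach. The helper name parseFragmentInWrapper is hypothetical, and encoding handling is left out to keep it short.

```php
<?php
// Hypothetical helper, not part of HtmlText: parse a fragment inside a
// wrapper <div> and return the serialized children of that wrapper.
function parseFragmentInWrapper(string $fragment): string
{
    $doc = new DOMDocument();
    // Encoding handling is omitted here; the article's truncate() converts
    // the input to HTML entities before calling loadHTML().
    @$doc->loadHTML('<div>' . $fragment . '</div>', LIBXML_HTML_NODEFDTD);

    // The first <div> in document order is the wrapper added above.
    $wrapper = $doc->getElementsByTagName('div')->item(0);

    $html = '';
    foreach ($wrapper->childNodes as $child) {
        // saveHTML($node) serializes a single node and its subtree.
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

echo parseFragmentInWrapper('one <p>two</p>'), "\n";
```

Whether this fits into truncate() without side effects has not been tested here. It would also change the node paths used for the parent lookup, since every node gains the wrapper as an ancestor.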
In my tests, most cases produced correct results, which is enough for my needs. The remaining minor issues can be fixed later.
In addition, the open-source framework CakePHP also provides a utility for truncating HTML rich text: the truncate method of its Text class. See GitHub for details.
Copyright notice: this article was written by [Free de bike]. Please include a link to the original when reposting. Thank you. https://qdmana.com/2022/134/202205141436001109.html