Split text into several parts

Question. Votes: 0. Answers: 4.

I want to split a large text into 10 parts of roughly equal size. I am using this function:

<?php
function chunk($msg) {
    // Replace line breaks with spaces so "\n" can serve as the piece separator.
    $msg = preg_replace('/[\r\n]+/', ' ', $msg);
    // Maximum character length of each text piece.
    $chunks = wordwrap($msg, 10000, "\n");
    return explode("\n", $chunks);
}

// $t holds the large input text.
$parts = chunk($t);
foreach ($parts as $part) {
    echo $part . "<br/><br/><br/>";
}
?>

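But is it possible to define the length of each text piece in words rather than in characters? How can the text be split by word count in that case?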

php text split divide
4 Answers
1
vote

I suggest using explode (http://php.net/manual/en/function.explode.php) to split the string on spaces. You then have an array of words that you can iterate over to build the text parts.
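A minimal sketch of that idea, assuming words are separated by whitespace. The helper name `chunkByWords` and its parameters are illustrative and do not come from the question:

```php
<?php
// Hypothetical helper (name and parameters are illustrative, not from the question).
function chunkByWords(string $msg, int $wordsPerChunk): array
{
    // Collapse all whitespace runs (including line breaks) into single spaces.
    $msg = trim(preg_replace('/\s+/', ' ', $msg));
    if ($msg === '' || $wordsPerChunk < 1) {
        return [];
    }
    // One array element per word.
    $words = explode(' ', $msg);
    // Group the words, then glue each group back into a string.
    return array_map(
        fn(array $group): string => implode(' ', $group),
        array_chunk($words, $wordsPerChunk)
    );
}

print_r(chunkByWords("one two three four five six seven", 3));
```

To aim for about 10 parts, the words-per-chunk value could be derived from the total word count, for example by dividing it by 10 and rounding up.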


1
vote

From the documentation:

<?php
$text = "ABCDEFGHIJK.";
$newtext = wordwrap($text,3,"\n",true);
echo "$newtext\n";
?>

Output:

ABC
DEF
GHI
JK.


1
vote

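You can do something like this to split the text into equal parts.

The text in $str has 20 characters, so it is split into 10 parts of 2 characters each.

If your large text has 1000 characters, you get 10 equal parts of 100 characters each.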

<?php
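$div = 10; // split into 10 equal parts
$str = "abcdefghijklmnopqrst";
// Characters per part; cast to int (at least 1) because array_chunk() expects an integer length.
$size = max(1, (int) (strlen($str) / $div));
print_r(array_chunk(str_split($str), $size));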

Output:

Array
(
    [0] => Array
        (
            [0] => a
            [1] => b
        )

    [1] => Array
        (
            [0] => c
            [1] => d
        )

    [2] => Array
        (
            [0] => e
            [1] => f
        )

    [3] => Array
        (
            [0] => g
            [1] => h
        )

    [4] => Array
        (
            [0] => i
            [1] => j
        )

    [5] => Array
        (
            [0] => k
            [1] => l
        )

    [6] => Array
        (
            [0] => m
            [1] => n
        )

    [7] => Array
        (
            [0] => o
            [1] => p
        )

    [8] => Array
        (
            [0] => q
            [1] => r
        )

    [9] => Array
        (
            [0] => s
            [1] => t
        )

)
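
If strings are preferred over arrays of single characters, `str_split` can do the grouping directly through its second argument. The sketch below is a variation on the code above, not the original answer's code. It assumes single-byte text (for multibyte text, `mb_str_split`, available from PHP 7.4, takes the same arguments) and rounds the piece length up, so the result has at most `$div` pieces.

```php
<?php
$div = 10; // desired number of pieces (upper bound)
$str = "abcdefghijklmnopqrst";
// Characters per piece, rounded up; at least 1 so str_split() accepts it.
$size = max(1, (int) ceil(strlen($str) / $div));
$parts = str_split($str, $size);
print_r($parts);
```

When the length is not divisible by `$div`, rounding up yields fewer than `$div` pieces, with a shorter final piece.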

0
votes
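  • Find the offset of each "word" in the text,
  • count the words, then divide by 10 to determine how many words each group needs,
  • isolate the first offset of each group,
  • extract the piece of the original text between the first word offset of one group and the first word offset of the next group.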

Code: (Demo)

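// str_word_count() in format 2 keys each word by its starting offset in $text.
$offsets = array_keys(str_word_count($text, 2));
// Words per group, rounded down; at least 1 so array_chunk() accepts it.
$totalPerGroup = max(1, intdiv(count($offsets), 10));
$chunks = array_chunk($offsets, $totalPerGroup);
// First word offset of each group.
$starts = array_column($chunks, 0);
var_export(
    array_map(
        // Cut from this group's start to the next group's start, or to the end of the text (PHP 8+).
        fn($start, $end) => substr($text, $start, $end !== null ? $end - $start : null),
        $starts,
        [...array_slice($starts, 1), null]
    )
);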

Sample input:

$text = <<<TEXT
The answer was within her reach. It was hidden in a box and now that box sat directly in front of her. She'd spent years searching for it and could hardly believe she'd finally managed to find it. She turned the key to unlock the box and then gently lifted the top. She held her breath in anticipation of finally knowing the answer she had spent so much of her time in search of. As the lid came off she could see that the box was empty.
TEXT;

Output:

array (
  0 => 'The answer was within her reach. It was ',
  1 => 'hidden in a box and now that box ',
  2 => 'sat directly in front of her. She\'d spent ',
  3 => 'years searching for it and could hardly believe ',
  4 => 'she\'d finally managed to find it. She turned ',
  5 => 'the key to unlock the box and then ',
  6 => 'gently lifted the top. She held her breath ',
  7 => 'in anticipation of finally knowing the answer she ',
  8 => 'had spent so much of her time in ',
  9 => 'search of. As the lid came off she ',
  10 => 'could see that the box was empty.',
)

Of course, to remove the trailing whitespace, wrap the substr() call in rtrim().
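
Because `intdiv()` rounds down, the sample output above has 11 pieces rather than 10. A possible variation, sketched here on the assumption that at most 10 pieces are wanted, rounds the group size up with `ceil()` and trims each piece. The function name `splitIntoWordGroups` is hypothetical and not part of the answer.

```php
<?php
// Hypothetical helper: split $text into at most $parts pieces on word boundaries.
function splitIntoWordGroups(string $text, int $parts = 10): array
{
    // Byte offsets at which each word starts.
    $offsets = array_keys(str_word_count($text, 2));
    if ($offsets === [] || $parts < 1) {
        return [];
    }
    // Words per group, rounded up so the number of groups never exceeds $parts.
    $perGroup = (int) ceil(count($offsets) / $parts);
    // Starting offset of the first word in each group.
    $starts = array_column(array_chunk($offsets, $perGroup), 0);
    $pieces = [];
    foreach ($starts as $i => $start) {
        // Cut up to the next group's start, or to the end of the text for the last group.
        $end = $starts[$i + 1] ?? strlen($text);
        $pieces[] = rtrim(substr($text, $start, $end - $start));
    }
    return $pieces;
}

$sample = 'one two three four five six seven eight nine ten eleven twelve';
print_r(splitIntoWordGroups($sample, 5));
```

Rounding up trades one problem for another: the group count can fall below the requested number (12 words requested as 5 groups give 4 groups of 3), so which rounding to use depends on whether an upper or a lower bound on the number of pieces matters more.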
