PHPWord: problem reading .doc files


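I am trying to read a .doc file in PHP (I am using Laravel v6). To read the .doc I am using the PHPWord library. It works for .doc files written in English. But I am from Slovakia, and our alphabet contains characters such as á, é, í, č, ň and ô. Those characters are where I have a problem.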

My code:

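    // Class method; IOFactory and Text are presumably imported at the top of the file
    // (PhpOffice\PhpWord\IOFactory and PhpOffice\PhpWord\Element\Text).
    protected static function doc_to_text($filename)
    {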
        $objReader = IOFactory::createReader('MsDoc');
        $phpWord = $objReader->load($filename); // instance of \PhpOffice\PhpWord\PhpWord

        $text = '';

        foreach ($phpWord->getSections() as $section) {
            foreach ($section->getElements() as $element) {
                if ($element instanceof Text) {
                    $text .= $element->getText();
                }
            }
        }
        return $text;
    }

This is the function's output for a Slovak .doc file:

"}\x01IVOTOPIS Titul, menoKontaktné údaje:Ulica, stoTelefón: 0xx/ xxx xxx Mobil: 09xx xxx xxx e-mail: \x13 HYPERLINK "mailto:[email protected]" \[email protected]\x15 Dosiahnuté vzdelanie: Vysokoa\x01kolské/stredoa\x01kolskéVzdelanie: 2000-2006 Fakulta/ univerzita1995-2000 stredná aDoplH\x01ujúce informácie o vzdelaní: 1998-2000 kurzy 1996-1997 a\x01tudijné pobytyPracovné skúsenosti: 2000-2004 zamestnávate>\x01, pozícia 2004-2006 zamestnávate> pozícia Jazykové znalosti: Anglický jazyk - aktívne 8:|<U+0094><U+009E>¶ÔÖþ\x16\vB\vZ\v<U+0086>\v
<U+008A>\v®\vä\v\x10B\x10´\x10Ö\x10\x1E\x11F\x11H\x11J\x11üøòøìøüøäøäÞäøìüøìøüøüøìøüøüøìøüøüøìøÜìøìøìøØ\x06\x16h\e\t_\x03U\x08\x01\x16hÌeh0J\x11\x0F\x03j\x16hÌehU\x08\x01\x16hÌeh0J\x10\x16hO\x1FÝ0J\x10\x06\x16hÌeh\x06\x16hO\x1FÝ-\x08\x16\x08.\x08P\x08<U+0082>\x08°\x08Ú\x08R\t¶\tÎ\t:<U+0080>¢Ö&\vF\x04\x13¤d\x14¤d[$\x01\$\x01gdÌeh\x0F&\vF\x03\x13¤d\x14¤d[$\x01\$\x01gdÌeh\x0F&\vF\x02\x13¤d\x14¤d[$\x01\$\x01gdÌeh\x0F&\vF\x01\x13¤d\x14¤d[$\x01\$\x01gdÌeh\x04\x0FgdÌeh\x04\x03gdÌeh\x13PoPHP, C++ XHTML, CSS Microsoft Excel Microsoft Word Vodi preukaz: sk. C (najazdených cca 600 000km) Vlastnosti a záujmy: "

This is the function's output for an English .doc file:

"Lorem ipsum Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc ac faucibus odio. Vestibulum neque massa, scelerisque sit amet ligula eu, congue molestie mi. Praesent ut varius sem. Nullam at porttitor arcu, nec lacinia nisi. Ut ac dolor vitae odio interdum condimentum. Vivamus dapibus sodales ex, vitae malesuada ipsum cursus convallis. Maecenas sed egestas nulla, ac condimentum orci. Mauris diam felis, vulputate ac suscipit et, iaculis non est. Curabitur semper arcu ac ligula semper, nec luctus nisl blandit. Integer lacinia ante ac libero lobortis imperdiet. Nullam mollis convallis ipsum, ac accumsan nunc vehicula vitae. Nulla eget justo in felis tristique fringilla. Morbi sit amet tortor quis risus auctor condimentum. Morbi in ullamcorper elit. Nulla iaculis tellus sit amet mauris tempus fringilla.Maecenas mauris lectus, lobortis et purus mattis, blandit dictum tellus.Maecenas non lorem quis tellus placerat varius. Nulla facilisi. Aenean congue fringilla justo ut aliquam. In non mauris justo. Duis vehicula mi vel mi pretium, a viverra erat efficitur. Cras aliquam est ac eros varius, id iaculis dui auctor. Duis pretium neque ligula, et pulvinar mi placerat et. 

I have tried Google and asked my friends. I also tried https://github.com/neitanod/forceutf8. I need some ideas about what the problem might be or how to solve it.
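
One observation that may help narrow this down: the garbled sequences in the Slovak output (`}\x01`, `a\x01`, `H\x01`, `>\x01`) match the UTF-16LE bytes of Ž, š, ň and ľ. That would suggest the two bytes of such characters are being emitted separately instead of decoded as one code unit, while characters like á and é, whose high byte is zero, come through fine. This is only an inference from the output above, not a confirmed diagnosis of PHPWord. A minimal sketch to check the byte patterns, assuming the mbstring extension is available (Laravel requires it) and the file is saved as UTF-8; `utf16leBytesAsText()` is a hypothetical helper, not part of PHPWord:

```php
<?php
// Sketch only: shows which bytes selected Slovak letters occupy in UTF-16LE
// and how they look when each byte is printed as its own character.
// utf16leBytesAsText() is a hypothetical helper, not a PHPWord function.

function utf16leBytesAsText(string $utf8Char): string
{
    // Encode the character as UTF-16LE (low byte first, then high byte).
    $bytes = mb_convert_encoding($utf8Char, 'UTF-16LE', 'UTF-8');

    // Render each byte separately: printable ASCII as-is, anything else as \xNN.
    $out = '';
    foreach (str_split($bytes) as $byte) {
        $code = ord($byte);
        $out .= ($code >= 0x20 && $code < 0x7F) ? $byte : sprintf('\\x%02X', $code);
    }
    return $out;
}

foreach (['Ž', 'š', 'ň', 'ľ'] as $char) {
    echo $char, ' => ', utf16leBytesAsText($char), PHP_EOL;
}
```

Running this prints `}\x01`, `a\x01`, `H\x01` and `>\x01` for the four letters, the same fragments that appear in "}\x01IVOTOPIS" (ŽIVOTOPIS), "Vysokoa\x01kolské" (Vysokoškolské), "DoplH\x01ujúce" (Doplňujúce) and "zamestnávate>\x01" (zamestnávateľ).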

php laravel ms-word octobercms phpword