How to get the first 100 characters from the Wikipedia API

Question  Votes: 0  Answers: 3

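I want to retrieve the first 100 characters of text from a Wikipedia API query.

I have searched a lot on Google and Stack Overflow but found no answer. Everything I found returns the full text content, and I only need the first 100 characters.

Here is a working snippet of my code: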

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>


<div id="article"></div>

<script type="text/javascript">

    
    $(document).ready(function(){

	$.ajax({
	    type: "GET",
	    url: "http://en.wikipedia.org/w/api.php?action=parse&format=json&prop=text&section=0&page=Jimi_Hendrix&callback=?",
	    contentType: "application/json; charset=utf-8",
	    async: false,
	    dataType: "json",
	    success: function (data, textStatus, jqXHR) {
	    
		var markup = data.parse.text["*"];
		var i = $('<div></div>').html(markup);
		
		// remove links as they will not work
		i.find('a').each(function() { $(this).replaceWith($(this).html()); });
		
		// remove any references
		i.find('sup').remove();
		
		// remove cite error
		i.find('.mw-ext-cite-error').remove();
		
		$('#article').html($(i).find('p'));
			
		
	    },
	    error: function (errorMessage) {
	    }
	});    
    
    });
    
	
    
</script>
javascript jquery mediawiki wikipedia wikipedia-api
3 answers
4
votes

Have you tried using substring / slice?

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>


<div id="article"></div>

<script type="text/javascript">

    
    $(document).ready(function(){

	$.ajax({
	    type: "GET",
	    url: "http://en.wikipedia.org/w/api.php?action=parse&format=json&prop=text&section=0&page=Jimi_Hendrix&callback=?",
	    contentType: "application/json; charset=utf-8",
	    async: false,
	    dataType: "json",
	    success: function (data, textStatus, jqXHR) {
	    
		var markup = data.parse.text["*"];
		var i = $('<div></div>').html(markup);
		
		// remove links as they will not work
		i.find('a').each(function() { $(this).replaceWith($(this).html()); });
		
		// remove any references
		i.find('sup').remove();
		
		// remove cite error
		i.find('.mw-ext-cite-error').remove();
		
		// insert as plain text so characters such as < or & are not parsed as HTML
		$('#article').text(i.find('p').text().slice(0, 100));
			
		
	    },
	    error: function (errorMessage) {
	    }
	});    
    
    });
    
	
    
</script>
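
As a minimal sketch of the same idea outside the AJAX callback: the helper below (`firstChars` is a hypothetical name, not part of jQuery or the Wikipedia API) collapses whitespace and then keeps the first `n` characters. It assumes the input is already plain text, such as the result of `.text()`. Because `slice` on a string counts UTF-16 code units, the sketch splits with `Array.from` so that a character outside the Basic Multilingual Plane is not cut in half. The sample string is made up for illustration.

```javascript
// Hypothetical helper: return the first `n` characters of a plain-text string.
function firstChars(text, n) {
  // Collapse runs of whitespace (e.g. newlines between paragraphs) into single spaces.
  var normalized = text.replace(/\s+/g, ' ').trim();
  // Array.from splits by Unicode code point, so a surrogate pair stays intact.
  return Array.from(normalized).slice(0, n).join('');
}

// Made-up sample text with irregular whitespace.
var sample = 'James Marshall "Jimi" Hendrix\n was an   American guitarist.';
console.log(firstChars(sample, 20));
```

Inside the success callback this could be used as `$('#article').text(firstChars(i.find('p').text(), 100));`.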

3
votes

Your question is not really about Wikipedia, but you can use substring() to get the first n characters, e.g.

"one two three four".substring(0, 8)
-> "one two "

In your case, note that i is a jQuery object rather than a string, so extract the text first:

i.find('p').text().substring(0, 100)
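
For taking a prefix, `substring(0, n)` and `slice(0, n)` give the same result; they only differ in edge cases. The following sketch illustrates those differences on the example string above, using only standard `String.prototype` methods (the variable names are arbitrary):

```javascript
var s = "one two three four";

// Same result when taking a prefix.
var a = s.substring(0, 8); // "one two "
var b = s.slice(0, 8);     // "one two "

// Negative indexes: slice counts from the end, substring treats them as 0.
var c = s.slice(-4);       // "four"
var d = s.substring(-4);   // "one two three four"

// Reversed arguments: substring swaps them, slice returns an empty string.
var e = s.substring(8, 0); // "one two "
var f = s.slice(8, 0);     // ""

// Asking for more characters than exist is safe with both methods.
var g = "short".slice(0, 100); // "short"
```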

1
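votes

Since we only need 100 characters of text content from the wiki page, we can iterate over the paragraphs until we have collected at least 100 characters, and then use the slice method to keep the first 100.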

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>


<div id="article"></div>

<script type="text/javascript">
   
    
  $(document).ready(function(){
  // extract the first 100 characters of text from the Stack Overflow article
	$.ajax({
	    type: "GET",
	    url: "http://en.wikipedia.org/w/api.php?action=parse&format=json&prop=text&section=0&page=Stack_Overflow&callback=?", 
	    contentType: "application/json; charset=utf-8",
	    async: false,
	    dataType: "json",
	    success: function (data, textStatus, jqXHR) {
	    
		var markup = data.parse.text["*"];
		var i = $('<div></div>').html(markup);
		
		// remove links as they will not work
		i.find('a').each(function() { $(this).replaceWith($(this).html()); });
		
		// remove any references
		i.find('sup').remove();
		
		// remove cite error
		i.find('.mw-ext-cite-error').remove();
    
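		// all paragraphs in the lead section
		var paragraphs = i.find('p');

		// concatenate paragraph text until we have at least 100 characters
		// (the loop counter is j so that it does not overwrite the jQuery object i)
		var str = "";
		for (var j = 0; j < paragraphs.length; ++j) {
			str += paragraphs[j].textContent;
			// stop as soon as we have the required length
			if (str.length >= 100) break;
		}
		// insert as plain text
		$('#article').text(str.slice(0, 100));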
			
	   },
	    error: function (errorMessage) {
	  }
	});    
    
    });
    
</script>
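
The accumulation step in this answer can also be isolated from the DOM and tested on its own. This is a sketch under the assumption that the paragraph texts have already been extracted into an array of strings; `firstCharsFromParagraphs` is a hypothetical name introduced here for illustration.

```javascript
// Hypothetical helper: `paragraphs` is an array of plain-text strings,
// for example the textContent of each <p> element.
function firstCharsFromParagraphs(paragraphs, n) {
  var str = '';
  for (var k = 0; k < paragraphs.length; k++) {
    str += paragraphs[k];
    // Stop reading further paragraphs once enough text has been collected.
    if (str.length >= n) break;
  }
  // Trim any overshoot from the last paragraph that was appended.
  return str.slice(0, n);
}

console.log(firstCharsFromParagraphs(['abc', 'defgh', 'ijk'], 5));
```

With jQuery, the input array could be built as `i.find('p').map(function () { return $(this).text(); }).get()`. If the paragraphs contain fewer than `n` characters in total, the function simply returns everything it found.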