I'm fetching data from an API that returns XML like this:
<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>
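I'm new to deserialization, but I think the appropriate approach is to parse this XML into a Ruby object, so that I can reference something like objectFoo.seriess.series.frequency, which would return 'Quarterly'.
From my searching here and on Google, there doesn't seem to be an obvious solution for this in Ruby (NOT Rails), which makes me think I'm missing something fairly obvious. Any ideas?
Edit: I set up a test case based on Winfield's suggestion.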
class Exopenstruct
require 'ostruct'
def initialize()
hash = {"seriess"=>{"realtime_start"=>"2013-02-01", "realtime_end"=>"2013-02-01", "series"=>{"id"=>"GDPC1", "realtime_start"=>"2013-02-01", "realtime_end"=>"2013-02-01", "title"=>"Real Gross Domestic Product, 1 Decimal", "observation_start"=>"1947-01-01", "observation_end"=>"2012-10-01", "frequency"=>"Quarterly", "frequency_short"=>"Q", "units"=>"Billions of Chained 2005 Dollars", "units_short"=>"Bil. of Chn. 2005 $", "seasonal_adjustment"=>"Seasonally Adjusted Annual Rate", "seasonal_adjustment_short"=>"SAAR", "last_updated"=>"2013-01-30 07:46:54-06", "popularity"=>"93", "notes"=>"Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States.\n\nFor more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"}}}
object_instance = OpenStruct.new( hash )
end
end
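In irb I loaded the rb file and instantiated the class. However, when I try to access an attribute (e.g. instance.seriess), I get: NoMethodError: undefined method `seriess'
Apologies again if I'm missing something obvious.
You're probably best off using standard XML-to-Hash parsing, such as the one included with Rails: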
object_hash = Hash.from_xml(xml_string)
puts object_hash['seriess']
If you're not using the Rails stack, you can use a library like Nokogiri to achieve the same behavior.
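As a rough illustration of the non-Rails route, here is a minimal sketch that builds a nested Hash from element attributes. It swaps in REXML (which ships with standard Ruby installs) instead of Nokogiri so that it needs no extra gem; `element_to_hash` is a hypothetical helper name, and the XML is a shortened version of the sample from the question. It does not merge repeated child elements into arrays, so treat it as a starting point only.

```ruby
require 'rexml/document'

# Hypothetical helper: convert an element into a Hash of its attributes
# plus one entry per child element. Repeated child names overwrite each other.
def element_to_hash(element)
  hash = {}
  element.attributes.each { |name, value| hash[name] = value }
  element.elements.each { |child| hash[child.name] = element_to_hash(child) }
  # A leaf element with no attributes is represented by its text content.
  hash.empty? ? element.text : hash
end

xml_string = '<seriess realtime_start="2013-01-28">' \
             '<series id="GDPC1" frequency="Quarterly"/></seriess>'

root = REXML::Document.new(xml_string).root
object_hash = { root.name => element_to_hash(root) }
puts object_hash['seriess']['series']['frequency']
```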
Edit: if you're looking for object-like behavior, wrapping the hash in an OpenStruct is a good way to get it:
object_instance = OpenStruct.new( Hash.from_xml(xml_string) )
puts object_instance.seriess
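Note: for deeply nested data, you may also need to recursively convert the embedded hashes into OpenStruct instances. I.e., if an attribute above is itself a hash of values, it will remain a Hash rather than an OpenStruct.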
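The recursive conversion mentioned in the note could look something like the sketch below. `deep_ostruct` is a hypothetical helper name, and the hash is a trimmed version of the one in the question's test case. Note also that in that test case the OpenStruct is assigned to a local variable inside initialize, so the Exopenstruct instance itself never gains a seriess method; calling the methods on the OpenStruct directly, as here, avoids that.

```ruby
require 'ostruct'

# Hypothetical helper: recursively wrap nested hashes
# (including hashes inside arrays) in OpenStruct instances.
def deep_ostruct(value)
  case value
  when Hash
    OpenStruct.new(value.transform_values { |inner| deep_ostruct(inner) })
  when Array
    value.map { |inner| deep_ostruct(inner) }
  else
    value
  end
end

hash = {
  'seriess' => {
    'realtime_start' => '2013-01-28',
    'series' => { 'id' => 'GDPC1', 'frequency' => 'Quarterly' }
  }
}

object_instance = deep_ostruct(hash)
puts object_instance.seriess.series.frequency
```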
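I've just started using Damien Le Berrigaud's fork of HappyMapper and I'm really happy with it. You define simple Ruby classes and include HappyMapper. When you call parse, it uses Nokogiri to slurp in the XML and you get back a complete tree of bona fide Ruby objects.
I've used it to parse multi-megabyte XML files and found it fast and reliable. Take a look at the README.
One tip: since string encodings in XML files sometimes cause problems, you may need to sanitize the XML like this: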
def sanitize(xml)
xml.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end
before passing it to the #parse method, in order to avoid Nokogiri's "Input is not proper UTF-8, indicate encoding !" error.
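To make the effect of that sanitize method concrete, here is a small sketch run on a made-up input. Because the source encoding is declared as 'binary', every byte above 0x7F has no defined conversion and gets dropped, including valid multi-byte UTF-8 characters. If that is too aggressive, String#scrub (Ruby 2.1+) is one alternative that removes only the invalid bytes; `scrub_invalid` is a hypothetical helper name.

```ruby
# Same method as in the answer: treat the input as raw bytes and
# drop anything that is not plain ASCII.
def sanitize(xml)
  xml.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end

# Hypothetical alternative: keep valid UTF-8 and drop only invalid byte sequences.
def scrub_invalid(xml)
  xml.dup.force_encoding('UTF-8').scrub('')
end

# Made-up input: "café" in valid UTF-8, followed by a stray invalid byte (0xFF).
dirty = "<note>caf\xC3\xA9\xFF</note>".b

ascii_only = sanitize(dirty)      # the accented character is lost along with the bad byte
scrubbed   = scrub_invalid(dirty) # the accented character survives

puts ascii_only
puts scrubbed
```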
I went ahead and converted the OP's example to HappyMapper:
XML_STRING = '<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>'
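require 'date'
require 'happymapper' # provided by the nokogiri-happymapper gem
class Series; end # forward reference so Seriess can refer to Series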
class Seriess
include HappyMapper
tag 'seriess'
attribute :realtime_start, Date
attribute :realtime_end, Date
has_many :seriess, Series, :tag => 'series'
end
class Series
include HappyMapper
tag 'series'
attribute 'id', String
attribute 'realtime_start', Date
attribute 'realtime_end', Date
attribute 'title', String
attribute 'observation_start', Date
attribute 'observation_end', Date
attribute 'frequency', String
attribute 'frequency_short', String
attribute 'units', String
attribute 'units_short', String
attribute 'seasonal_adjustment', String
attribute 'seasonal_adjustment_short', String
attribute 'last_updated', DateTime
attribute 'popularity', Integer
attribute 'notes', String
end
def test
Seriess.parse(XML_STRING, :single => true)
end
And here's what you can do with it:
>> a = test
>> a.class
=> Seriess
>> a.seriess.first.frequency
=> "Quarterly"
>> a.seriess.first.observation_start
=> #<Date: 1947-01-01 ((2432187j,0s,0n),+0s,2299161j)>
>> a.seriess.first.popularity
=> 93
Nokogiri takes care of the parsing. How you handle the data is up to you; here I use OpenStruct as an example:
require 'nokogiri'
require 'ostruct'
require 'open-uri'
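# URI.open is needed because Kernel#open no longer opens URLs in Ruby 3.0+.
# Nokogiri::XML forces XML parsing; Nokogiri.parse treats an IO as HTML, which makes 'body' match the whole page.
doc = Nokogiri::XML(URI.open('http://www.w3schools.com/xml/note.xml'))
note = OpenStruct.new
note.to = doc.at('to').text
note.from = doc.at('from').text
note.heading = doc.at('heading').text
note.body = doc.at('body').text
note
=> #<OpenStruct to="Tove", from="Jani", heading="Reminder", body="Don't forget me this weekend!">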
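This is just a teaser; your problem may be many times larger in scale. It should at least give you a head start.
Edit: while digging around on Google and Stack Overflow, I came across a possible hybrid between my answer and @Winfield's, which uses Rails' Hash#from_xml: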
> require 'active_support/core_ext/hash/conversions'
> xml = Nokogiri::XML.parse(URI.open('http://www.w3schools.com/xml/note.xml'))
> Hash.from_xml(xml.to_s)
=> {"note"=>{"to"=>"Tove", "from"=>"Jani", "heading"=>"Reminder", "body"=>"Don't forget me this weekend!"}}
You can then use this hash to, for example, initialize a new ActiveRecord::Base model instance, or do whatever else you decide to do with it.
http://nokogiri.org/
http://ruby-doc.org/stdlib-1.9.3/libdoc/ostruct/rdoc/OpenStruct.html
https://stackoverflow.com/a/7488299/1740079
If you want to convert XML to a Hash, I've found the nori gem to be the simplest.
Example:
require 'nori'
xml = '<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>'
hash = Nori.new.parse(xml)
hash['seriess']
hash['seriess']['series']
puts hash['seriess']['series']['@frequency']
Note the '@' prefix on frequency: it is there because frequency is an attribute of 'series' rather than an element.
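If the '@' prefixes get in the way, for instance when you want plain keys before wrapping the hash in an object, one option is to strip them recursively. The sketch below is not part of nori: `strip_attribute_prefix` is a hypothetical helper, and the input is a hand-written hash shaped like the keys used above rather than real parser output. Be aware that an attribute and a child element with the same name would collide after stripping.

```ruby
# Hypothetical helper: recursively remove a leading '@' from hash keys.
def strip_attribute_prefix(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), result|
      result[key.to_s.delete_prefix('@')] = strip_attribute_prefix(inner)
    end
  when Array
    value.map { |inner| strip_attribute_prefix(inner) }
  else
    value
  end
end

# Hand-written sample mimicking the '@'-prefixed attribute keys (an assumption).
nori_style = {
  'seriess' => {
    'series' => { '@id' => 'GDPC1', '@frequency' => 'Quarterly' },
    '@realtime_start' => '2013-01-28'
  }
}

plain = strip_attribute_prefix(nori_style)
puts plain['seriess']['series']['frequency']
```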