Tuesday, October 27, 2009

Recipe 11.9. Creating and Modifying XML Documents

















Problem


You want to modify an XML document, or create a new one from scratch.




Solution


To create an XML document from scratch, just start with an empty Document object.



require 'rexml/document'
doc = REXML::Document.new
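
An empty Document writes no XML declaration. If you want one in the output, one option is to add it explicitly. The following is a minimal sketch, assuming the REXML::XMLDecl class that ships with REXML; the version, encoding, and element name are illustrative values:

```ruby
require 'rexml/document'

doc = REXML::Document.new
# Add an explicit declaration; the version and encoding are illustrative.
doc << REXML::XMLDecl.new('1.0', 'UTF-8')
doc.add_element('meeting')

xml = doc.to_s
# xml now begins with the declaration, followed by the <meeting/> element.
```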



To add a new element to an existing document, pass its name and any attributes into its parent's add_element method. You don't have to create the Element objects yourself.



meeting = doc.add_element 'meeting'
meeting_start = Time.local(2006, 10, 31, 13)
meeting.add_element('time', { 'from' => meeting_start,
                              'to' => meeting_start + 3600 })

doc.children[0] # => <meeting> … </>
doc.children[0].children[0]
# => "<time from='Tue Oct 31 13:00:00 EST 2006'
# to='Tue Oct 31 14:00:00 EST 2006'/>"

doc.write($stdout, 1)
# <meeting>
# <time from='Tue Oct 31 13:00:00 EST 2006'
# to='Tue Oct 31 14:00:00 EST 2006'/>
# </meeting>
doc.children[0] # => <?xml … ?>
doc.children[1] # => <meeting> … </>
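
Because positions in children can shift, as the last two lines show, it may be more robust to keep the value returned by add_element or to look elements up by name. A small sketch of both approaches, using plain string attribute values chosen for illustration:

```ruby
require 'rexml/document'

doc = REXML::Document.new
meeting = doc.add_element('meeting')

# add_element returns the new Element, so there is no need to dig it
# back out of doc.children afterwards.
time = meeting.add_element('time', { 'from' => '13:00', 'to' => '14:00' })

time.name                                   # => "time"
time.attributes['from']                     # => "13:00"

# The same element, found by name instead of by position.
doc.root.elements['time'].attributes['to']  # => "14:00"
```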



To append text to the contents of an element, use the add_text method. This code adds an <agenda> element to the <meeting> element, and adds text to it in two steps:



agenda = meeting.add_element 'agenda'
doc.children[1].children[1] # => <agenda/>

agenda.add_text "Nothing of importance will be decided."
agenda.add_text " The same tired ideas will be rehashed yet again."

doc.children[1].children[1] # => <agenda> … </>

doc.write($stdout, 1)
# <meeting>
# <time from='Tue Oct 31 13:00:00 EST 2006'
# to='Tue Oct 31 14:00:00 EST 2006'/>
# <agenda>
# Nothing of importance will be decided. The same tired ideas will be
# rehashed yet again.
# </agenda>
# </meeting>
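
A quick way to check what add_text did is to read the text back. This sketch, with made-up agenda text, joins the values of whatever text nodes the element ends up with, so it does not depend on how REXML groups the appended strings:

```ruby
require 'rexml/document'

doc = REXML::Document.new
agenda = doc.add_element('meeting').add_element('agenda')

agenda.add_text 'First point.'
agenda.add_text ' Second point.'

# Element#texts returns the element's text nodes; joining their values
# gives the full text regardless of how many nodes there are.
full_text = agenda.texts.map(&:value).join
full_text  # => "First point. Second point."
```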




Element#text= is a nice shortcut for giving an element a single text node. You can also use it to overwrite an element's first text node:



item1 = agenda.add_element 'item'
doc.children[1].children[1].children[1] # => <item/>
item1.text = 'Weekly status meetings: improving attendance'
doc.children[1].children[1].children[1] # => <item> … </>
doc.write($stdout, 1)
# <meeting>
# <time from='Tue Oct 31 13:00:00 EST 2006'
# to='Tue Oct 31 14:00:00 EST 2006'/>
# <agenda>
# Nothing of importance will be decided. The same tired ideas will be
# rehashed yet again.
# <item>Weekly status meetings: improving attendance</item>
# </agenda>
# </meeting>
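
The overwriting behavior is easiest to see on a standalone element. A minimal sketch, with placeholder text:

```ruby
require 'rexml/document'

item = REXML::Element.new('item')
item.text = 'Draft wording'
item.text = 'Final wording'   # replaces the text set on the previous line

item.text  # => "Final wording"
item.to_s  # => "<item>Final wording</item>"
```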





Discussion


If you can access an element or text node (numerically or with XPath), you can modify or delete it. You can modify an element's name with name=, and modify one of its attributes by assigning to an index of attributes. This code uses these methods to make major changes to a document:



doc = REXML::Document.new %{<?xml version='1.0'?>
<girl size="little">
<foods>
<sugar />
<spice />
</foods>
<set of="nice things" cardinality="all" />
</girl>
}

root = doc[1] # => <girl size='little'> … </>
root.name = 'boy'

root.elements['//sugar'].name = 'snails'
root.delete_element('//spice')

set = root.elements['//set']
set.attributes["of"] = "snips"
set.attributes["cardinality"] = 'some'

root.add_element('set', {'of' => 'puppy dog tails', 'cardinality' => 'some' })
doc.write
# <?xml version='1.0'?>
# <boy size='little'>
# <foods>
# <snails/>
#
# </foods>
# <set of='snips' cardinality='some'/>
# <set of='puppy dog tails' cardinality='some'/></boy>
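
When several nodes need the same change, the elements collection can iterate over every match of an XPath expression. A minimal sketch, using a cut-down document invented for illustration:

```ruby
require 'rexml/document'

doc = REXML::Document.new(
  "<girl><set of='sugar' cardinality='all'/><set of='spice' cardinality='all'/></girl>"
)

# Visit every <set> element and rewrite one attribute on each.
doc.elements.each('//set') do |set|
  set.attributes['cardinality'] = 'some'
end

values = doc.root.elements.to_a('set').map { |set| set.attributes['cardinality'] }
values  # => ["some", "some"]
```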



You can delete an attribute with Element#delete_attribute, or by assigning nil to it:



root.attributes['size'] = nil
doc.write($stdout, 0)
# <?xml version='1.0'?>
# <boy>
# <foods>
# …
# </boy>
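
For comparison, here is a short sketch that uses both approaches side by side. The age attribute is made up for the example:

```ruby
require 'rexml/document'

root = REXML::Document.new("<boy size='little' age='7'/>").root

root.delete_attribute('size')   # remove by name
root.attributes['age'] = nil    # equivalent: assign nil

root.attributes['size']  # => nil
root.to_s                # => "<boy/>"
```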



You can use methods like replace_with to swap out one node for another:



doc.elements["//snails"].replace_with(REXML::Element.new("escargot"))
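
The replacement node can be built up before it is swapped in. A small self-contained sketch; the text content is invented for illustration:

```ruby
require 'rexml/document'

doc = REXML::Document.new('<foods><snails/><spice/></foods>')

# Build the replacement first, then swap it in where <snails/> was.
escargot = REXML::Element.new('escargot')
escargot.text = 'with garlic butter'
doc.elements['//snails'].replace_with(escargot)

names = doc.root.elements.to_a.map(&:name)
names  # => ["escargot", "spice"]
```

The new element takes the old one's position among its siblings.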



All these methods are convenient, but add_element in particular is not very idiomatic. The cgi library lets you structure method calls and code blocks so that your Ruby code has the same nesting structure as the HTML it generates. Why shouldn't you be able to do the same for XML? Here's a new method for Element that makes it possible:



class REXML::Element
  def with_element(*args)
    e = add_element(*args)
    yield e if block_given?
  end
end



Now you can structure your Ruby code the same way you structure your XML:



doc = REXML::Document.new
doc.with_element('girl', {'size' => 'little'}) do |girl|
  girl.with_element('foods') do |foods|
    foods.add_element('sugar')
    foods.add_element('spice')
  end
  girl.add_element('set', {'of' => 'nice things', 'cardinality' => 'all'})
end

doc.write($stdout, 0)
# <girl size='little'>
# <foods>
# <sugar/>
# <spice/>
# </foods>
# <set of='nice things' cardinality='all'/>
# </girl>
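
One possible variation, not part of the recipe's method, is to have the helper return the new element, so the caller can keep a reference to it whether or not a block is given. The method name build_element below is hypothetical, chosen so that it does not clash with with_element:

```ruby
require 'rexml/document'

class REXML::Element
  # Hypothetical variant of with_element that returns the element it adds.
  def build_element(*args)
    element = add_element(*args)
    yield element if block_given?
    element
  end
end

doc = REXML::Document.new
girl = doc.build_element('girl', { 'size' => 'little' }) do |g|
  g.build_element('foods') do |foods|
    foods.add_element('sugar')
    foods.add_element('spice')
  end
end

girl.name                             # => "girl"
girl.elements['foods'].elements.size  # => 2
```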



The builder gem also lets you build XML this way.




See Also


  • Recipe 7.10, "Hiding Setup and Cleanup in a Block Method," has an example of using the XmlMarkup class in the builder gem.












