/** @mainpage

<h1> TinyXml </h1>

TinyXml is a simple, small, C++ XML parser that can be easily
integrated into other programs.

<h2> What it does. </h2>

In brief, TinyXml parses an XML document, and builds from that a
Document Object Model (DOM) that can be read, modified, and saved.

XML stands for "eXtensible Markup Language." It allows you to create
your own document markups. Where HTML does a very good job of marking
documents for browsers, XML allows you to define any kind of document
markup, for example a document that describes a "to do" list for an
organizer application. XML is a very structured and convenient format.
All those random file formats created to store application data can
all be replaced with XML. One parser for everything.

The best place for the complete, correct, and quite frankly hard to
read spec is at <a href="http://www.w3.org/TR/2004/REC-xml-20040204/">
http://www.w3.org/TR/2004/REC-xml-20040204/</a>. An intro to XML
(that I really like) can be found at
<a href="http://skew.org/xml/tutorial/">http://skew.org/xml/tutorial</a>.

There are different ways to access and interact with XML data.
TinyXml uses a Document Object Model (DOM), meaning the XML data is parsed
into C++ objects that can be browsed and manipulated, and then
written to disk or another output stream. You can also construct an XML document from
scratch with C++ objects and write this to disk or another output
stream.

TinyXml is designed to be easy and fast to learn. It is two headers
and four cpp files. Simply add these to your project and off you go.
There is an example file - xmltest.cpp - to get you started.

TinyXml is released under the ZLib license,
so you can use it in open source or commercial code. The details
of the license are at the top of every source file.

TinyXml attempts to be a flexible parser, but with truly correct and
compliant XML output. TinyXml should compile on any reasonably C++
compliant system. It does not rely on exceptions or RTTI. It can be
compiled with or without STL support. TinyXml fully supports
the UTF-8 encoding, and the first 64k character entities.


<h2> What it doesn't do. </h2>

It doesn't parse or use DTDs (Document Type Definitions) or XSL
(eXtensible Stylesheet Language). There are other parsers out there
(check out www.sourceforge.net, search for XML) that are much more fully
featured. But they are also much bigger, take longer to set up in
your project, have a steeper learning curve, and often have a more
restrictive license. If you are working with browsers or have more
complete XML needs, TinyXml is not the parser for you.

The following DTD syntax will not parse at this time in TinyXml:

@verbatim
	<!DOCTYPE Archiv [
	 <!ELEMENT Comment (#PCDATA)>
	]>
@endverbatim

because TinyXml sees this as a !DOCTYPE node with an illegally
embedded !ELEMENT node. This may be addressed in the future.

<h2> Tutorials. </h2>

For the impatient, here is a tutorial to get you going. It is a great way to get started,
but it is worth your time to read this (very short) manual completely.

- @subpage tutorial0

<h2> Code Status.  </h2>

TinyXml is mature, tested code. It is very stable. If you find
bugs, please file a bug report on the sourceforge web site
(www.sourceforge.net/projects/tinyxml).
We'll get them straightened out as soon as possible.

There are some areas of improvement; please check sourceforge if you are
interested in working on TinyXml.


<h2> Features </h2>

<h3> Using STL </h3>

TinyXml can be compiled to use or not use STL. When using STL, TinyXml
uses the std::string class, and fully supports std::istream, std::ostream,
operator<<, and operator>>. Many API methods have both 'const char*' and
'const std::string&' forms.

When STL support is compiled out, no STL files are included whatsoever. All
the string classes are implemented by TinyXml itself. API methods
all use the 'const char*' form for input.

Use the compile time #define:

	TIXML_USE_STL

to compile one version or the other. This can be passed by the compiler,
or set as the first line of "tinyxml.h".

Note: If compiling the test code in Linux, setting the environment
variable TINYXML_USE_STL=YES/NO will control STL compilation. In the
Windows project file, STL and non STL targets are provided. In your project,
it's probably easiest to add the line "#define TIXML_USE_STL" as the first
line of tinyxml.h.

<h3> UTF-8 </h3>

TinyXml supports UTF-8, allowing you to manipulate XML files in any language. TinyXml
also supports "legacy mode" - the encoding used before UTF-8 support and
probably best described as "extended ascii".

Normally, TinyXml will try to detect the correct encoding and use it. However,
by setting the value of TIXML_DEFAULT_ENCODING in the header file, TinyXml
can be forced to always use one encoding.

TinyXml will assume Legacy Mode until one of the following occurs:
<ol>
	<li> If the non-standard but common "UTF-8 lead bytes" (0xef 0xbb 0xbf)
		 begin the file or data stream, TinyXml will read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has an encoding="UTF-8", then
		 TinyXml will read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has no encoding specified, then
		 TinyXml will read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has an encoding="something else", then
		 TinyXml will read it as Legacy Mode. In legacy mode, TinyXml will
		 work as it did before. It's not clear what that mode does exactly, but
		 old content should keep working.</li>
	<li> Until one of the above criteria is met, TinyXml runs in Legacy Mode.</li>
</ol>
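
The rules in the list above can be sketched as a small standalone function.
This is only an illustration and not TinyXml's actual implementation: the names
Encoding, hasUtf8LeadBytes, and detectEncoding are hypothetical, the declaration
is assumed to have been parsed already, and the exact-match test on "UTF-8" is
a simplifying assumption.

```cpp
#include <string>

// Hypothetical names for illustration only. TinyXml's own constants are
// TIXML_ENCODING_UTF8 and TIXML_ENCODING_LEGACY.
enum class Encoding { Legacy, Utf8 };

// Returns true if the data starts with the UTF-8 lead bytes 0xef 0xbb 0xbf.
bool hasUtf8LeadBytes(const std::string& data) {
    return data.size() >= 3 &&
           static_cast<unsigned char>(data[0]) == 0xef &&
           static_cast<unsigned char>(data[1]) == 0xbb &&
           static_cast<unsigned char>(data[2]) == 0xbf;
}

// Applies the listed rules in order.
//   declSeen     - true if a declaration tag has been read
//   declEncoding - value of its encoding attribute, empty if none was given
Encoding detectEncoding(const std::string& data,
                        bool declSeen,
                        const std::string& declEncoding) {
    if (hasUtf8LeadBytes(data)) return Encoding::Utf8;        // rule 1
    if (declSeen) {
        if (declEncoding == "UTF-8") return Encoding::Utf8;   // rule 2
        if (declEncoding.empty()) return Encoding::Utf8;      // rule 3
        return Encoding::Legacy;                              // rule 4
    }
    return Encoding::Legacy;                                  // rule 5
}
```

In this sketch, data with no lead bytes and no declaration stays in Legacy
Mode, while a declaration without an encoding attribute switches to UTF-8.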

What happens if the encoding is incorrectly set or detected? TinyXml will try
to read and pass through text seen as improperly encoded. You may get some strange
results or mangled characters. You may want to force TinyXml to the correct mode.

<b> You may force TinyXml to Legacy Mode by using LoadFile( TIXML_ENCODING_LEGACY ) or
LoadFile( filename, TIXML_ENCODING_LEGACY ). You may force it to use legacy mode all
the time by setting TIXML_DEFAULT_ENCODING = TIXML_ENCODING_LEGACY. Likewise, you may
force it to TIXML_ENCODING_UTF8 with the same technique.</b>

For English users, using English XML, UTF-8 is the same as low-ASCII. You
don't need to be aware of UTF-8 or change your code in any way. You can think
of UTF-8 as a "superset" of ASCII.

UTF-8 is not a double byte format - but it is a standard encoding of Unicode!
TinyXml does not use or directly support wchar, TCHAR, or Microsoft's _UNICODE at this time.
It is common to see the term "Unicode" improperly refer to UTF-16, a wide byte encoding
of Unicode. This is a source of confusion.

For "high-ascii" languages - everything not English, pretty much - TinyXml can
handle all languages, at the same time, as long as the XML is encoded
in UTF-8. That can be a little tricky: older programs and operating systems
tend to use the "default" or "traditional" code page. Many apps (and almost all
modern ones) can output UTF-8, but older or stubborn (or just broken) ones
still output text in the default code page.

For example, Japanese systems traditionally use SHIFT-JIS encoding.
Text encoded as SHIFT-JIS cannot be read by TinyXml.
A good text editor can import SHIFT-JIS and then save as UTF-8.

The <a href="http://skew.org/xml/tutorial/">Skew.org link</a> does a great
job covering the encoding issue.

The test file "utf8test.xml" is an XML file containing English, Spanish, Russian,
and Simplified Chinese. (Hopefully they are translated correctly). The file
"utf8test.gif" is a screen capture of the XML file, rendered in IE. Note that
if you don't have the correct fonts (Simplified Chinese or Russian) on your
system, you won't see output that matches the GIF file even if you can parse
it correctly. Also note that (at least on my Windows machine) console output
is in a Western code page, so that Print() or printf() cannot correctly display
the file. This is not a bug in TinyXml - just an OS issue. No data is lost or
destroyed by TinyXml. The console just doesn't render UTF-8.


<h3> Entities </h3>
TinyXml recognizes the pre-defined "character entities", meaning special
characters. Namely:

@verbatim
	&amp;	&
	&lt;	<
	&gt;	>
	&quot;	"
	&apos;	'
@endverbatim

These are recognized when the XML document is read, and translated to their
UTF-8 equivalents. For instance, text with the XML of:

@verbatim
	Far &amp; Away
@endverbatim

will have the Value() of "Far & Away" when queried from the TiXmlText object,
and will be written back to the XML stream/file as an ampersand. Older versions
of TinyXml "preserved" character entities, but the newer versions will translate
them into characters.
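
The translation described above can be sketched in standalone C++. This is an
illustration of the idea, not TinyXml's code: decodeEntities and encodeEntities
are hypothetical names, and the choice to copy unknown entities through
unchanged is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>

// Replaces the five pre-defined entities with their characters.
// Anything else starting with '&' is copied through unchanged.
std::string decodeEntities(const std::string& in) {
    static const struct { const char* name; char ch; } table[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'},
        {"&quot;", '"'}, {"&apos;", '\''}
    };
    std::string out;
    for (std::size_t i = 0; i < in.size(); ) {
        bool matched = false;
        if (in[i] == '&') {
            for (const auto& entry : table) {
                const std::string name(entry.name);
                if (in.compare(i, name.size(), name) == 0) {
                    out += entry.ch;      // emit the character
                    i += name.size();     // skip the whole entity
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += in[i++];
    }
    return out;
}

// Escapes the same five characters again, as a writer would need to do
// to keep the output well formed.
std::string encodeEntities(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    return out;
}
```

Decoding the text from the example above yields "Far & Away", and encoding
that result produces the original entity form again.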

Additionally, any character can be specified by its Unicode code point:
both "&#xA0;" and "&#160;" refer to the non-breaking space character.
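
As a rough sketch of what resolving such a reference involves, the code below
parses the decimal or hexadecimal form and encodes the resulting code point as
UTF-8. The function names are hypothetical and this is not TinyXml's
implementation; it assumes the code point is at most 0x10FFFF and does not
check for overflow.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Encodes one Unicode code point as UTF-8 (1 to 4 bytes).
// Assumes cp <= 0x10FFFF.
std::string encodeUtf8(unsigned long cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Parses a reference written as "&#160;" (decimal) or "&#xA0;" (hex).
// Returns false if the text is not a well-formed numeric reference.
bool parseNumericReference(const std::string& ref, unsigned long& codePoint) {
    if (ref.size() < 4 || ref[0] != '&' || ref[1] != '#' || ref.back() != ';')
        return false;
    const bool hex = (ref[2] == 'x');
    const std::string digits =
        ref.substr(hex ? 3 : 2, ref.size() - (hex ? 4 : 3));
    if (digits.empty()) return false;
    for (char c : digits) {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (!(hex ? std::isxdigit(uc) : std::isdigit(uc))) return false;
    }
    codePoint = std::strtoul(digits.c_str(), nullptr, hex ? 16 : 10);
    return true;
}
```

Both spellings of the non-breaking space parse to code point 0xA0, which the
sketch encodes as the two bytes 0xC2 0xA0.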


<h3> Streams </h3>
With TIXML_USE_STL on,
TinyXml supports both C (FILE) and C++ (operator <<,>>)
streams. There are some differences that you may need to be aware of.

C style output:
	- based on FILE*
	- the Print() and SaveFile() methods

	These methods generate formatted output, with plenty of white space, intended to be as
	human-readable as possible. They are very fast, and tolerant of ill formed
	XML documents. For example, an XML document that contains 2 root elements
	and 2 declarations, will still print.

C style input:
	- based on FILE*
	- the Parse() and LoadFile() methods

	A fast, tolerant read. Use whenever you don't need the C++ streams.

C++ style output:
	- based on std::ostream
	- operator<<

	Generates condensed output, intended for network transmission rather than
	readability. Depending on your system's implementation of the ostream class,
	this may be somewhat slower. (Or may not.) Not tolerant of ill formed XML:
	a document should contain exactly one root element. Additional root level
	elements will not be streamed out.

C++ style input:
	- based on std::istream
	- operator>>

	Reads XML from a stream, making it useful for network transmission. The tricky
	part is knowing when the XML document is complete, since there will almost
	certainly be other data in the stream. TinyXml will assume the XML data is
	complete after it reads the root element. Put another way, documents that
	are ill-constructed with more than one root element will not read correctly.
	Also note that operator>> is somewhat slower than Parse, due to both
	implementation of the STL and limitations of TinyXml.

<h3> White space </h3>
The world simply does not agree on whether white space should be kept, or condensed.
For example, pretend the '_' is a space, and look at "Hello____world". HTML, and
at least some XML parsers, will interpret this as "Hello_world". They condense white
space. Some XML parsers do not, and will leave it as "Hello____world". (Remember
to keep pretending the _ is a space.) Others suggest that __Hello___world__ should become
Hello___world.

It's an issue that hasn't been resolved to my satisfaction. TinyXml supports the
first 2 approaches. Call TiXmlBase::SetCondenseWhiteSpace( bool ) to set the desired behavior.
The default is to condense white space.

If you change the default, you should call TiXmlBase::SetCondenseWhiteSpace( bool )
before making any calls to parse XML data, and I don't recommend changing it after
it has been set.
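
To make the first approach concrete, here is a minimal standalone sketch of
condensing. The function name condenseWhiteSpace is hypothetical and this is
not TinyXml's code. In particular, reducing leading and trailing runs to a
single space (rather than removing them) is an assumption of the sketch.

```cpp
#include <cctype>
#include <string>

// Collapses every run of white space (spaces, tabs, newlines) into one space.
std::string condenseWhiteSpace(const std::string& in) {
    std::string out;
    bool inRun = false;   // true while we are inside a run of white space
    for (char c : in) {
        if (std::isspace(static_cast<unsigned char>(c))) {
            if (!inRun) out += ' ';   // keep only the first of the run
            inRun = true;
        } else {
            out += c;
            inRun = false;
        }
    }
    return out;
}
```

With real spaces in place of the underscores, "Hello____world" becomes
"Hello_world" under this sketch. The second approach is simply to return the
input unchanged.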


<h3> Handles </h3>

When browsing an XML document in a robust way, it is important to check
for null returns from method calls. An error safe implementation can
generate a lot of code like:

@verbatim
TiXmlElement* root = document.FirstChildElement( "Document" );
if ( root )
{
	TiXmlElement* element = root->FirstChildElement( "Element" );
	if ( element )
	{
		TiXmlElement* child = element->FirstChildElement( "Child" );
		if ( child )
		{
			TiXmlElement* child2 = child->NextSiblingElement( "Child" );
			if ( child2 )
			{
				// Finally do something useful.
			}
		}
	}
}
@endverbatim

Handles have been introduced to clean this up. Using the TiXmlHandle class,
the previous code reduces to:

@verbatim
TiXmlHandle docHandle( &document );
TiXmlElement* child2 = docHandle.FirstChild( "Document" ).FirstChild( "Element" ).Child( "Child", 1 ).Element();
if ( child2 )
{
	// do something useful
}
@endverbatim

Which is much easier to deal with. See TiXmlHandle for more information.
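
The idea behind a handle can be shown with a small standalone sketch. The Node
and Handle types below are hypothetical stand-ins and are not the TinyXml
classes; they only illustrate how wrapping a possibly-null pointer lets a chain
of lookups run without intermediate checks.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an element node; not a TinyXml class.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Wraps a possibly-null node pointer. Every lookup returns another Handle,
// so a failed step simply carries null through to the end of the chain.
class Handle {
public:
    explicit Handle(const Node* node) : node_(node) {}

    // First child with the given name, or a null handle.
    Handle FirstChild(const std::string& name) const { return Child(name, 0); }

    // The index-th child (zero-based) with the given name, or a null handle.
    Handle Child(const std::string& name, std::size_t index) const {
        if (node_) {
            for (const Node& child : node_->children) {
                if (child.name == name) {
                    if (index == 0) return Handle(&child);
                    --index;
                }
            }
        }
        return Handle(nullptr);
    }

    // The wrapped pointer; null if any step in the chain failed.
    const Node* Get() const { return node_; }

private:
    const Node* node_;
};
```

Only the final pointer needs a null check: if "Document", "Element", or the
second "Child" is missing, Get() returns null and nothing is dereferenced along
the way.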


<h3> Row and Column tracking </h3>
Being able to track nodes and attributes back to their origin location
in source files can be very important for some applications. Additionally,
knowing where parsing errors occurred in the original source can be very
time saving.

TinyXml tracks the row and column origin of all nodes and attributes
in a text file. The TiXmlBase::Row() and TiXmlBase::Column() methods return
the origin of the node in the source text. The tab size used for column
counting can be configured in TiXmlDocument::SetTabSize().
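
The sketch below shows one way a row and column can be derived from a byte
offset, and why the tab size matters. It is a standalone illustration with
hypothetical names (Position, locate), not TinyXml's implementation. It assumes
1-based results, a tab size greater than zero, tab stops at multiples of the
tab size, and it counts bytes rather than characters.

```cpp
#include <cstddef>
#include <string>

struct Position {
    int row;   // 1-based in this sketch
    int col;   // 1-based in this sketch
};

// Computes the row and column of text[offset].
// A newline starts a new row, a carriage return is ignored, and a tab
// advances the column to the next multiple of tabSize.
Position locate(const std::string& text, std::size_t offset, int tabSize) {
    int row = 0;
    int col = 0;   // both zero-based while counting
    for (std::size_t i = 0; i < offset && i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\n') {
            ++row;
            col = 0;
        } else if (c == '\r') {
            // not counted, so "\r\n" behaves like a single line break
        } else if (c == '\t') {
            col = (col / tabSize + 1) * tabSize;
        } else {
            ++col;
        }
    }
    return Position{row + 1, col + 1};
}
```

For the text "ab", a tab, then "c", a tab size of 4 places "c" in column 5,
while a tab size of 8 would place it in column 9.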


<h2> Using and Installing </h2>

To Compile and Run xmltest:

A Linux Makefile and a Windows Visual C++ .dsw file are provided.
Simply compile and run. It will write the file demotest.xml to your
disk and generate output on the screen. It also tests walking the
DOM by printing out the number of nodes found using different
techniques.

The Linux makefile is very generic and will
probably run on other systems, but is only tested on Linux. You no
longer need to run 'make depend'. The dependencies have been
hard coded.

<h3>Windows project file for VC6</h3>
<ul>
<li>tinyxml:		tinyxml library, non-STL </li>
<li>tinyxmlSTL:		tinyxml library, STL </li>
<li>tinyXmlTest:	test app, non-STL </li>
<li>tinyXmlTestSTL: test app, STL </li>
</ul>

<h3>Linux Make file</h3>
At the top of the makefile you can set:

PROFILE, DEBUG, and TINYXML_USE_STL. Details (such as they are) are in
the makefile.

In the tinyxml directory, type "make clean" then "make". The executable
file 'xmltest' will be created.



<h3>To Use in an Application:</h3>

Add tinyxml.cpp, tinyxml.h, tinyxmlerror.cpp, tinyxmlparser.cpp, tinystr.cpp, and tinystr.h to your
project or make file. That's it! It should compile on any reasonably
compliant C++ system. You do not need to enable exceptions or
RTTI for TinyXml.


<h2> How TinyXml works.  </h2>

An example is probably the best way to go. Take:
@verbatim
	<?xml version="1.0" standalone="no" ?>
	<!-- Our to do list data -->
	<ToDo>
		<Item priority="1"> Go to the <bold>Toy store!</bold></Item>
		<Item priority="2"> Do bills</Item>
	</ToDo>
@endverbatim

It's not much of a To Do list, but it will do. To read this file
(say "demo.xml") you would create a document, and parse it in:
@verbatim
	TiXmlDocument doc( "demo.xml" );
	doc.LoadFile();
@endverbatim

And it's ready to go. Now let's look at some lines and how they
relate to the DOM.

@verbatim
<?xml version="1.0" standalone="no" ?>
@endverbatim

	The first line is a declaration, and gets turned into the
	TiXmlDeclaration class. It will be the first child of the
	document node.

	This is the only directive/special tag parsed by TinyXml.
	Generally directive tags are stored in TiXmlUnknown so the
	commands won't be lost when it is saved back to disk.

@verbatim
<!-- Our to do list data -->
@endverbatim

	A comment. Will become a TiXmlComment object.

@verbatim
<ToDo>
@endverbatim

	The "ToDo" tag defines a TiXmlElement object. This one does not have
	any attributes, but does contain 2 other elements.

@verbatim
<Item priority="1">
@endverbatim

	Creates another TiXmlElement which is a child of the "ToDo" element.
	This element has 1 attribute, with the name "priority" and the value
	"1".

@verbatim
Go to the
@endverbatim

	A TiXmlText. This is a leaf node and cannot contain other nodes.
	It is a child of the "Item" TiXmlElement.

@verbatim
<bold>
@endverbatim

	Another TiXmlElement, this one a child of the "Item" element.

Etc.

Looking at the entire object tree, you end up with:
@verbatim
TiXmlDocument				"demo.xml"
	TiXmlDeclaration		"version='1.0'" "standalone=no"
	TiXmlComment			" Our to do list data"
	TiXmlElement			"ToDo"
		TiXmlElement		"Item"		Attributes: priority = 1
			TiXmlText		"Go to the "
			TiXmlElement	"bold"
				TiXmlText	"Toy store!"
		TiXmlElement		"Item"		Attributes: priority = 2
			TiXmlText		"Do bills"
@endverbatim
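
The same tree can be modelled with a miniature node type to show what "walking
the DOM" means. This is a standalone sketch with hypothetical names (Kind,
DomNode, buildToDoTree, countNodes, countKind); TinyXml's real classes are the
TiXml types listed above, and attributes are left out here for brevity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature node type for illustration only.
enum class Kind { Document, Declaration, Comment, Element, Text };

struct DomNode {
    Kind kind;
    std::string value;
    std::vector<DomNode> children;
};

// Builds the object tree shown above for the "to do" document.
DomNode buildToDoTree() {
    DomNode item1{Kind::Element, "Item", {
        DomNode{Kind::Text, "Go to the ", {}},
        DomNode{Kind::Element, "bold", { DomNode{Kind::Text, "Toy store!", {}} }}
    }};
    DomNode item2{Kind::Element, "Item", { DomNode{Kind::Text, "Do bills", {}} }};
    DomNode todo{Kind::Element, "ToDo", { item1, item2 }};
    return DomNode{Kind::Document, "demo.xml", {
        DomNode{Kind::Declaration, "", {}},
        DomNode{Kind::Comment, " Our to do list data ", {}},
        todo
    }};
}

// Counts a node and all of its descendants with a depth-first walk.
std::size_t countNodes(const DomNode& node) {
    std::size_t total = 1;
    for (const DomNode& child : node.children) total += countNodes(child);
    return total;
}

// Counts only the nodes of one kind in the subtree.
std::size_t countKind(const DomNode& node, Kind kind) {
    std::size_t total = (node.kind == kind) ? 1 : 0;
    for (const DomNode& child : node.children) total += countKind(child, kind);
    return total;
}
```

Walking this tree finds 10 nodes in total, counting the document itself: 4
elements, 3 text nodes, 1 declaration, and 1 comment.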

<h2> Documentation </h2>

The documentation is built with Doxygen, using the 'dox'
configuration file.

<h2> License </h2>

TinyXml is released under the zlib license:

This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any
damages arising from the use of this software.

Permission is granted to anyone to use this software for any
purpose, including commercial applications, and to alter it and
redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product documentation
would be appreciated but is not required.

2. Altered source versions must be plainly marked as such, and
must not be misrepresented as being the original software.

3. This notice may not be removed or altered from any source
distribution.

<h2> References  </h2>

The World Wide Web Consortium is the definitive standard body for
XML, and their web pages contain huge amounts of information.

The definitive spec: <a href="http://www.w3.org/TR/2004/REC-xml-20040204/">
http://www.w3.org/TR/2004/REC-xml-20040204/</a>

I also recommend "XML Pocket Reference" by Robert Eckstein and published by
O'Reilly...the book that got the whole thing started.

<h2> Contributors, Contacts, and a Brief History </h2>

Thanks very much to everyone who sends suggestions, bugs, ideas, and
encouragement. It all helps, and makes this project fun. A special thanks
to the contributors on the web pages that keep it lively.

So many people have sent in bugs and ideas that, rather than list them here,
we try to give credit where due in the "changes.txt" file.

TinyXml was originally written by Lee Thomason. (Lee is often the "I" still
in the documentation.) Lee reviews changes and releases new versions,
with the help of Yves Berquin and the TinyXml community.

We appreciate your suggestions, and would love to know if you
use TinyXml. Hopefully you will enjoy it and find it useful.
Please post questions, comments, file bugs, or contact us at:

www.sourceforge.net/projects/tinyxml

Lee Thomason,
Yves Berquin
*/