Using Markdown as a Python Library
==================================

First and foremost, Python-Markdown is intended to be a Python library module
used by various projects to convert Markdown syntax into HTML.

The Basics
----------

To use Markdown as a module:

    import markdown
    html = markdown.markdown(your_text_string)

Encoded Text
------------

Note that ``markdown()`` expects **Unicode** as input (although a simple ASCII
string should work) and returns its output as Unicode. Do not pass encoded
strings to it! If your input is encoded, e.g. as UTF-8, it is your
responsibility to decode it first. For example:

    import codecs
    import markdown
    # Decode the UTF-8 file into a Unicode string before converting.
    input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8")
    text = input_file.read()
    html = markdown.markdown(text)

If you later want to write it to disk, you should encode it yourself:

    output_file = codecs.open("some_file.html", "w", encoding="utf-8")
    output_file.write(html)
    output_file.close()

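The decode, convert, encode round trip described above can be sketched with
the standard library alone. In this sketch, ``convert`` is a hypothetical
stand-in for ``markdown.markdown()`` so that the example runs without the
library installed, ``convert_file`` is an illustrative helper rather than part
of the library, and the file names are placeholders. ``io.open`` is used in
place of ``codecs.open``; both decode on read and encode on write.

```python
import io
import os
import tempfile


def convert(text):
    # Hypothetical stand-in for markdown.markdown(); it only wraps the
    # text in a paragraph so the sketch has something to write out.
    return u"<p>%s</p>" % text


def convert_file(src_path, dst_path, encoding="utf-8"):
    # Decode on the way in: the converter only ever sees Unicode.
    with io.open(src_path, mode="r", encoding=encoding) as src:
        text = src.read()
    html = convert(text)
    # Encode on the way out: the file object encodes as it writes.
    with io.open(dst_path, mode="w", encoding=encoding) as dst:
        dst.write(html)
    return html


# Usage: write a UTF-8 source file containing a non-ASCII character,
# convert it, and read the raw bytes back from disk.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "some_file.txt")
dst_path = os.path.join(workdir, "some_file.html")
with open(src_path, "wb") as f:
    f.write(u"caf\u00e9".encode("utf-8"))
result = convert_file(src_path, dst_path)
with open(dst_path, "rb") as f:
    raw_bytes = f.read()
```

The converter never touches bytes: all encoding concerns stay at the file
boundaries.
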
More Options
------------

If you want to pass more options, you can create an instance of the ``Markdown``
class yourself and then use ``convert()`` to generate HTML:

    import markdown
    md = markdown.Markdown(
            extensions=['footnotes'],
            extension_configs={'footnotes': [('PLACE_MARKER', '~~~~~~~~')]},
            output_format='html4',
            safe_mode="replace",
            html_replacement_text="--NO HTML ALLOWED--",
            tab_length=8,
            enable_attributes=False,
            smart_emphasis=False,
    )
    html = md.convert(some_text)

You should also use this method if you want to process multiple strings:

    md = markdown.Markdown()
    html1 = md.convert(text1)
    html2 = md.convert(text2)

Any options accepted by the ``Markdown`` class are also accepted by the
``markdown`` shortcut function. However, a new instance of the class will be
created each time the shortcut function is called.

Working with Files
------------------

While the ``Markdown`` class is only intended to work with Unicode text, some
encoding and decoding is required for the command line features. The method
and function described here are only intended to cover the common use case.

The ``Markdown`` class has the method ``convertFile``, which reads in a file
and writes the output to a file or file-like object:

    md = markdown.Markdown()
    md.convertFile(input="in.txt", output="out.html", encoding="utf-8")

The markdown module also includes a shortcut function, ``markdownFromFile``,
that wraps the above method:

    markdown.markdownFromFile(input="in.txt",
                              output="out.html",
                              extensions=[],
                              encoding="utf-8",
                              safe_mode=False)

In either case, if the ``output`` keyword is passed a file name (e.g.
``output="out.html"``), it will try to write to a file by that name. If
``output`` is passed a file-like object (e.g. ``output=StringIO.StringIO()``),
it will attempt to write out to that object. Finally, if ``output`` is
set to ``None``, it will write to ``stdout``.

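The three cases for ``output`` can be sketched as a small dispatch. The
``write_output`` helper below is hypothetical and is not the library's
implementation; it writes Unicode text to file-like objects and makes no
claim about how the library itself encodes in that case.

```python
import io
import os
import sys
import tempfile


def write_output(html, output=None, encoding="utf-8"):
    """Hypothetical helper mirroring the three cases for ``output``."""
    if output is None:
        # No target given: fall back to standard output.
        sys.stdout.write(html)
    elif hasattr(output, "write"):
        # File-like object: hand the text to its write() method.
        output.write(html)
    else:
        # Anything else is treated as a file name; encode while writing.
        with io.open(output, "w", encoding=encoding) as f:
            f.write(html)


# Usage: write once to an in-memory buffer and once to a named file.
buffer = io.StringIO()
write_output(u"<p>Hello</p>", output=buffer)

path = os.path.join(tempfile.mkdtemp(), "out.html")
write_output(u"<p>Hello</p>", output=path)
with io.open(path, "r", encoding="utf-8") as f:
    from_disk = f.read()
```
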
Using Extensions
----------------

One of the parameters that you can pass is a list of extensions. Extensions
must be available as Python modules, either within the ``markdown.extensions``
package or on your PYTHONPATH with names starting with ``mdx_``, followed by
the name of the extension. Thus, ``extensions=['footnotes']`` will first look
for the module ``markdown.extensions.footnotes``, then for a module named
``mdx_footnotes``. See the documentation specific to the extension you are
using for help in specifying configuration settings for that extension.

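The lookup order described above can be sketched as follows. Both functions
are hypothetical illustrations and are not the library's loader; the
extension name used for the failing lookup is made up for this sketch.

```python
import importlib


def candidate_module_names(ext_name):
    # Lookup order described above: the bundled package first,
    # then a top-level module with the mdx_ prefix.
    return ["markdown.extensions." + ext_name, "mdx_" + ext_name]


def find_extension_module(ext_name):
    # Hypothetical helper: return the first candidate that imports, else None.
    for module_name in candidate_module_names(ext_name):
        try:
            return importlib.import_module(module_name)
        except ImportError:
            continue
    return None


names = candidate_module_names("footnotes")
missing = find_extension_module("no_such_extension_for_this_sketch")
```
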
Note that some extensions may need their state reset between each call to
``convert``:

    html1 = md.convert(text1)
    md.reset()
    html2 = md.convert(text2)

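To see why a reset can matter, consider a toy converter that keeps
per-document state on the instance. ``ToyConverter`` is entirely hypothetical
and only mimics the ``convert``/``reset`` pattern shown above; it does not
reflect how any real extension stores its state.

```python
class ToyConverter(object):
    """Hypothetical converter that remembers link references it has seen."""

    def __init__(self):
        self.references = {}

    def convert(self, text):
        # Collect "[id]: url" definitions; they persist on the instance.
        body = []
        for line in text.splitlines():
            if line.startswith("[") and "]: " in line:
                ref_id, url = line[1:].split("]: ", 1)
                self.references[ref_id] = url
            else:
                body.append(line)
        return "\n".join(body)

    def reset(self):
        # Drop per-document state so the next convert() starts clean.
        self.references = {}


md = ToyConverter()
md.convert("See [docs].\n[docs]: http://example.com/docs")
leaked = dict(md.references)  # state from the first document is still here
md.reset()
md.convert("A second document with no references.")
after_reset = dict(md.references)
```

Without the call to ``reset()``, the reference collected from the first
document would still be present while the second one is converted.
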
Safe Mode
---------

If you are using Markdown on a web system which will transform text provided
by untrusted users, you may want to use the ``safe_mode`` option, which ensures
that the user's HTML tags are either replaced, removed or escaped. (They can
still create links using Markdown syntax.)

* To replace HTML, set ``safe_mode="replace"`` (``safe_mode=True`` still works
    for backward compatibility with older versions). The HTML will be replaced
    with the text assigned to ``html_replacement_text``, which defaults to
    ``[HTML_REMOVED]``. To replace the HTML with something else:

        md = markdown.Markdown(safe_mode="replace",
                               html_replacement_text="--RAW HTML NOT ALLOWED--")

* To remove HTML, set ``safe_mode="remove"``. Any raw HTML will be completely
    stripped from the text with no warning to the author.

* To escape HTML, set ``safe_mode="escape"``. The HTML will be escaped and
    included in the document.

Note that ``safe_mode`` does not alter the ``enable_attributes`` option, which
could allow someone to inject JavaScript (e.g. ``{@onclick=alert(1)}``). You
may also want to set ``enable_attributes=False`` when using ``safe_mode``.

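The effect of the three modes on a single chunk of raw HTML can be sketched
with the standard library. ``apply_safe_mode`` is a hypothetical helper, not
part of the library, and treating the input as one chunk is a simplification
made for illustration.

```python
import html


def apply_safe_mode(raw_html, safe_mode, replacement="[HTML_REMOVED]"):
    """Hypothetical helper showing what each mode does to one raw HTML chunk."""
    if safe_mode == "replace":
        # Swap the raw HTML for the replacement text.
        return replacement
    if safe_mode == "remove":
        # Strip the raw HTML entirely.
        return ""
    if safe_mode == "escape":
        # Turn markup characters into entities so the HTML shows as text.
        return html.escape(raw_html, quote=False)
    raise ValueError("unknown safe_mode: %r" % (safe_mode,))


chunk = "<script>alert(1)</script>"
replaced = apply_safe_mode(chunk, "replace")
removed = apply_safe_mode(chunk, "remove")
escaped = apply_safe_mode(chunk, "escape")
```
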
Output Formats
--------------

If Markdown is outputting (X)HTML as part of a web page, you will most likely
want the output to match the (X)HTML version used by the rest of your page or
site. Currently, Markdown offers two output formats out of the box: "HTML4" and
"XHTML1" (the default). Markdown will also accept the formats "HTML" and
"XHTML", which currently map to "HTML4" and "XHTML1" respectively. However,
you should use the more explicit keys, as the general keys may change in the
future if it makes sense at that time. The keys can be either lowercase or
uppercase.

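The key handling described above amounts to a case-insensitive lookup with two
general aliases. The table and function below are a hypothetical sketch built
from that description; they are not the library's internal mapping.

```python
# Hypothetical lookup table built from the description above; the values
# are just the explicit format names, not the library's serializers.
OUTPUT_FORMAT_ALIASES = {
    "html": "html4",
    "html4": "html4",
    "xhtml": "xhtml1",
    "xhtml1": "xhtml1",
}


def resolve_output_format(key):
    # Keys may be given in lowercase or uppercase.
    try:
        return OUTPUT_FORMAT_ALIASES[key.lower()]
    except KeyError:
        raise ValueError("unknown output format: %r" % (key,))


resolved = [resolve_output_format(k) for k in ("HTML", "xhtml", "XHTML1", "html4")]
```
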
To set the output format:

    html = markdown.markdown(text, output_format='html4')

Or, when using the ``Markdown`` class:

    md = markdown.Markdown(output_format='html4')
    html = md.convert(text)

Note that the output format is only set once for the class and cannot be
specified each time ``convert()`` is called. If you really must change the
output format for the class, you can use the ``set_output_format`` method:

    md.set_output_format('xhtml1')