lars.st0ne.at
Linux And Related Stuff
  • prevent decoding of html entities in textareas


    #html #php #python #regex

    preface: As far as I know, it is not possible to prevent the decoding of HTML entities in textareas. This article describes my workaround for that issue.

    what for: In my case, I use a <textarea> as a basic HTML editor. Raw HTML snippets are stored in the database and loaded into the <textarea>.

    But without modification, the raw html ...

    <span>&quot;usage: %s &lt;argument1&gt; &lt;argument2&gt;&quot;</span>
    

    ... is displayed as ...

    <span>"usage: %s <argument1> <argument2>"</span>
    

    ... in the <textarea>, because the browser decodes the entities when it parses the textarea content. If you store that back to the database, all HTML entities are lost.
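
    The loss can be reproduced outside the browser. The following is only a minimal sketch: it assumes that Python's html.unescape() is a reasonable stand-in for the decoding the browser performs on textarea content, and the variable name shown_in_textarea is made up for this illustration.

```python
import html

# Raw snippet as stored in the database (same string as in the example above).
raw_html = '<span>&quot;usage: %s &lt;argument1&gt; &lt;argument2&gt;&quot;</span>'

# Assumption: html.unescape() stands in for the browser decoding the
# <textarea> content when the page is parsed.
shown_in_textarea = html.unescape(raw_html)
print(shown_in_textarea)
# <span>"usage: %s <argument1> <argument2>"</span>

# Saving the textarea value unchanged would no longer match the stored snippet.
print(shown_in_textarea == raw_html)
# False
```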

    workaround: I use a single regular expression to "prevent" the decoding.

    Python example code:

    #!/usr/bin/env python3
    import re
    raw_html = '<span>&quot;usage: %s &lt;argument1&gt; &lt;argument2&gt;&quot;</span>'
    # escape the leading "&" of every entity; raw strings keep \w and \g<1> intact
    to_textarea = re.sub(r'&([#\w]+;)', r'&amp;\g<1>', raw_html)
    print(to_textarea)
    

    PHP example code:

    <?php
    $raw_html = '<span>&quot;usage: %s &lt;argument1&gt; &lt;argument2&gt;&quot;</span>';
    // escape the leading "&" of every entity; $1 is the rest of the entity
    $to_textarea = preg_replace('/&([#\w]+;)/', '&amp;$1', $raw_html);
    echo $to_textarea;
    ?>
    

    The scripts above replace the "&" sign of every HTML entity with "&amp;" and return the following string:

    <span>&amp;quot;usage: %s &amp;lt;argument1&amp;gt; &amp;lt;argument2&amp;gt;&amp;quot;</span>
    

    If this string is loaded into the <textarea> instead of the raw one, the browser decodes each "&amp;" back to "&", so the textarea shows the original entities and you can store the content back to the database without destroying it.
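
    The round trip can be checked with a short script. This is a sketch under stated assumptions, not a full browser test: html.unescape() again stands in for the decoding done by the browser, and the names ENTITY_RE and protect_entities are hypothetical helpers that only wrap the regular expression from the examples above. The second check uses numeric references, which the pattern also covers because "#" is part of the character class.

```python
import html
import re

# Hypothetical helper names; the pattern is the one from the examples above.
ENTITY_RE = re.compile(r'&([#\w]+;)')


def protect_entities(raw_html):
    """Turn the leading "&" of every entity-like token into "&amp;"."""
    return ENTITY_RE.sub(r'&amp;\g<1>', raw_html)


raw_html = '<span>&quot;usage: %s &lt;argument1&gt; &lt;argument2&gt;&quot;</span>'
to_textarea = protect_entities(raw_html)

# Assumption: html.unescape() stands in for the browser decoding the
# <textarea> content. It decodes in a single pass, so "&amp;quot;" becomes
# "&quot;" and is not decoded a second time.
from_textarea = html.unescape(to_textarea)
print(from_textarea == raw_html)
# True

# Numeric references (decimal and hexadecimal) survive the round trip as well.
numeric = 'it&#39;s &#x41;'
print(protect_entities(numeric))
# it&amp;#39;s &amp;#x41;
print(html.unescape(protect_entities(numeric)) == numeric)
# True
```

    Note that the pattern only touches tokens of the form "&...;". A bare "&" followed by a space is left as it is.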


    Aug 01 2012 20:25
    by st0ne
©2012 Robert Steininger aka st0ne
CC-BY-SA