/*
htmLawed_TESTCASE.txt, 11 February 2017
To test htmLawed
Copyright Santosh Patnaik
Dual licensed with LGPL 3 and GPL 2+
A PHP Labware internal utility - www.bioinformatics.org/phplabware/internal_utilities/htmLawed
*/
This file contains UTF-8-encoded text with both correct and incorrect/malformed HTML/XHTML code snippets (test cases/samples) for testing htmLawed. The entire text may also be used as a single test unit.
************************************************
When viewing this file in a web browser, set the
character encoding to Unicode/UTF-8.
************************************************
--------------------- start --------------------
Try different $config and $spec values. Some text, even when it passes the filter, will not be displayed in a rendered web page.
Attributes
Xml:lang:, , Standard, predefined value, or empty attribute: , , Required: , Quote & space variation:a, a, a Invalid:a Duplicated:a Deprecated:a, Casing: Custom: Data-*:a Admin-restricted?:
Attribute values
Duplicate ID value:, ,
(try 'my_' for prefix) Double-quotes in value:, ,
(try filter for CSS expression) CSS expression: Other: ,
(try 'maxlen', 'maxval', etc., for 'input' in '$spec')
Malformed: < a href=""></a>, , , , < /a>, < a href="">, , , <imgsrc="s" alt="a" /> Invalid: <image src="s" alt="a" /> Empty: , </img>, text</img> Content invalid:12</a> Content invalid?: (try setting 'form' as parent) Casing: Check for tidy: </div></div></div>
hi
Entities
Special: & 3 < 2 & 5>4 and j >i >a & i<j>a Padding: B B f f   Malformed: & #x27;, &x27;, ' &TILDE;, &tilde Invalid: , �, , �, , &bad; Discouraged characters: , „, , Context: '>', <? Casing: ', ', &TILDE;, ˜
(also check named-to-numeric and hexdec-to-decimal, and vice versa, conversions)
Format
Valid but ill-formatted: text <!-- comment -->
text <!--
A c o m m e n t -->
<script>
<![CDATA[
code
]]>
</script><!-- comment --><![CDATA[ cdata ]]> text</b> text
Inscrieţi-vă acum la a Zecea Conferinţă Internaţională
გთხოვთ ახლავე გაიაროთ რეგისტრაცია
večjezično računalništvo อ.อ่าง Зарегистрируйтесь сейчас
на Десятую Международную Конференцию по
(this file should be UTF-8 encoded; some characters may not be displayed because of missing fonts, etc.)
Non-English text-2: entities
用统一码
გთხოვთ
Inscreva-se agora para a Décima Conferência Internacional Sobre O Unicode, realizada entre os dias 10 e 12 de março de 1997 em Mainz
na Alemanha.
Ruby
(need compatible browser) 斎藤信男 WWW
A
Tables
Omitted closing tags:
h1c1  h1c2
r1c1  r1c2
r2c1  r2c2
Nested, omitted closing tags:
h1c1  h1c2
r1c1  r1c2
    h1c1  h1c2
    r1c1  r1c2
    r2c1  r2c2
r2c1  r2c2
Tag transformation
Font element intended as 'inline' element:
hi
Font element intended as 'block' element:
<div>hi
</span></div>
Font element intended as 'block' element:
<div>hi
QQQ
</span></div>
Tidy
White-space handling: abc def ghi abc def ghi
URLs
Relative and absolute:, , , , , ,
(try base URL value of 'http://a.com/b/') CSS URLs: , , , , Double URLs:b Anti-spam: (try regex for 'http://a.com', etc.) , , , , , , , Soft-hyphen:ídisc
XSS
'';!--"<xss>=&{()}
test
<div style="javascript:alert('xss');"></div>
<div style="background-image:url(denied:javascript:alert('xss'));"></div>
<div style="background-image:url("denied:javascript:alert('xss')" );"></div>
<!--[if gte IE 4]><script>alert('xss');</script><![endif]-->
<script a=">" src="http://ha.ckers.org/xss.js"></script>
<div style="background-image: url('denied:js:xss')"></div> test Bad IE7:x Opera:linkBad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:xxx Bad IE7:x Bad IE7:x Bad IE7:x Bad IE7:x Bad IE7: exp/*x Bad IE7:hi Bad IE7:hi Bad IE7:test Bad IE7:hi Bad IE7:hi
<h6>Other</h6>
3 < 4
3 > 4
> 3
<._.> hi!
<<< ALERT >>>
<![if !vml]> some stuff <![endif]>
<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />
<uml:ns ns = "urn:www">
<uml:ns ns = 'urn:www'>
if(13<age AND 21>age){say 'teen'}
age >51 and a smoking history of >51 pack-years was
age > 51 and a smoking history of >51 pack-years was
age <51 and a smoking history of <51 pack-years <b>was</b>
age < 51 and a smoking history of < 51 pack-years was age >51 and a smoking history of >51 pack-years age > 51 and a smoking history of >51 pack-years age <51 and a smoking history of <51 pack-years</b> age < 51 and a smoking history of < 51 pack-years