NekoHTML html compressing removes spaces between tags

tixe
I've got something like:

<span th:text="#{home.counter-now}"></span>&nbsp;<span class="num" th:text="${players_counter}"></span><span th:text="#{home.counter-players}"></span> <span class="num" th:text="${bedollars_counter}"></span> <span th:text="#{home.counter-bedollars}"></span>

It looks like NekoHTML compresses everything and removes the spaces between the tags.

Re: NekoHTML html compressing removes spaces between tags

tixe
Has anybody solved this?

Re: NekoHTML html compressing removes spaces between tags

Emanuel
Administrator
I don't know if anybody can confirm that it's just the way nekoHTML behaves, but I get the same results (whitespace between elements removed) when I switch my project from HTML5 to LEGACYHTML5.  I tried looking for nekoHTML configuration options, but found nothing about dealing with whitespace between elements (only 'normalizing' whitespace in attributes).

I also looked at the output of my HTML documents after going through the Neko parser, and sure enough whitespace was gone from there too.

Are you able to switch to using HTML5 or XHTML template modes?

Re: NekoHTML html compressing removes spaces between tags

tixe
We'd like to stick to HTML5 (what you call LEGACYHTML5).

Also, I notice that the output HTML gets compressed, which gives us many problems (readability first).

Another example: we have the Google Analytics script at the bottom of our skeleton's body (the skeleton is a fragment all pages extend), and for some reason it does not show up in our final page.

Re: NekoHTML html compressing removes spaces between tags

Zemi
Administrator
Hi Tixe,

LEGACYHTML5 and HTML5 template modes are not the same.

If you use LEGACYHTML5 your HTML code is tidied up.
If you use HTML5 the structure of your HTML code remains untouched.
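
To illustrate the second point: the sketch below is not Thymeleaf code. It uses the JDK's own XML parser as a stand-in, on the assumption that a template mode requiring XML-valid input behaves like an ordinary XML parser here. The class name `WhitespaceDemo` and the sample markup are made up for illustration. It shows the spaces between sibling elements surviving as text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class WhitespaceDemo {

    // Parses well-formed markup with the JDK's default DOM parser
    // and returns the root element.
    static Element parseRoot(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("markup is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Element p = parseRoot(
                "<p><span>now</span> <span class=\"num\">42</span> <span>players</span></p>");
        // Three spans plus the two whitespace text nodes between them.
        System.out.println(p.getChildNodes().getLength()); // prints 5
        // The spaces between the elements are still part of the text.
        System.out.println("[" + p.getTextContent() + "]"); // prints [now 42 players]
    }
}
```

A tag balancer such as the one used by LEGACYHTML5 is free to rebuild the tree differently, which would be consistent with the missing spaces reported in this thread.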

> another example is that we have google analytics script at the bottom of our skeleton's body
> (skeleton is a fragment all pages extend) and for some reason it does not show in our final page

I think that's an unrelated problem.


Re: NekoHTML html compressing removes spaces between tags

tixe
Hi zemi,

I am aware of the difference.

We'd like HTML5.

Writing something like <img src="ciao.png"> is valid HTML5.

If I set templateMode to HTML5, I get an error saying it cannot find the matching closing tag.

It works with LEGACYHTML5, but the output gets horribly compressed, spaces between tags disappear, and weird things happen when writing stuff like:

<script>window.jQuery || document.write('<script src="static/js/vendor/jquery-1.8.3.min.js"><\/script>')</script>
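
Both snippets in this post are valid HTML5 but not well-formed XML, which is probably why an XML-based template mode rejects them. The sketch below is not Thymeleaf code: it uses the JDK's XML parser as a stand-in (an assumption about how strict the HTML5 mode is), and the class name `WellFormedCheck` and the shortened `jquery.min.js` path are made up for illustration. It suggests that self-closing the `img` tag and wrapping the inline script in a CDATA section, hidden from browsers inside JavaScript comments, would make both acceptable to an XML parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of logging them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // parse error: the markup is not well-formed
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser setup or I/O problem
        }
    }

    public static void main(String[] args) {
        String script = "window.jQuery || document.write("
                + "'<script src=\"jquery.min.js\"><\\/script>')";

        // Unclosed <img>: valid HTML5, rejected by an XML parser.
        System.out.println(isWellFormed("<div><img src=\"ciao.png\"></div>"));
        // Self-closed <img />: accepted.
        System.out.println(isWellFormed("<div><img src=\"ciao.png\" /></div>"));
        // Raw '<' characters inside the script body: rejected.
        System.out.println(isWellFormed("<script>" + script + "</script>"));
        // Same script inside a CDATA section: accepted.
        System.out.println(isWellFormed(
                "<script>/*<![CDATA[*/ " + script + " /*]]>*/</script>"));
    }
}
```

Whether the CDATA form renders as expected through Thymeleaf's HTML5 mode has not been checked in this thread, so treat it as something to try rather than a confirmed fix.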

Re: NekoHTML html compressing removes spaces between tags

Emanuel
Administrator
Given that it's the Neko parser that is removing the whitespace between elements, I think a way to get around this will be to implement a custom template parser (ITemplateParser interface) that uses a parser that can accept HTML5 in the non-XML form.

Looking at the code that goes into the existing parsers, I don't think this will be simple to do, although Thymeleaf does have a bunch of utility methods that might make the job easier.  (I just took a look at JTidy, which accepts non-XML HTML: it can convert an InputStream to an org.w3c.dom.Document, and one of the Thymeleaf utility classes can then turn that into an org.thymeleaf.dom.Document.  I don't know if JTidy also strips whitespace between elements, though.)

Re: NekoHTML html compressing removes spaces between tags

tixe
This is really a big deal for us.
We're investing a lot in our new webapps and thought Thymeleaf was a perfect fit.

I really hope this gets fixed somehow.

I wonder if you guys would be open to something other than Neko (JTidy or attoparser).

Check out our new website (still sort of beta): www.beintoo.com/v2/

Re: NekoHTML html compressing removes spaces between tags

Emanuel
Administrator
Maybe you should try implementing a custom parser like I suggested.  It requires some work, but it might be what you need to keep going.  I had another look at the existing parsers, and it's not looking as difficult as I first thought: most of the code is for setting up those DOMBuilder/Factory and SAXParser/Factory objects and for error checking.  For example, if you tried to implement a JTidy parser, here is what you'd have to do:

1. Create a new class that implements the ITemplateParser interface.

2. Implement the 2 methods of that interface.  In both methods you'd just pass the input into one of JTidy's parse*() methods.  Some work will be needed to convert the Reader object that Thymeleaf gives you, to an InputStream that JTidy's parse methods accept.

3. Convert the result from the parse method into a Thymeleaf DOM object.  This can be done using one of Thymeleaf's utility classes, org.thymeleaf.util.StandardDOMTranslator, and its translate*() methods.

4. Return the Thymeleaf DOM object.

Lastly, you'll have to create a custom template type (eg: JTIDYHTML5) and make Thymeleaf invoke your parser when it detects that template type.  I haven't found where that happens though, so some source code digging will be required to figure that out.
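
Step 2 mentions turning the Reader into an InputStream. A minimal sketch of that one conversion, using only the JDK, could look like the code below. `ReaderStreams` and `toInputStream` are hypothetical names, the whole template is buffered in memory, and the sketch assumes the parser on the receiving end is configured to read UTF-8. The Thymeleaf and JTidy calls are left out on purpose because their exact signatures are not given in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReaderStreams {

    // Drains the reader into memory and exposes the text as UTF-8 bytes.
    static InputStream toInputStream(Reader reader) {
        try {
            StringWriter buffer = new StringWriter();
            char[] chunk = new char[4096];
            int read;
            while ((read = reader.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            byte[] bytes = buffer.toString().getBytes(StandardCharsets.UTF_8);
            return new ByteArrayInputStream(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Reader template = new StringReader("<span>ciao</span> <span>mondo</span>");
        InputStream in = toInputStream(template);
        // For a ByteArrayInputStream, available() is the number of bytes left to read.
        System.out.println(in.available() + " bytes ready for the parser");
    }
}
```

Buffering keeps the sketch short; for very large templates a streaming adapter would be a better fit.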

Re: NekoHTML html compressing removes spaces between tags

danielfernandez
Administrator
Hi,

I'm afraid that's a side effect of using nekoHTML in the LEGACYHTML5 template mode.

There are no standards for parsing non-XML-valid HTML markup in Java -- only XML parsing standards exist. And when I developed the current thymeleaf parsing system --which is now one year old--, the only options I had for allowing this type of markup were: 1. Use an HTML-to-XML translator (normally in the form of a tag balancer), or 2. Use the HTML5 parser from validator.nu.

The latter one rejected any non-strictly-HTML5 input with no possibility for configuration, so there was no chance that the "th:*" attributes were accepted. So I went for the first option, tag balancing. I tried several possible choices --including JTidy and TagSoup-- and I first chose "htmlcleaner", but after some versions I decided to switch to "nekoHTML", which is one of the most used libraries for tag balancing.

Is it a perfect solution? No, far from it, as you are experiencing. But the LEGACYHTML5 mode is mainly meant for integrating fragments of XML-ill-formed code from external and uncontrolled sources, e.g. corporate header/banner servers, so this is normally not a huge issue.

I have many reasons for recommending that you write XML-valid HTML5 code (see http://www.thymeleaf.org/fromhtmltohtmlviahtml.html ), but I understand you will have your reasons for not doing so. So if nekoHTML is causing you too many problems and you still want to use thymeleaf, you could try attoparser.

Attoparser [ http://www.attoparser.org ] is a SAX-style (event-based) markup parser I've recently created that is meant to be the future default parsing system for thymeleaf, probably in thymeleaf 3.0. It is fast, it is light, and it does not mind if your code is not XML-valid, as long as you don't configure it to check for that.

The latest thymeleaf-extras-conditionalcomments module I released [ http://github.com/thymeleaf/thymeleaf-extras-conditionalcomments ] in fact contains an implementation of thymeleaf's ITemplateParser interface using attoparser: https://github.com/thymeleaf/thymeleaf-extras-conditionalcomments/blob/thymeleaf-extras-conditionalcomments-1.0.0-beta1/src/main/java/org/thymeleaf/extras/conditionalcomments/parser/ConditionalCommentAttoTemplateParser.java which you might be able to use, copy, or take as a base for creating your own parser, and so gain the ability to use XML-ill-formed code.


Regards,
Daniel.




Re: NekoHTML html compressing removes spaces between tags

tixe
Hi Daniel,

Thanks a lot for the amazingly-detailed reply.

We'll evaluate what to do now. XML-valid HTML5 is probably the right choice, and I'm very much looking forward to Thymeleaf's attoparser support.

I wish I had some time to try out other options but I can't right now.

What I can tell you is that overall we're very happy with Spring + Thymeleaf, and I guess this is going to become our company's standard for Java-based webapps.

Keep up the good work!

Thanks

-m