ThomThom's Website

IE ate my linebreaks!

14 Aug 07 - 09:36

I was working on a syntax highlighter for the code elements on my website. The script is quite simple: it finds all <code> elements that are contained inside a <pre> element and uses the class attribute of the <code> element to determine which language the code is written in. I used innerHTML to extract the content of the <code> element. (Yes, I know, innerHTML shouldn't be used with XHTML as it will break if it's served as XML, but that's a whole other topic, and in my defence I've been contemplating switching back to HTML.)
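
The lookup described above could be sketched roughly like this. It is not the actual script: the name findCodeBlocks and the shape of the returned objects are made up for illustration, and it assumes the class is read through the className property.

```javascript
// Hypothetical helper: find every <code> inside a <pre> under `root`
// and pair it with the language named in its class attribute.
function findCodeBlocks(root) {
  var found = [];
  var pres = root.getElementsByTagName('pre');
  for (var i = 0; i < pres.length; i++) {
    var codes = pres[i].getElementsByTagName('code');
    for (var j = 0; j < codes.length; j++) {
      found.push({
        element: codes[j],
        language: codes[j].className // e.g. "javascript" (assumed convention)
      });
    }
  }
  return found;
}
```

In a page this would be called as findCodeBlocks(document) once the document has loaded.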

Everything was fine when I tested it in Firefox, but... (you have heard this so many times...) ...in IE, some of my linebreaks were missing. When I inspected the string I got from innerHTML, I noticed that some, not all, of the linebreaks were gone. I even ran a loop over the string and printed all the character codes, but there was nothing where the linebreaks should have been. IE had simply eaten the linebreaks and partially normalized the text, even though it was inside a <pre> element.

The hero of the day was the almighty DOM, which even IE didn't dare steal from. When I accessed the text node inside the <code> element I got all the linebreaks. So instead of .innerHTML I used .childNodes[0].nodeValue. It should be noted that this isn't an ideal substitute, as it will not return all the text if the content is split up into several text nodes, for example by child elements or other DOM nodes. But it's not hard to write a little function that extracts all the text nodes and compiles them into one string. The important bit is that IE doesn't normalize the string returned this way and returns all the whitespace as it should.
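
The "little function" mentioned above might look something like the following. This is only a sketch: the name collectText is made up, and it assumes that every text and CDATA node below the element should be included, in document order.

```javascript
// Hypothetical helper: concatenate the values of all text nodes under `node`.
function collectText(node) {
  var TEXT_NODE = 3;          // numeric value of Node.TEXT_NODE
  var CDATA_SECTION_NODE = 4; // numeric value of Node.CDATA_SECTION_NODE
  if (node.nodeType === TEXT_NODE || node.nodeType === CDATA_SECTION_NODE) {
    return node.nodeValue;
  }
  // For elements, walk the children recursively; other node types
  // (comments, for instance) have no children and contribute nothing.
  var text = '';
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    text += collectText(children[i]);
  }
  return text;
}
```

With that in place, collectText(codeElement) can stand in for codeElement.childNodes[0].nodeValue, where codeElement is whatever <code> element is being processed.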

.split() misbehaviour

Another part of the code that gave me trouble was splitting the text into an array of lines. I then loop through that array and compile a numbered list, so that each line of the codeblock gets a line number next to it.

I had used .split(/\n\r|\r\n|\r|\n/g); to return an array with each line of code, which again worked fine in Firefox and Opera, but not in IE. If there was an empty line, IE would simply eat it. So for a codeblock of five lines, of which one was empty, IE would return an array of four.

The solution was to normalize all linebreaks into \n and then split the text. For one reason or another, IE then returned all the lines, even the empty ones.


// === THIS DOES NOT WORK AS EXPECTED IN IE === //
// Split each line into an array;
this.lines = this.codeText.split(/\n\r|\r\n|\r|\n/g);

// === THIS WORKS IN IE === //
// Convert all linebreaks to \n
this.codeText = this.codeText.replace(/\n\r|\r\n|\r/g, '\n');
// Split each line into an array;
this.lines = this.codeText.split('\n');
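
As a rough illustration of how the normalize-then-split step can feed the line numbering, here is a small standalone sketch. The function names splitLines and numberLines and the "1: text" output format are invented for this example and are not taken from the actual script.

```javascript
// Hypothetical helper: normalize every linebreak style to \n, then split
// on a plain string separator so that empty lines are kept.
function splitLines(text) {
  return text.replace(/\r\n|\n\r|\r/g, '\n').split('\n');
}

// Hypothetical helper: prefix each line with its 1-based line number.
function numberLines(text) {
  var lines = splitLines(text);
  var numbered = [];
  for (var i = 0; i < lines.length; i++) {
    numbered.push((i + 1) + ': ' + lines[i]);
  }
  return numbered;
}
```

For example, numberLines('x\r\n\r\ny') gives three entries, with the empty second line still present and numbered.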

So finally I had a syntax highlighter that works in Firefox 2, IE7 and Opera 9. I've yet to test it in other browsers as I'm still working on it. The script is located here. It's still a bit rough and needs some work and commenting.

Hopefully this will be useful to someone. I didn't find much information myself when I had a look on Google.
