I have a <textarea> for user input and, as they are invited to do, users liberally add line breaks in the browser. I save this data directly to the database.
When displaying this data back on a web page, I need to convert the line breaks to <br> tags in a reliable way, one that takes into account \n, \r\n, and any other common line-break sequences used by client systems.
What is the best way to do this in Perl without doing regex substitutions every time? Naturally, I am hoping for yet another awesome CPAN module recommendation...
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Socket qw(:crlf);

    my $text = "a${CR}b${CRLF}c${LF}";
    $text =~ s/$LF|$CR$LF?/<br>/g;
    print $text;

Following @daxim's comment, here is the revised version:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use charnames ':full';

    my $text = "a\N{CR}b\N{CR}\N{LF}c\N{LF}";
    $text =~ s/\N{LF}|\N{CR}\N{LF}?/<br>/g;
    print $text;
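To avoid repeating the substitution wherever the text is displayed, it could be wrapped in a small subroutine. The following is only a sketch: the name nl2br is hypothetical (it is not a CPAN function), and it uses octal escapes instead of \N{...} so that it needs no extra pragma.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): replaces CRLF,
# a lone CR, or a lone LF with <br>. Octal escapes are used so the result
# does not depend on what "\n" means on the current platform.
sub nl2br {
    my ($text) = @_;
    return '' unless defined $text;
    $text =~ s/\015\012?|\012/<br>/g;
    return $text;
}

print nl2br("a\015b\015\012c\012"), "\n";   # prints a<br>b<br>c<br>
```

Calling nl2br($row_text) at display time keeps the regex in one place; whether that is preferable to a module is a judgment call.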
Following @Marcus's comment, here is a contrived example:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use charnames ':full';

    my $t = (my $s = "a\012\015\012b\012\012\015\015c");
    $s =~ s/\r?\n/<br>/g;
    $t =~ s/\N{LF}|\N{CR}\N{LF}?/<br>/g;
    print "This is \$s: $s\nThis is \$t:$t\n";
The input is a mishmash of carriage returns and line feeds (which, at some point in the past, I did encounter).
Here is the output of the script on Windows using ActiveState Perl:
    C:\Temp> t | xxd
    0000000: 5468 6973 2069 7320 2473 3a20 613c 6272  This is $s: a<br
    0000010: 3e3c 6272 3e62 3c62 723e 3c62 723e 0d0d  ><br>b<br><br>..
    0000020: 630d 0a54 6869 7320 6973 2024 743a 613c  c..This is $t:a<
    0000030: 6272 3e3c 6272 3e62 3c62 723e 3c62 723e  br><br>b<br><br>
    0000040: 3c62 723e 3c62 723e 630d 0a              <br><br>c..

Or, as text (the two unconverted carriage returns in $s send the cursor back to the start of the line, so the trailing c overprints the first character):

    chis is $s: a<br><br>b<br><br>
    This is $t:a<br><br>b<br><br><br><br>c

Admittedly, you are not likely to end up with this input. However, if you want to cater for any unexpected oddities that might indicate a line ending, you might want to use

    $s =~ s/\N{LF}|\N{CR}\N{LF}?/<br>/g;

Also, for reference, line endings can be canonicalized like this:

    # Define the CRLF sequence.  I can't use a simple "\r\n" because the meaning
    # of "\n" is different on different OS's (sometimes it generates CRLF,
    # sometimes LF and sometimes CR).  The most popular VMS web server
    # doesn't accept CRLF -- instead it wants a LR.  EBCDIC machines don't
    # use ASCII, so \015\012 means something different.  I find this all
    # really annoying.
    $EBCDIC = "\t" ne "\011";
    if ($OS eq 'VMS') {
        $CRLF = "\n";
    } elsif ($EBCDIC) {
        $CRLF = "\r\n";
    } else {
        $CRLF = "\015\012";
    }
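As a complementary sketch of the canonicalization idea, the text could be normalized to plain LF once, before it is saved, so that the display code only has to handle a single line-ending form. This is an assumption about where the cleanup happens, not something the answer prescribes, and normalize_newlines is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collapse CRLF and lone CR into a single LF.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\015\012?/\012/g;
    return $text;
}

# Same contrived input as in the example above.
my $stored = normalize_newlines("a\012\015\012b\012\012\015\015c");

# At display time only LF remains to be converted.
(my $html = $stored) =~ s/\012/<br>/g;
print "$html\n";   # prints a<br><br>b<br><br><br><br>c
```

The result matches the $t output shown earlier, since every CR, LF, or CRLF ends up as exactly one <br>.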