html - How do I convert various user-inputted line break characters to <br> using Perl?



I have a <textarea> for user input and, as they are invited to do, users liberally add line breaks in the browser. I save this data directly to the database.

When displaying this data back on a web page, I need to convert the line breaks to <br> tags in a reliable way that takes into account \n, \r\n, and any other common line break sequences employed by client systems.

What is the best way to do this in Perl without doing regex substitutions every time? Naturally, I am hoping for yet another awesome CPAN module recommendation...

    #!/usr/bin/perl
    use strict; use warnings;
    use Socket qw( :crlf );

    my $text = "a${CR}b${CRLF}c${LF}";
    $text =~ s/$LF|$CR$LF?/<br>/g;
    print $text;

Following up on @daxim's comment, here is the modified version:

    #!/usr/bin/perl
    use strict; use warnings;
    use charnames ':full';

    my $text = "a\N{CR}b\N{CR}\N{LF}c\N{LF}";
    $text =~ s/\N{LF}|\N{CR}\N{LF}?/<br>/g;
    print $text;

Following up on @Marcus's comment, here is a contrived example:

    #!/usr/bin/perl
    use strict; use warnings;
    use charnames ':full';

    my $t = (my $s = "a\012\015\012b\012\012\015\015c");
    $s =~ s/\r?\n/<br>/g;
    $t =~ s/\N{LF}|\N{CR}\N{LF}?/<br>/g;

    print "This is \$s: $s\nThis is \$t:$t\n";

The input is a mishmash of carriage returns and line feeds (which, at some point in the past, I did encounter).

Here is the output of the script on Windows using ActiveState Perl:

    C:\Temp> t | xxd
    0000000: 5468 6973 2069 7320 2473 3a20 613c 6272  This is $s: a<br
    0000010: 3e3c 6272 3e62 3c62 723e 3c62 723e 0d0d  ><br>b<br><br>..
    0000020: 630d 0a54 6869 7320 6973 2024 743a 613c  c..This is $t:a<
    0000030: 6272 3e3c 6272 3e62 3c62 723e 3c62 723e  br><br>b<br><br>
    0000040: 3c62 723e 3c62 723e 630d 0a              <br><br>c..

Or, as text (the two leftover carriage returns in $s send the cursor back to the start of the line, so the trailing c overwrites the leading T):

    chis is $s: a<br><br>b<br><br>
    This is $t:a<br><br>b<br><br><br><br>c

Admittedly, you are not likely to end up with this input. However, if you want to cater for any unexpected oddities that might indicate a line ending, you might want to use

    $s =~ s/\N{LF}|\N{CR}\N{LF}?/<br>/g;

Also, for reference, CGI.pm canonicalizes line endings this way:

    # Define the CRLF sequence.  I can't use a simple "\r\n" because the meaning
    # of "\n" is different on different OS's (sometimes it generates CRLF, sometimes LF
    # and sometimes CR).  The most popular VMS web server
    # doesn't accept CRLF -- instead it wants a LR.  EBCDIC machines don't
    # use ASCII, so \015\012 means something different.  I find this all
    # really annoying.
    $EBCDIC = "\t" ne "\011";
    if ($OS eq 'VMS') {
      $CRLF = "\n";
    } elsif ($EBCDIC) {
      $CRLF = "\r\n";
    } else {
      $CRLF = "\015\012";
    }
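
Since the question asks how to avoid retyping the substitution, here is a minimal sketch of wrapping it in a small sub. The name `nl2br` and its interface are hypothetical, not taken from any CPAN module. It spells the characters with octal escapes, so the meaning of "\n" on the current platform does not matter, and it assumes ASCII-based input (not EBCDIC).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name and interface are illustrative only).
# Replaces CRLF, lone CR, and lone LF with a single <br> each.
sub nl2br {
    my ($text) = @_;
    return '' unless defined $text;

    # Try CRLF first so a CR+LF pair becomes one <br>, not two.
    # \015 is CR and \012 is LF in ASCII.
    (my $html = $text) =~ s/\015\012|\015|\012/<br>/g;
    return $html;
}

print nl2br("a\015b\015\012c\012"), "\n";
```

Run on the contrived input from above, it should behave like the `$t` substitution and turn the stray carriage returns into `<br>` as well. On Perl 5.10 or later, `\R` matches a generic line break (treating CRLF as one unit, and also matching vertical whitespace such as form feed), so it could stand in for the alternation if that broader match is acceptable.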
