ruby - Detect encoding


I get some string data from the web, and I suspect that it is not always what it claims to be. I do not know where the problem is, and I just do not care any more. Since day one of this project I have been fighting Ruby string encoding. I really want a way to say "Here is a string, what is it?" and then use that information to convert it to UTF-8, so that it does not explode in a gsub() 2,000 lines down in the depths of my app. I have tried rchardet, but even though it supposedly works on 1.9, it does not cope with input that contains multi-byte characters... which is not useful.

It is impossible to tell from a string alone what encoding it is in. You always need some additional metadata that tells you what the string's encoding is.
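To illustrate why the bytes alone are not enough, here is a minimal sketch. The helper name `decode_as` is made up for this example. It labels the same two bytes first as UTF-8 and then as ISO-8859-1; both labels are valid, and nothing in the bytes says which one the sender meant.

```ruby
# Hypothetical helper: label raw bytes with an encoding, then convert to UTF-8.
def decode_as(bytes, encoding)
  bytes.dup.force_encoding(encoding).encode(Encoding::UTF_8)
end

# Two raw bytes with no metadata attached.
raw = "\xC3\xA9".b

puts decode_as(raw, Encoding::UTF_8)       # prints "é"
puts decode_as(raw, Encoding::ISO_8859_1)  # prints "Ã©"
```

Neither call raises, so a detector can at best guess which reading is more plausible.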

If you get a string from the web, that metadata is in the HTTP headers. If the HTTP headers are incorrect, then there is absolutely nothing that you or Ruby or anyone else can do. You need to file a bug with the webmaster of the site the string comes from and wait until they fix it. If you have a service level agreement with the website, file a bug, wait a week, then sue them.
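As a rough sketch of using that header metadata, the hypothetical helper below (`to_utf8` is a made-up name, and returning `nil` on failure is an assumed policy) reads the `charset` parameter from a `Content-Type` value, labels the body with it, and converts to UTF-8 only if the bytes are actually valid under that label. It assumes you already have the raw body and the header value as strings.

```ruby
# Hypothetical helper: the name and the nil-on-failure policy are assumptions.
def to_utf8(body, content_type)
  # Extract the charset from a value such as "text/html; charset=ISO-8859-1".
  charset = content_type.to_s[/charset\s*=\s*"?([^\s;"]+)/i, 1]
  return nil if charset.nil?

  encoding = begin
    Encoding.find(charset)
  rescue ArgumentError
    return nil # Ruby does not know an encoding by that name
  end

  # Label the bytes without changing them, then check the label fits.
  tagged = body.dup.force_encoding(encoding)
  return nil unless tagged.valid_encoding?

  tagged.encode(Encoding::UTF_8)
rescue EncodingError
  nil # the declared encoding could not be converted to UTF-8
end

body = "caf\xE9".b
p to_utf8(body, "text/html; charset=ISO-8859-1") # converted to UTF-8
p to_utf8(body, "text/html; charset=utf-8")      # nil: header does not match the bytes
```

A `nil` result here means the header was missing, unknown, or contradicted by the bytes, which is the case where only the site's owner can fix the problem.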

