CURL import character encoding problem
As Jon Skeet pointed out, it's difficult to understand your situation. However, if you only have access to the final text, you can try using iconv to change the text encoding.
For example:
$text = iconv("Windows-1252","UTF-8",$text);
I had a similar issue some time ago (with Italian text and special characters) and solved it this way.
Try different combinations of source and target encoding (UTF-8, ISO-8859-1, Windows-1252).
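The same trial-and-error can be sketched outside PHP. The Python snippet below is only an illustration, not the code from the answer: the helper name try_source_encodings and the sample bytes are made up here. It treats the same bytes as each candidate source encoding in turn, the way repeated iconv calls would, and re-encodes whatever decodes cleanly as UTF-8.

```python
def try_source_encodings(raw, candidates=("utf-8", "iso-8859-1", "cp1252")):
    """Return {encoding: UTF-8 bytes or None} for each candidate source encoding.

    Hypothetical helper: mirrors calling iconv(<candidate>, "UTF-8", $text)
    once per candidate and keeping whatever converts cleanly.
    """
    results = {}
    for encoding in candidates:
        try:
            # Decode with the guessed source encoding, then re-encode as UTF-8.
            results[encoding] = raw.decode(encoding).encode("utf-8")
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding, so rule it out.
            results[encoding] = None
    return results


# Sample input (an assumption): the Italian word "perché" stored as Windows-1252.
raw = "perché".encode("cp1252")
converted = try_source_encodings(raw)
for encoding, utf8_bytes in converted.items():
    print(encoding, utf8_bytes)
```

A conversion that succeeds is not necessarily the right one: ISO-8859-1 accepts every possible byte, so the result still has to be checked by eye.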
PHP Curl return character
I found these two similar SO posts that may be helpful:
PHP Curl UTF-8 Charset
CURL import character encoding problem
Curl encoding issues with command line
This is due to how a DOS prompt handles Unicode characters; see "Unicode characters in Windows command line - how?". You should be able to change this behavior with a command like chcp 65001, which sets the terminal up for UTF-8 handling.
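To see why the code page matters, here is a small Python sketch of what happens to UTF-8 output on a console that is not set to UTF-8. It assumes the console uses code page 437 (the usual OEM default on US-English Windows); other locales use other code pages, so the exact garbage will differ.

```python
# Assumption: the console uses code page 437; other locales use different
# code pages (for example 850), which garble the bytes differently.
utf8_bytes = "é".encode("utf-8")        # the two bytes curl writes: b'\xc3\xa9'
garbled = utf8_bytes.decode("cp437")    # how a cp437 console interprets them
restored = utf8_bytes.decode("utf-8")   # how a UTF-8 console (chcp 65001) interprets them

print(repr(garbled), repr(restored))
```

The bytes are identical in both cases; only the interpretation changes, which is why switching the code page is enough.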
How to include an '&' character in a bash curl statement
Putting single quotes around the & symbol seems to work. That is, using a URL like http://www.example.com/page.asp?arg1=${i}'&'arg2=${j} with curl returns the requested webpage.
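The quoting is needed because an unquoted & is the shell's background operator, so the command line is split at that point and curl never sees arg2. As a rough illustration (not part of the original answer), Python's shlex module can show how a POSIX shell would tokenize the command; the values of i and j below are placeholders.

```python
import shlex

i, j = 1, 2  # stand-ins for the shell variables ${i} and ${j}
url = f"http://www.example.com/page.asp?arg1={i}&arg2={j}"

# shlex.quote wraps the string in single quotes because it contains
# characters the shell treats specially ('&' and '?').
quoted = shlex.quote(url)
command = f"curl {quoted}"
print(command)

# Parsing the command with POSIX quoting rules gives back one URL argument.
args = shlex.split(command)
print(args)

# Quoting only the '&', as in the answer, yields the same single argument.
answer_form = "curl http://www.example.com/page.asp?arg1=1'&'arg2=2"
print(shlex.split(answer_form))
```

Quoting the whole URL and quoting only the & are equivalent as far as curl is concerned. Note that single quotes around the whole URL would stop the shell from expanding ${i} and ${j}, which is presumably why the answer quotes only the &.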
How to encode foreign characters for importing to shopify via API using PHP and CURL
If you are using a mysqli database connection to fetch the data from Magento, you may need to set the charset of the connection to utf8 so that PHP gets the data correctly from the database:
$mysqli->set_charset("utf8")
Scraping meta data on Japanese websites with some character encoding problems
It looks like even though all pages declared UTF-8, some ISO-8859-1 content was hidden in places. Using iconv solved the issue.
I edited the question with all the details. Case closed!
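When a page declares UTF-8 but contains stray ISO-8859-1 bytes, one way to convert selectively is to decode line by line and fall back only where UTF-8 decoding fails. The Python sketch below assumes that approach; it is not the code from the original question, and decode_mixed and the sample bytes are hypothetical.

```python
def decode_mixed(raw, primary="utf-8", fallback="iso-8859-1"):
    """Decode bytes line by line, using the fallback only for lines
    that are not valid in the primary encoding (hypothetical helper)."""
    parts = []
    for line in raw.split(b"\n"):
        try:
            parts.append(line.decode(primary))
        except UnicodeDecodeError:
            # This line is not valid UTF-8; assume it is ISO-8859-1 instead.
            parts.append(line.decode(fallback))
    return "\n".join(parts)


# Sample input (an assumption): one UTF-8 line followed by one ISO-8859-1 line.
raw = "日本語".encode("utf-8") + b"\n" + "café".encode("iso-8859-1")
text = decode_mixed(raw)
print(text)
```

Splitting on the newline byte is safe here because neither encoding uses 0x0A inside a multi-byte character. A single line that mixes both encodings would still come out partly garbled, since the whole line falls back to ISO-8859-1.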
php: file_get_contents encoding problem
First off, is your browser set to UTF-8? In Firefox you can set your text encoding in View->Character Encoding. Make sure you have "Unicode (UTF-8)" selected. I would also set View->Character Encoding->Auto-Detect to "Universal."
Secondly, you could try passing the FILE_TEXT flag, like so:
$page = file_get_contents('http://translate.google.com/translate_t', FILE_TEXT, $context);
How to urlencode data for curl command?
Use curl --data-urlencode; from man curl:
This posts data, similar to the other --data options with the exception that this performs URL-encoding. To be CGI-compliant, the <data> part should begin with a name followed by a separator and a content specification.
Example usage:
curl \
    --data-urlencode "paramName=value" \
    --data-urlencode "secondParam=value" \
    http://example.com
See the man page for more info.
This requires curl 7.18.0 or newer (released January 2008). Use curl -V to check which version you have.
You can also encode the query string:
curl --get \
    --data-urlencode "p1=value 1" \
    --data-urlencode "p2=value 2" \
    http://example.com
# http://example.com?p1=value%201&p2=value%202
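For comparison, the encoded query string shown in the comment above can be reproduced without curl. This is a minimal Python sketch using the standard urllib.parse module; it illustrates the percent-encoding itself and makes no claim about how curl builds the URL internally.

```python
from urllib.parse import quote, urlencode

params = {"p1": "value 1", "p2": "value 2"}

# quote_via=quote encodes spaces as %20; the default (quote_plus) would use '+'.
query = urlencode(params, quote_via=quote)
url = "http://example.com?" + query
print(url)
```

Both %20 and + are common encodings for a space in a query string, so it is worth checking which one the receiving server expects.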
Why can Haskell not handle characters from a specific website?
Since you said you are interested in just the links, there is no need to convert the GBK encoding to Unicode.
Here is a version which prints out all links like "123456.html" in the document:
#!/usr/bin/env stack
{- stack
  --resolver lts-6.0 --install-ghc runghc
  --package wreq --package lens
  --package tagsoup
-}
{-# LANGUAGE OverloadedStrings #-}
import Network.Wreq
import qualified Data.ByteString.Lazy.Char8 as LBS
import Control.Lens
import Text.HTML.TagSoup
import Data.Char
import Control.Monad

-- True when the string is zero or more digits followed by exactly ".html"
-- (i.e. it matches \d*\.html)
isNumberHtml lbs = (LBS.dropWhile isDigit lbs) == ".html"

-- Keep only <a> tags whose href looks like "123456.html"
wanted t = isTagOpenName "a" t && isNumberHtml (fromAttrib "href" t)

main = do
  -- Fetch the page; the body stays a raw ByteString, so GBK is never decoded
  r <- get "http://www.piaotian.net/html/7/7430/"
  let body = r ^. responseBody :: LBS.ByteString
      tags = parseTags body
      links = filter wanted tags
      hrefs = map (fromAttrib "href") links
  forM_ hrefs LBS.putStrLn
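The same idea, working on raw bytes so the GBK text never has to be decoded, can be sketched in Python with a bytes regular expression. This is not a port of the program above: it uses a regex instead of an HTML parser, requires at least one digit, and runs on a made-up page fragment instead of fetching the real page.

```python
import re

# Hypothetical page fragment: GBK-encoded link text around ASCII hrefs.
body = (
    '<a href="123456.html">第一章</a>'
    '<a href="index.html">目录</a>'
    '<a href="123457.html">第二章</a>'
).encode("gbk")

# A bytes pattern matches the ASCII hrefs directly; the GBK text is never decoded.
hrefs = re.findall(rb'<a\s[^>]*href="(\d+\.html)"', body)
for href in hrefs:
    print(href.decode("ascii"))
```

This works because the characters the pattern depends on (<, ", digits) are plain ASCII bytes that GBK does not reuse inside its two-byte characters.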