
    Reading Files in PHP: Fixing Garbled Chinese Text by Converting to UTF-8

    Category: Code    Date: 2020-01-22 15:08

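    This article walks through several ways to read a file in PHP and end up with correct UTF-8 when the file contains Chinese text. The details are as follows: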

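    // Attempt 1: request UTF-8 through a stream context.
    // Note: the second assignment overwrites the first, so only the 'http' options are used.
    $opts = array(
      'file' => array(
        'encoding' => "utf-8"
      )
    );
    $opts = array('http' => array('encoding' => 'utf-8'));
    $ctxt = stream_context_create($opts);
    $content = file_get_contents($filePath, FILE_TEXT, $ctxt);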
    
    

    The simplest approach is to convert from GB2312 to UTF-8:

    $str = iconv("gb2312", "utf-8", $str);
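
    To make the conversion concrete, here is a minimal sketch that feeds iconv() the GBK bytes for "中文" and prints the resulting UTF-8 bytes as hex. The variable names are illustrative only, and using "GBK" instead of "gb2312" as the source name is an assumption on my part (GBK is a superset of GB2312, so it tends to cover more of what Windows writes).

```php
<?php
// "中文" as GB2312/GBK bytes, written as escapes so this script's own
// file encoding does not affect the result.
$gbk = "\xD6\xD0\xCE\xC4";

// Assumption: the source really is GBK/GB2312. iconv() returns false on failure.
$utf8 = iconv('GBK', 'UTF-8', $gbk);

echo bin2hex($utf8), PHP_EOL; // e4b8ade69687
```

    If the source encoding is guessed wrong, iconv() either fails or produces different garbage, which is why the later sections try to detect the encoding first.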
    
    

    This one does not work:

    $content = mb_convert_encoding($content, "UTF-8", "auto");
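
    Letting "auto" guess the source encoding is unreliable for files like these. A more predictable building block, sketched here as a suggestion rather than as the article's method, is to ask whether the bytes are already valid UTF-8 and only convert when they are not. The helper name looksLikeUtf8 is hypothetical, and the sketch assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: true when the byte string is well-formed UTF-8.
function looksLikeUtf8(string $bytes): bool
{
    // mb_check_encoding() validates the bytes against the named encoding.
    return mb_check_encoding($bytes, 'UTF-8');
}

$utf8 = "\xE4\xB8\xAD\xE6\x96\x87"; // "中文" encoded as UTF-8
$gbk  = "\xD6\xD0\xCE\xC4";         // "中文" encoded as GBK

var_dump(looksLikeUtf8($utf8)); // bool(true)
var_dump(looksLikeUtf8($gbk));  // bool(false)
```

    The GBK bytes fail the check because 0xD6 opens a two-byte UTF-8 sequence and 0xD0 is not a valid continuation byte.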
    
    

    The approaches above are unreliable. What follows is the method that actually works:

    define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
    define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
    define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
    define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
    define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));
    
    $text = file_get_contents($newPath);
    $first2 = substr($text, 0, 2);
    $first3 = substr($text, 0, 3);
    $first4 = substr($text, 0, 4); // a UTF-32 BOM is 4 bytes long
    $encodType = "";
    if ($first3 == UTF8_BOM)
      $encodType = 'UTF-8 BOM';
    else if ($first4 == UTF32_BIG_ENDIAN_BOM)
      $encodType = 'UTF-32BE';
    else if ($first4 == UTF32_LITTLE_ENDIAN_BOM)
      $encodType = 'UTF-32LE';
    else if ($first2 == UTF16_BIG_ENDIAN_BOM)
      $encodType = 'UTF-16BE';
    else if ($first2 == UTF16_LITTLE_ENDIAN_BOM)
      $encodType = 'UTF-16LE';
    
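    // $text already holds the file contents, so it is reused instead of reading the file again.
    // This call is only meaningful when a UTF-16 or UTF-32 BOM was found above:
    // 'UTF-8 BOM' is not an encoding name that iconv recognizes.
    $content = iconv($encodType, "utf-8", $text);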
    
    

    Final version:

    $text = file_get_contents($filePath);
    //$encodType = mb_detect_encoding($text);
    define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
    define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
    define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
    define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
    define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));
    $first2 = substr($text, 0, 2);
    $first3 = substr($text, 0, 3);
    $first4 = substr($text, 0, 4); // a UTF-32 BOM is 4 bytes long
    $encodType = "";
    if ($first3 == UTF8_BOM)
      $encodType = 'UTF-8 BOM';
    else if ($first4 == UTF32_BIG_ENDIAN_BOM)
      $encodType = 'UTF-32BE';
    else if ($first4 == UTF32_LITTLE_ENDIAN_BOM)
      $encodType = 'UTF-32LE';
    else if ($first2 == UTF16_BIG_ENDIAN_BOM)
      $encodType = 'UTF-16BE';
    else if ($first2 == UTF16_LITTLE_ENDIAN_BOM)
      $encodType = 'UTF-16LE';
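    // The checks below mainly exist to handle the ANSI-encoded case.
    if ($encodType == '') { // No BOM: a txt file created with the default ANSI encoding
      $content = iconv("GBK", "UTF-8", $text);
    } else if ($encodType == 'UTF-8 BOM') { // Already UTF-8, no conversion needed
      $content = $text;
    } else { // Every other encoding just gets converted to UTF-8
      $content = iconv($encodType, "UTF-8", $text);
    }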
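
    The same logic can be packaged as a function. The sketch below is one possible variation, not the article's code: the name toUtf8 and its $fallback parameter are hypothetical, it additionally strips the BOM from the result, and it treats BOM-less input that is already valid UTF-8 as UTF-8 instead of assuming GBK. It assumes the iconv and mbstring extensions are available.

```php
<?php
/**
 * Hypothetical helper: convert raw file bytes to UTF-8.
 * Assumes BOM-less input that is not valid UTF-8 is GBK (Windows "ANSI" on Chinese systems).
 */
function toUtf8(string $text, string $fallback = 'GBK'): string
{
    // 4-byte BOMs come first: the UTF-32LE BOM begins with the UTF-16LE BOM.
    $boms = [
        'UTF-32BE' => "\x00\x00\xFE\xFF",
        'UTF-32LE' => "\xFF\xFE\x00\x00",
        'UTF-8'    => "\xEF\xBB\xBF",
        'UTF-16BE' => "\xFE\xFF",
        'UTF-16LE' => "\xFF\xFE",
    ];

    $encoding = null;
    foreach ($boms as $name => $bom) {
        if (strncmp($text, $bom, strlen($bom)) === 0) {
            $encoding = $name;
            $text = substr($text, strlen($bom)); // drop the BOM itself
            break;
        }
    }

    if ($encoding === null) {
        // No BOM: keep valid UTF-8 as is, otherwise assume the fallback encoding.
        $encoding = mb_check_encoding($text, 'UTF-8') ? 'UTF-8' : $fallback;
    }
    if ($encoding === 'UTF-8') {
        return $text;
    }

    $converted = iconv($encoding, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $encoding to UTF-8");
    }
    return $converted;
}

// Usage: GBK bytes for "中文" come back as UTF-8.
echo bin2hex(toUtf8("\xD6\xD0\xCE\xC4")), PHP_EOL; // e4b8ade69687
```

    Checking for valid UTF-8 before falling back to GBK is a design choice: short GBK strings can occasionally also be valid UTF-8, so for files known to be ANSI it may be safer to skip that check.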
    
    

    The final version above handles .txt files created on Chinese-language Windows in any of the ANSI, UTF-8, or Unicode (Notepad's name for UTF-16LE) encodings.
