Solving problems caused by emoji in WeChat nicknames: user list fails to display, Excel export errors, and similar issues

Keywords: ASP.NET emoji Excel PHP Java

In a recent project, everything worked normally after launch. After a while, administrators reported that exporting users to Excel failed with an error and that the user list on the front end would not display. The cause turned out to be emoji in users' WeChat nicknames.

Introduction to emoji

The WeChat interface delivers emoji as raw UTF-8 byte strings without decoding them. As a result, emoji sent by WeChat users show up on the receiving side as square boxes or undisplayable characters, and they need to be transcoded.

In fact, every emoji has a corresponding Unicode code point. When parsing emoji in text that users send to an official account, we can match or store them by code point. Likewise, when sending text messages containing emoji to users, we can encode the emoji from their code points before sending.
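To make the code-point idea concrete, here is a minimal C# sketch. The class and method names are illustrative only and do not come from the original project. It shows that an emoji above U+FFFF is a single code point, but occupies two `char` values (a surrogate pair) in a .NET string and four bytes in UTF-8.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class EmojiCodePointDemo
{
    // Formats the code point that starts at index 0 as "U+XXXX".
    public static string FirstCodePoint(string text)
    {
        int codePoint = char.ConvertToUtf32(text, 0);
        return "U+" + codePoint.ToString("X4");
    }

    // Number of UTF-16 code units (char values) the string occupies.
    public static int Utf16Length(string text)
    {
        return text.Length;
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    public static int Utf8Length(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }
}
```

For the grinning face emoji written as `"\U0001F600"`, `FirstCodePoint` returns `U+1F600`, `Utf16Length` returns 2, and `Utf8Length` returns 4. Code that assumes one `char` per character, or at most three UTF-8 bytes per character, can therefore break on such nicknames.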

I searched online and found only PHP and Java solutions. I tested them, but they did not solve the problem (quite a pit). I kept looking, then adapted the code and consulted friends, which finally solved it.

 

The simple, blunt approach I used is to filter the emoji characters out directly. No problems have turned up so far.

        #region Remove emoji
        // Requires: using System.Text;

        /// <summary>
        /// Returns true if a UTF-16 code unit falls in a range treated as emoji.
        /// </summary>
        /// <param name="codePoint">A single char (UTF-16 code unit) from the string</param>
        /// <returns>true if the char should be filtered out</returns>
        public static bool isEmojiCharacter(char codePoint)
        {
            return (codePoint >= 0x2600 && codePoint <= 0x27BF) // Miscellaneous Symbols and Dingbats
                   || codePoint == 0x303D
                   || codePoint == 0x2049
                   || codePoint == 0x203C
                   // Parts of the General Punctuation block
                   || (codePoint >= 0x2000 && codePoint <= 0x200F)
                   || (codePoint >= 0x2028 && codePoint <= 0x202F)
                   || codePoint == 0x205F
                   || (codePoint >= 0x2065 && codePoint <= 0x206F)
                   || (codePoint >= 0x2100 && codePoint <= 0x214F) // Letterlike Symbols
                   || (codePoint >= 0x2300 && codePoint <= 0x23FF) // Miscellaneous Technical
                   || (codePoint >= 0x2B00 && codePoint <= 0x2BFF) // Miscellaneous Symbols and Arrows
                   || (codePoint >= 0x2900 && codePoint <= 0x297F) // Supplemental Arrows-B
                   || (codePoint >= 0x3200 && codePoint <= 0x32FF) // Enclosed CJK Letters and Months
                   || (codePoint >= 0xD800 && codePoint <= 0xDFFF) // Surrogates: covers every character above U+FFFF
                   || (codePoint >= 0xE000 && codePoint <= 0xF8FF) // Private Use Area
                   || (codePoint >= 0xFE00 && codePoint <= 0xFE0F); // Variation Selectors
            // There is no check for codePoint >= 0x10000: a char never exceeds 0xFFFF,
            // so characters on the supplementary planes are caught by the surrogate range.
        }

        /// <summary>
        /// Checks whether the string contains any emoji character.
        /// </summary>
        /// <param name="source">The string to check</param>
        /// <returns>true if at least one char is treated as emoji</returns>
        public static bool containsEmoji(String source)
        {
            if (string.IsNullOrEmpty(source))
            {
                return false;
            }

            foreach (char codePoint in source)
            {
                if (isEmojiCharacter(codePoint))
                {
                    return true;
                }
            }

            return false;
        }

        /// <summary>
        /// Filters emoji and other non-text characters out of the string.
        /// </summary>
        /// <param name="source">The string to filter, for example a WeChat nickname</param>
        /// <returns>The string without emoji, or "" for null or whitespace-only input</returns>
        public static String filterEmoji(String source)
        {
            if (string.IsNullOrWhiteSpace(source))
            {
                return "";
            }

            // String.Replace matches literal text, not a regular expression, so a call such as
            // source.Replace("[^\\u0000-\\uFFFF]", "") has no effect and is not used here.
            // Characters above U+FFFF are removed below through the surrogate range.
            source = source.Replace("??", ""); // strip literal "??" sequences

            if (!containsEmoji(source))
            {
                return source; // nothing to filter, return the string as-is
            }

            // Allocate up front so that a string made only of emoji yields "".
            StringBuilder buf = new StringBuilder(source.Length);

            foreach (char codePoint in source)
            {
                if (!isEmojiCharacter(codePoint))
                {
                    buf.Append(codePoint);
                }
            }

            return buf.ToString();
        }
        #endregion
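As a more compact alternative to the loop in the listing, the same kind of filtering can be sketched with a regular expression. This is an illustration under stated assumptions, not the code used in the project: the class name `NicknameSanitizer`, the method `StripEmoji`, and the shortened range list are all hypothetical. It relies on the fact that .NET regular expressions match UTF-16 code units, so matching the surrogate range removes every character above U+FFFF.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical alternative, for illustration only.
public static class NicknameSanitizer
{
    // Surrogates (all characters above U+FFFF), Miscellaneous Symbols and
    // Dingbats, and variation selectors. Deliberately shorter than the
    // full range list used in the listing.
    private static readonly Regex EmojiPattern = new Regex(
        @"[\uD800-\uDFFF\u2600-\u27BF\uFE00-\uFE0F]",
        RegexOptions.Compiled);

    // Returns the nickname with all matching characters removed,
    // or "" when the input is null or empty.
    public static string StripEmoji(string nickname)
    {
        if (string.IsNullOrEmpty(nickname))
        {
            return "";
        }

        return EmojiPattern.Replace(nickname, "");
    }
}
```

With this sketch, `NicknameSanitizer.StripEmoji("Tom\U0001F600")` returns `"Tom"`. The trade-off is the same as in the loop version: the nickname loses information, so it may be worth storing the original value separately if it is needed later.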

Front end

 

Success...

At this point, the problems caused by emoji in WeChat nicknames are solved: the list that would not display and the error when exporting to Excel.

The code is not perfect and there is still room for optimization. Many thanks to "burning ice" for the help.

Posted by senyo on Thu, 16 May 2019 06:55:03 -0700