The mbrtowc() function in C/C++

In this article, we discuss how the std::mbrtowc() function from the C++ standard library works, its syntax, and examples of its use.

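What is std::mbrtowc()?

std::mbrtowc() is a built-in function of the C++ standard library, declared in the <cwchar> header (<wchar.h> in C). The name is short for "multibyte restartable to wide character": it converts the next narrow multibyte character in a string to its wide-character representation. The conversion state is kept in an object supplied by the caller, so a character that is split across two buffers can be resumed on the next call. The result depends on the LC_CTYPE category of the current C locale.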

Syntax

size_t mbrtowc( wchar_t* pwc, const char* str, size_t n, mbstate_t* ps);

Parameters

The function accepts the following parameters -

pwc − a pointer to the location where the resulting wide character is stored
str − a pointer to the multibyte character string used as input
n − the maximum number of bytes of str to examine
ps − a pointer to the conversion state object used while interpreting the multibyte string
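
To illustrate how n and ps work together, the sketch below feeds a buffer to mbrtowc() one byte at a time (n = 1) and lets the state object carry any partial character over to the next call. The helper name count_bytewise is hypothetical and exists only for this illustration; the behavior on multibyte input assumes a UTF-8 locale has been selected with setlocale().

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper (not part of any library): feeds buf to mbrtowc()
// one byte at a time and returns how many complete characters were
// produced, or -1 if an encoding error is reported.
long count_bytewise(const char* buf, std::size_t len) {
    std::mbstate_t state = std::mbstate_t();  // zero-initialized = initial state
    long chars = 0;
    for (std::size_t i = 0; i < len; ++i) {
        wchar_t wc;
        // n = 1: only the byte at buf[i] may be examined in this call.
        std::size_t r = std::mbrtowc(&wc, buf + i, 1, &state);
        if (r == static_cast<std::size_t>(-1)) return -1;  // invalid sequence
        if (r == static_cast<std::size_t>(-2)) continue;   // partial character saved in state
        ++chars;  // r is 0 (null character) or 1: a character was completed
    }
    return chars;
}
```

Under a UTF-8 locale, a two-byte character such as ß would be expected to yield (size_t)-2 for its first byte and complete on the second, so it counts once. With plain ASCII input every byte completes a character, whatever the locale.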

Return value

The return value of this function depends on the following conditions -

0 − returned when the character converted from str is the null character
1…n − the number of bytes of the multibyte character that was converted from str
(size_t)-2 − returned when the next n bytes form an incomplete, but so far valid, multibyte character; nothing is written to *pwc
(size_t)-1 − returned when an encoding error is encountered; nothing is written to *pwc and errno is set to EILSEQ
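
Because the return type is the unsigned size_t, the two special values have to be compared as (size_t)-1 and (size_t)-2 rather than tested with "< 0". The following is a minimal sketch of one way to sort the four outcomes listed above; the Step enum and the convert_one helper are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical result categories, named here for illustration only.
enum class Step { End, Converted, Incomplete, Invalid };

// Performs a single mbrtowc() call and maps its return value to a category.
// On Step::Converted, *consumed holds the number of bytes that were used;
// in every other case it is set to 0.
Step convert_one(const char* s, std::size_t n, std::mbstate_t* state,
                 wchar_t* out, std::size_t* consumed) {
    std::size_t r = std::mbrtowc(out, s, n, state);
    *consumed = 0;
    if (r == static_cast<std::size_t>(-1)) return Step::Invalid;     // encoding error
    if (r == static_cast<std::size_t>(-2)) return Step::Incomplete;  // more bytes needed
    if (r == 0) return Step::End;                                    // null character
    *consumed = r;                                                   // 1..n bytes used
    return Step::Converted;
}
```

A caller would typically advance its input pointer by *consumed after Step::Converted, stop on Step::End, and decide for itself whether Step::Incomplete means "wait for more data" or "truncated input".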

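Example

#include <clocale>
#include <cstring>
#include <cwchar>
#include <iostream>
using namespace std;
// Converts the multibyte string one character at a time and reports
// how many bytes each character occupied.
void print_(const char* ch){
   mbstate_t temp = mbstate_t();      // initial conversion state
   const char* i = ch + strlen(ch);   // one past the last byte
   size_t total;
   wchar_t con;
   // Stop on the null character (0), an encoding error ((size_t)-1),
   // or an incomplete character ((size_t)-2).
   while ((total = mbrtowc(&con, ch, i - ch, &temp)) != 0
          && total != (size_t)-1 && total != (size_t)-2){
      wcout << "Next " << total <<" bytes are the character " << con << '\n';
      ch += total;
   }
}
int main(){
   setlocale(LC_ALL, "en_US.utf8");   // this UTF-8 locale must be installed
   // UTF-8 bytes for z, ß (U+00DF) and 水 (U+6C34)
   const char* len = "z\xc3\x9f\xe6\xb0\xb4";
   print_(len);
}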

Output

Next 1 bytes are the character z
Next 2 bytes are the character ß
Next 3 bytes are the character 水

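Example

#include <clocale>
#include <cstring>
#include <cwchar>
#include <iostream>
using namespace std;
// Converts the multibyte string one character at a time and reports
// how many bytes each character occupied.
void print_(const char* ch){
   mbstate_t temp = mbstate_t();      // initial conversion state
   const char* i = ch + strlen(ch);   // one past the last byte
   size_t total;
   wchar_t con;
   // Stop on the null character (0), an encoding error ((size_t)-1),
   // or an incomplete character ((size_t)-2).
   while ((total = mbrtowc(&con, ch, i - ch, &temp)) != 0
          && total != (size_t)-1 && total != (size_t)-2){
      wcout << "Next " << total <<" bytes are the character " << con << '\n';
      ch += total;
   }
}
int main(){
   setlocale(LC_ALL, "en_US.utf8");   // this UTF-8 locale must be installed
   // ∃y∀x followed by a lone \xC2, the first byte of an unfinished
   // 2-byte character; the loop stops there with (size_t)-2.
   const char* len = "\xE2\x88\x83y\xE2\x88\x80x\xC2";
   print_(len);
}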

Output

Next 3 bytes are the character ∃
Next 1 bytes are the character y
Next 3 bytes are the character ∀
Next 1 bytes are the character x