CSV and Character Encoding: Avoiding Garbled Text with UTF-8 with BOM

 learningBOX

This is Nishimura, the company representative. It's been a while since my last blog post.
To keep CSV files from being garbled in Excel, we now add a BOM.
 

Contents

  • 1. learningBOX is an LMS with multilingual support
  • 2. Multilingual support in CSV was incomplete
  • 3. Made UTF-8 selectable
  • 4. Added a BOM to prevent garbled text in Excel
  • 5. Release schedule
  • 6. We're recruiting!

 

learningBOX is an LMS with multilingual support

Our e-learning system "learningBOX" is an LMS that supports multiple languages.
The UI of learningBOX is currently available only in Japanese and English, but teaching materials and learners' answers can be written not just in Japanese and English but also in Chinese, Korean, Vietnamese, and other languages from around the world.
In fact, teaching materials written in each country's language are already used in training programs for foreign nationals.

Multilingual support - e-learning
 

Multilingual support in CSV was incomplete.

Despite claiming to support multiple languages, the CSV support was inadequate.
In the currently released version (2.14.28), the CSV encoding is fixed to Shift_JIS (Windows-31J).
As a result, any character that Shift_JIS cannot represent is garbled: everything except Japanese, English, and some Chinese and Latin characters.
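To see why this happens, here is a minimal sketch that round-trips text through Shift_JIS and checks whether it comes back unchanged. It assumes PHP's mbstring extension is available, and `survivesShiftJis` is a hypothetical helper name used only for illustration, not part of learningBOX.

```php
<?php
// Hypothetical helper: true if $text comes back unchanged after a
// UTF-8 -> Windows-31J (SJIS-win) -> UTF-8 round trip.
function survivesShiftJis(string $text): bool
{
    $sjis = mb_convert_encoding($text, 'SJIS-win', 'UTF-8');
    $back = mb_convert_encoding($sjis, 'UTF-8', 'SJIS-win');
    return $back === $text;
}

var_dump(survivesShiftJis('日本語 English')); // Japanese and ASCII
var_dump(survivesShiftJis('한국어'));          // Korean (Hangul)
var_dump(survivesShiftJis('Tiếng Việt'));     // Vietnamese
```

Characters with no Shift_JIS equivalent are replaced by mbstring's substitute character (`?` by default), so only the first call returns true.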

CSV - garbled
 

Made UTF-8 selectable

There was a proposal to switch the character encoding of learningBOX's CSV output to UTF-8 outright. We decided that suddenly changing the specification would have too great an impact on users, so instead we made the CSV encoding selectable between UTF-8 and Shift_JIS. (The default is Shift_JIS.)
Garbled text - e-learning
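A selectable encoding could look roughly like the sketch below. This is not the actual learningBOX code: the function name `encodeCsv` and its parameters are hypothetical, and it assumes the CSV text is held internally as UTF-8 and that the mbstring extension is available.

```php
<?php
// Hypothetical sketch: encode CSV text (held as UTF-8) in the encoding the
// user selected. Shift_JIS stays the default so existing users see no change.
function encodeCsv(string $csvUtf8, string $encoding = 'Shift_JIS'): string
{
    switch ($encoding) {
        case 'UTF-8':
            return $csvUtf8; // already UTF-8, nothing to convert
        case 'Shift_JIS':
            // SJIS-win is mbstring's name for Windows-31J
            return mb_convert_encoding($csvUtf8, 'SJIS-win', 'UTF-8');
        default:
            throw new InvalidArgumentException("Unsupported encoding: $encoding");
    }
}

$csv = "name,score\r\n山田,80\r\n";
$sjisBytes = encodeCsv($csv);          // default: Shift_JIS
$utf8Bytes = encodeCsv($csv, 'UTF-8'); // opt-in: UTF-8
```

Rejecting unknown encoding names explicitly keeps a typo in a setting from silently producing a file in an unexpected encoding.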

 

I added a BOM to prevent garbled text in Excel.

When a UTF-8 CSV without a BOM is opened in Excel, the text is garbled.
BOM stands for Byte Order Mark. Placed at the very beginning of a file, it indicates that the file's character encoding is UTF-8.
By outputting UTF-8 with a BOM, the file can now be opened in Excel without the characters being garbled.
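As a quick way to tell the two kinds of file apart, the sketch below checks whether some data starts with the UTF-8 BOM, which is the byte sequence EF BB BF. The constant `UTF8_BOM` and the helper `hasUtf8Bom` are hypothetical names chosen for this example.

```php
<?php
// The UTF-8 BOM is the byte sequence EF BB BF at the very start of the data.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if $bytes starts with the UTF-8 BOM.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

var_dump(hasUtf8Bom("\xEF\xBB\xBFname,score")); // bool(true)
var_dump(hasUtf8Bom("name,score"));             // bool(false)
```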

A long time ago, it was common to avoid garbled characters by producing the file in UTF-16LE, but relatively recent versions of Excel work better with UTF-8 with BOM. In addition, character encoding on the Web is converging on UTF-8, so considering ease of handling outside Excel, I chose UTF-8 rather than UTF-16LE.
Reference: How to output Unicode CSV that opens correctly in both Windows and Mac Excel
 

How to add a BOM in PHP

The BOM itself is three bytes of data, written as \xEF\xBB\xBF. Placing these three bytes at the beginning of a file makes it a UTF-8 file with a BOM. In PHP, you can create such a file as follows.
$csv = "\xEF\xBB\xBF".$csv;
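For context, here is a slightly fuller sketch of the same idea: it builds the CSV text from rows with `fputcsv` on an in-memory stream and then prepends the BOM. The helper name `buildCsvWithBom` and the sample rows are assumptions made for illustration, not the actual learningBOX implementation.

```php
<?php
// Hypothetical helper: turn rows into CSV text and prepend the UTF-8 BOM.
function buildCsvWithBom(array $rows): string
{
    // Write to a temporary in-memory stream so no file is touched.
    $stream = fopen('php://temp', 'r+');
    foreach ($rows as $row) {
        fputcsv($stream, $row, ',', '"', '\\');
    }
    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);

    // The three BOM bytes must come before everything else in the output.
    return "\xEF\xBB\xBF" . $csv;
}

$out = buildCsvWithBom([['name', 'score'], ['Taro', 80]]);
echo bin2hex(substr($out, 0, 3)), "\n"; // efbbbf
```

The rows must already be UTF-8 strings. The BOM only labels the encoding and does not convert anything.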
 

Release schedule

Depending on the results of the QA department's verification, we expect to release this in an update next week or the week after.
For the release status of learningBOX, see the release notes.
 

We're recruiting!

We are looking for learningBOX developers.
We have openings for backend engineers, frontend engineers, quality assurance engineers, project managers, and more.
We recruit development engineers mainly at our head office in Tatsuno, but we are also actively recruiting infrastructure engineers in Tokyo, so please apply if you are interested. If you have studied Linux and networking at an infrastructure school, you are welcome too!
For more information about job openings at Tatsuno Information System, see this article.
 
