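Comparing CSV file handling behavior (2)
Last time, I wrote the following article.
In it, the possibility came up that a library "fails with an error when reading a CSV file that it wrote itself", so I actually tested it. Note that for PHP, this time I only cover 7.4.0 and later.
- Preparation
- fgetcsv/fputcsv (PHP 7.4.0 and later)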
- Super CSV (Java)
- Super CSV Annotation
- OrangeSignal CSV (Java)
- opencsv (Java)
- Summary
Preparation
The following data is used as the common test data.
- CsvTestData.php
<?php

class CsvTestData
{
    // Each entry is one test case: a list of rows, each row a list of columns.
    private static $data = array(
        array(array('なんてことない文字列', 'abc!#$%&\'()-=^~@`[]{};+:*,.<>/?_123')),
        array(array('空文字列', '')),
        array(array('カンマ', ',')),
        array(array('ダブルクォーテーション', '"')),
        array(array('バックスラッシュ', '\\')),
        array(array('空白文字', ' bbb ')),
        array(array('改行', "a\nb\r\nc")),
        array(array('テスト1', 'a"b c,d')),
        array(array('テスト2', 'a"b"c d,e,f')),
        array(array('テスト3', 'a\"b\ c\,d')),
        array(array('テスト4', 'a\"b\"c\ d,e,f')),
    );

    /**
     * @return int
     */
    public static function size()
    {
        return count(self::$data);
    }

    /**
     * @param int $idx
     * @return array
     */
    public static function get($idx)
    {
        return self::$data[$idx];
    }
}
- CsvTestData.java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvTestData {
    // Each entry is one test case: a list of rows, each row a list of columns.
    private static String[][][] data = {
        { { "なんてことない文字列", "abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123" }, },
        { { "空文字列", "" }, },
        { { "カンマ", "," }, },
        { { "ダブルクォーテーション", "\"" }, },
        { { "バックスラッシュ", "\\" }, },
        { { "空白文字", " bbb " }, },
        { { "改行", "a\nb\r\nc" }, },
        { { "テスト1", "a\"b c,d" }, },
        { { "テスト2", "a\"b\"c d,e,f" }, },
        { { "テスト3", "a\\\"b\\ c\\,d" }, },
        { { "テスト4", "a\\\"b\\\"c\\ d,e,f" }, },
    };

    public static int size() {
        return data.length;
    }

    public static List<List<String>> get(int idx) {
        List<List<String>> res = new ArrayList<>();
        for (int i = 0; i < data[idx].length; ++i) {
            res.add(new ArrayList<>(Arrays.asList(data[idx][i])));
        }
        return res;
    }
}
For Java, I also prepare a small helper equivalent to PHP's file_get_contents.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class Utils {
    // Reads the whole file into a String, decoding it with the given encoding.
    public static String getFileContents(String filename, String fileEncoding) throws IOException {
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(filename), fileEncoding))) {
            char[] buf = new char[1024];
            StringBuilder sb = new StringBuilder();
            int len;
            while ((len = reader.read(buf)) != -1) {
                sb.append(buf, 0, len);
            }
            return sb.toString();
        }
    }
}
fgetcsv/fputcsv (PHP 7.4.0 and later)
Why 7.4.0 and later? Because I want to pass an empty string as the escape argument of fputcsv. Before 7.4.0, that raises a warning saying that escape must be a single character.
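With an empty escape, the only special handling inside a quoted field is doubling the quote character, and a backslash is written as an ordinary character. The following is a minimal Java sketch of that quoting rule (Java is used so that it matches the other examples in this post). The class and method names are hypothetical, and the rule for when to add quotes is an assumption modeled on the output shown below, not PHP's actual implementation.

```java
import java.util.List;

// Hypothetical helper: RFC 4180 style quoting with no escape character.
final class Rfc4180Quote {
    private Rfc4180Quote() {}

    // Quotes a field only when it contains a comma, a quote, CR, LF, space, or tab
    // (assumed rule). Inside quotes, a double quote is doubled; backslash is literal.
    static String quoteField(String field) {
        boolean needsQuotes = false;
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == ',' || c == '"' || c == '\r' || c == '\n' || c == ' ' || c == '\t') {
                needsQuotes = true;
                break;
            }
        }
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins the quoted fields with commas and terminates the line with LF.
    static String toLine(List<String> row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quoteField(row.get(i)));
        }
        return sb.append('\n').toString();
    }
}
```

Under these assumptions, a lone backslash is written unquoted and a field such as a\"b\ c\,d is written with only its quote doubled.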
- Main.php
<?php

mb_internal_encoding('UTF-8');
require_once('CsvTestData.php');

for ($i = 0; $i < CsvTestData::size(); ++$i) {
    $filename = "test{$i}.csv";
    $rows = CsvTestData::get($i);
    $original = $rows;

    // Write
    $fp = fopen($filename, "w");
    foreach ($rows as $row) {
        foreach ($row as $key => $val) {
            $row[$key] = mb_convert_encoding($val, 'SJIS-win', 'UTF-8');
        }
        // Empty escape (PHP 7.4.0+): quotes are doubled, backslash is literal.
        fputcsv($fp, $row, ',', '"', '');
    }
    fflush($fp);
    fclose($fp);

    // File contents
    printf("[%d]\n", $i);
    printf("file = %s\n", var_export(mb_convert_encoding(file_get_contents($filename), 'UTF-8', 'SJIS-win'), true));

    // Read
    $fp = fopen($filename, "r");
    for ($j = 0; ($row = fgetcsv($fp, 0, ',', '"', '')); ++$j) {
        foreach ($row as $key => $val) {
            $row[$key] = mb_convert_encoding($val, 'UTF-8', 'SJIS-win');
        }
        printf("original = %s\n", var_export($original[$j], true));
        printf("row = %s\n", var_export($row, true));
        printf("result = %s\n", ($row === $original[$j] ? "O" : "X"));
    }
    fclose($fp);
}
Output.
[0] file = 'なんてことない文字列,"abc!#$%&\'()-=^~@`[]{};+:*,.<>/?_123" ' original = array ( 0 => 'なんてことない文字列', 1 => 'abc!#$%&\'()-=^~@`[]{};+:*,.<>/?_123', ) row = array ( 0 => 'なんてことない文字列', 1 => 'abc!#$%&\'()-=^~@`[]{};+:*,.<>/?_123', ) result = O [1] file = '空文字列, ' original = array ( 0 => '空文字列', 1 => '', ) row = array ( 0 => '空文字列', 1 => '', ) result = O [2] file = 'カンマ,"," ' original = array ( 0 => 'カンマ', 1 => ',', ) row = array ( 0 => 'カンマ', 1 => ',', ) result = O [3] file = 'ダブルクォーテーション,"""" ' original = array ( 0 => 'ダブルクォーテーション', 1 => '"', ) row = array ( 0 => 'ダブルクォーテーション', 1 => '"', ) result = O [4] file = 'バックスラッシュ,\\ ' original = array ( 0 => 'バックスラッシュ', 1 => '\\', ) row = array ( 0 => 'バックスラッシュ', 1 => '\\', ) result = O [5] file = '空白文字," bbb " ' original = array ( 0 => '空白文字', 1 => ' bbb ', ) row = array ( 0 => '空白文字', 1 => ' bbb ', ) result = O [6] file = '改行,"a b c" ' original = array ( 0 => '改行', 1 => 'a b c', ) row = array ( 0 => '改行', 1 => 'a b c', ) result = O [7] file = 'テスト1,"a""b c,d" ' original = array ( 0 => 'テスト1', 1 => 'a"b c,d', ) row = array ( 0 => 'テスト1', 1 => 'a"b c,d', ) result = O [8] file = 'テスト2,"a""b""c d,e,f" ' original = array ( 0 => 'テスト2', 1 => 'a"b"c d,e,f', ) row = array ( 0 => 'テスト2', 1 => 'a"b"c d,e,f', ) result = O [9] file = 'テスト3,"a\\""b\\ c\\,d" ' original = array ( 0 => 'テスト3', 1 => 'a\\"b\\ c\\,d', ) row = array ( 0 => 'テスト3', 1 => 'a\\"b\\ c\\,d', ) result = O [10] file = 'テスト4,"a\\""b\\""c\\ d,e,f" ' original = array ( 0 => 'テスト4', 1 => 'a\\"b\\"c\\ d,e,f', ) row = array ( 0 => 'テスト4', 1 => 'a\\"b\\"c\\ d,e,f', ) result = O
All tests pass.
Super CSV (Java)
- Main.java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.util.ArrayList;
import java.util.List;

import org.supercsv.io.CsvListReader;
import org.supercsv.io.CsvListWriter;
import org.supercsv.io.ICsvListReader;
import org.supercsv.io.ICsvListWriter;
import org.supercsv.prefs.CsvPreference;

public class Main {
    private static final String FILE_ENCODING = "Windows-31J";

    public static void main(String[] args) throws IOException {
        for (int i = 0; i < CsvTestData.size(); ++i) {
            String filename = "test" + i + ".csv";
            List<List<String>> original = CsvTestData.get(i);

            // Write
            writeCsv(original, filename);

            // File contents
            String fileContents = Utils.getFileContents(filename, FILE_ENCODING);

            // Read
            List<List<String>> rows = new ArrayList<>();
            try {
                rows = readCsv(filename);
            } catch (Exception e) {
                e.printStackTrace();
            }

            System.out.printf("[%d]%n", i);
            System.out.printf("file = '%s'%n", fileContents);
            if (original.size() == rows.size()) {
                for (int j = 0; j < rows.size(); ++j) {
                    System.out.printf("original = %s%n", original.get(j));
                    System.out.printf("row = %s%n", rows.get(j));
                    System.out.printf("result = %s%n", (rows.get(j).equals(original.get(j)) ? "O" : "X"));
                }
            } else {
                // Row counts differ, so the round trip failed as a whole.
                System.out.printf("original = %s%n", original);
                System.out.printf("row = %s%n", rows);
                System.out.printf("result = %s%n", "X");
            }
        }
    }

    private static void writeCsv(List<List<String>> rows, String filename) throws IOException {
        CsvPreference csvPref = new CsvPreference.Builder(
            CsvPreference.STANDARD_PREFERENCE).surroundingSpacesNeedQuotes(true).build();
        ICsvListWriter writer = new CsvListWriter(
            new OutputStreamWriter(new FileOutputStream(filename), FILE_ENCODING), csvPref);
        for (List<String> row : rows) {
            writer.write(row);
        }
        writer.flush();
        writer.close();
    }

    private static List<List<String>> readCsv(String filename) throws IOException {
        CsvPreference csvPref = new CsvPreference.Builder(
            CsvPreference.STANDARD_PREFERENCE).surroundingSpacesNeedQuotes(true).build();
        ICsvListReader reader = new CsvListReader(
            new BufferedReader(new InputStreamReader(new FileInputStream(filename), FILE_ENCODING)), csvPref);
        List<List<String>> res = new ArrayList<>();
        List<String> row;
        while ((row = reader.read()) != null) {
            res.add(row);
        }
        reader.close();
        return res;
    }
}
Output.
[0] file = 'なんてことない文字列,"abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123" ' original = [なんてことない文字列, abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123] row = [なんてことない文字列, abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123] result = O [1] file = '空文字列, ' original = [空文字列, ] row = [空文字列, null] result = X [2] file = 'カンマ,"," ' original = [カンマ, ,] row = [カンマ, ,] result = O [3] file = 'ダブルクォーテーション,"""" ' original = [ダブルクォーテーション, "] row = [ダブルクォーテーション, "] result = O [4] file = 'バックスラッシュ,\ ' original = [バックスラッシュ, \] row = [バックスラッシュ, \] result = O [5] file = '空白文字," bbb " ' original = [空白文字, bbb ] row = [空白文字, bbb ] result = O [6] file = '改行,"a b c" ' original = [改行, a b c] row = [改行, a b c] result = X [7] file = 'テスト1,"a""b c,d" ' original = [テスト1, a"b c,d] row = [テスト1, a"b c,d] result = O [8] file = 'テスト2,"a""b""c d,e,f" ' original = [テスト2, a"b"c d,e,f] row = [テスト2, a"b"c d,e,f] result = O [9] file = 'テスト3,"a\""b\ c\,d" ' original = [テスト3, a\"b\ c\,d] row = [テスト3, a\"b\ c\,d] result = O [10] file = 'テスト4,"a\""b\""c\ d,e,f" ' original = [テスト4, a\"b\"c\ d,e,f] row = [テスト4, a\"b\"c\ d,e,f] result = O
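I would like to say that all tests pass, but it is a pity that, by design, an empty string comes back as null.
The newline codes also seem to get converted when they are written to the file.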
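If the calling code does not need to distinguish null from an empty column, one way to live with this is to normalize the row after reading. Below is a minimal sketch with a hypothetical helper; it is plain Java and not part of Super CSV. (Super CSV also has a ConvertNullTo cell processor that I believe serves a similar purpose when cell processors are used, but it was not tried here.)

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical post-processing helper, not part of Super CSV.
final class NullColumns {
    private NullColumns() {}

    // Returns a copy of the row in which every null column is replaced by "".
    static List<String> nullToEmpty(List<String> row) {
        List<String> out = new ArrayList<>(row.size());
        for (String value : row) {
            out.add(value == null ? "" : value);
        }
        return out;
    }
}
```

Applying nullToEmpty to each row returned by the reader before comparing would make the empty-string case compare equal. The trade-off is that a genuinely missing value can no longer be told apart from an empty one.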
$ od -cx test6.csv
0000000 211 374 215   s   ,   "   a  \r  \n   b  \r  \n   c   "  \r  \n
           fc89    738d    222c    0d61    620a    0a0d    2263    0a0d
0000020
"a\nb\r\nc" has become "a\r\nb\r\nc". On top of that, since I ran this on CentOS 7 again this time, it is converted to "a\nb\nc" when read back.
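When the exact newline style inside a field does not matter, the comparison itself can be made insensitive to it. The following is a minimal sketch with hypothetical names; it changes only how the test compares values, not what the library writes or reads.

```java
// Hypothetical comparison helper for fields that contain embedded newlines.
final class Newlines {
    private Newlines() {}

    // Rewrites CRLF and lone CR as LF.
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    // True when the two strings differ at most in their newline style.
    static boolean sameIgnoringNewlineStyle(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

With this helper, the original value "a\nb\r\nc" and the value read back would be treated as equal. If the newline style has to survive the round trip, this workaround does not help.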
Super CSV Annotation
- HogeBean.java
import com.github.mygreen.supercsv.annotation.CsvBean;
import com.github.mygreen.supercsv.annotation.CsvColumn;

@CsvBean(header=false)
public class HogeBean {
    @CsvColumn(number=1)
    private String fieldA;

    @CsvColumn(number=2)
    private String fieldB;

    public HogeBean() {
    }

    public String getFieldA() {
        return fieldA;
    }

    public void setFieldA(String fieldA) {
        this.fieldA = fieldA;
    }

    public String getFieldB() {
        return fieldB;
    }

    public void setFieldB(String fieldB) {
        this.fieldB = fieldB;
    }

    // Null-safe comparison of both fields.
    public boolean equals(Object o) {
        if (o instanceof HogeBean) {
            HogeBean that = (HogeBean)o;
            return (this.fieldA == null && that.fieldA == null || this.fieldA != null && this.fieldA.equals(that.fieldA))
                && (this.fieldB == null && that.fieldB == null || this.fieldB != null && this.fieldB.equals(that.fieldB));
        }
        return false;
    }

    public String toString() {
        return "[" + fieldA + "][" + fieldB + "]";
    }
}
- Main.java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.util.ArrayList;
import java.util.List;

import com.github.mygreen.supercsv.io.CsvAnnotationBeanReader;
import com.github.mygreen.supercsv.io.CsvAnnotationBeanWriter;
import org.supercsv.prefs.CsvPreference;

public class Main {
    private static final String FILE_ENCODING = "Windows-31J";

    public static void main(String[] args) throws IOException {
        for (int i = 0; i < CsvTestData.size(); ++i) {
            String filename = "test" + i + ".csv";
            List<List<String>> tmpOriginal = CsvTestData.get(i);
            List<HogeBean> original = new ArrayList<>();
            for (List<String> list : tmpOriginal) {
                HogeBean hoge = new HogeBean();
                hoge.setFieldA(list.get(0));
                hoge.setFieldB(list.get(1));
                original.add(hoge);
            }

            // Write
            writeCsv(original, filename);

            // File contents
            String fileContents = Utils.getFileContents(filename, FILE_ENCODING);

            // Read
            List<HogeBean> rows = new ArrayList<>();
            try {
                rows = readCsv(filename);
            } catch (Exception e) {
                e.printStackTrace();
            }

            System.out.printf("[%d]%n", i);
            System.out.printf("file = '%s'%n", fileContents);
            if (original.size() == rows.size()) {
                for (int j = 0; j < rows.size(); ++j) {
                    System.out.printf("original = %s%n", original.get(j));
                    System.out.printf("row = %s%n", rows.get(j));
                    System.out.printf("result = %s%n", (rows.get(j).equals(original.get(j)) ? "O" : "X"));
                }
            } else {
                // Row counts differ, so the round trip failed as a whole.
                System.out.printf("original = %s%n", original);
                System.out.printf("row = %s%n", rows);
                System.out.printf("result = %s%n", "X");
            }
        }
    }

    private static void writeCsv(List<HogeBean> rows, String filename) throws IOException {
        CsvPreference csvPref = new CsvPreference.Builder(
            CsvPreference.STANDARD_PREFERENCE).surroundingSpacesNeedQuotes(true).build();
        CsvAnnotationBeanWriter<HogeBean> writer = new CsvAnnotationBeanWriter<>(
            HogeBean.class,
            new OutputStreamWriter(new FileOutputStream(filename), FILE_ENCODING),
            csvPref);
        for (HogeBean hoge : rows) {
            writer.write(hoge);
        }
        writer.flush();
        writer.close();
    }

    private static List<HogeBean> readCsv(String filename) throws IOException {
        CsvPreference csvPref = new CsvPreference.Builder(
            CsvPreference.STANDARD_PREFERENCE).surroundingSpacesNeedQuotes(true).build();
        CsvAnnotationBeanReader<HogeBean> reader = new CsvAnnotationBeanReader<>(
            HogeBean.class,
            new BufferedReader(new InputStreamReader(new FileInputStream(filename), FILE_ENCODING)),
            csvPref);
        List<HogeBean> res = new ArrayList<>();
        HogeBean hoge;
        while ((hoge = reader.read()) != null) {
            res.add(hoge);
        }
        reader.close();
        return res;
    }
}
Output.
[0] file = 'なんてことない文字列,"abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123" ' original = [なんてことない文字列][abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123] row = [なんてことない文字列][abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123] result = O [1] file = '空文字列, ' original = [空文字列][] row = [空文字列][null] result = X [2] file = 'カンマ,"," ' original = [カンマ][,] row = [カンマ][,] result = O [3] file = 'ダブルクォーテーション,"""" ' original = [ダブルクォーテーション]["] row = [ダブルクォーテーション]["] result = O [4] file = 'バックスラッシュ,\ ' original = [バックスラッシュ][\] row = [バックスラッシュ][\] result = O [5] file = '空白文字," bbb " ' original = [空白文字][ bbb ] row = [空白文字][ bbb ] result = O [6] file = '改行,"a b c" ' original = [改行][a b c] row = [改行][a b c] result = X [7] file = 'テスト1,"a""b c,d" ' original = [テスト1][a"b c,d] row = [テスト1][a"b c,d] result = O [8] file = 'テスト2,"a""b""c d,e,f" ' original = [テスト2][a"b"c d,e,f] row = [テスト2][a"b"c d,e,f] result = O [9] file = 'テスト3,"a\""b\ c\,d" ' original = [テスト3][a\"b\ c\,d] row = [テスト3][a\"b\ c\,d] result = O [10] file = 'テスト4,"a\""b\""c\ d,e,f" ' original = [テスト4][a\"b\"c\ d,e,f] row = [テスト4][a\"b\"c\ d,e,f] result = O
As with Super CSV, it is a pity that this one also turns an empty string into null. The same goes for the newline codes.
OrangeSignal CSV (Java)
- HogeBean.java
import com.orangesignal.csv.annotation.CsvColumn;
import com.orangesignal.csv.annotation.CsvEntity;

@CsvEntity(header=false)
public class HogeBean {
    @CsvColumn(position=0)
    private String fieldA;

    @CsvColumn(position=1)
    private String fieldB;

    public HogeBean() {
    }

    public String getFieldA() {
        return fieldA;
    }

    public void setFieldA(String fieldA) {
        this.fieldA = fieldA;
    }

    public String getFieldB() {
        return fieldB;
    }

    public void setFieldB(String fieldB) {
        this.fieldB = fieldB;
    }

    // Null-safe comparison of both fields.
    public boolean equals(Object o) {
        if (o instanceof HogeBean) {
            HogeBean that = (HogeBean)o;
            return (this.fieldA == null && that.fieldA == null || this.fieldA != null && this.fieldA.equals(that.fieldA))
                && (this.fieldB == null && that.fieldB == null || this.fieldB != null && this.fieldB.equals(that.fieldB));
        }
        return false;
    }

    public String toString() {
        return "[" + fieldA + "][" + fieldB + "]";
    }
}
- Main.java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.util.ArrayList;
import java.util.List;

import com.orangesignal.csv.annotation.CsvColumnException;
import com.orangesignal.csv.CsvConfig;
import com.orangesignal.csv.CsvReader;
import com.orangesignal.csv.CsvWriter;
import com.orangesignal.csv.io.CsvEntityReader;
import com.orangesignal.csv.io.CsvEntityWriter;

public class Main {
    private static final String FILE_ENCODING = "Windows-31J";

    public static void main(String[] args) throws IOException {
        for (int i = 0; i < CsvTestData.size(); ++i) {
            String filename = "test" + i + ".csv";
            List<List<String>> tmpOriginal = CsvTestData.get(i);
            List<HogeBean> original = new ArrayList<>();
            for (List<String> list : tmpOriginal) {
                HogeBean hoge = new HogeBean();
                hoge.setFieldA(list.get(0));
                hoge.setFieldB(list.get(1));
                original.add(hoge);
            }

            // Write
            writeCsv(original, filename);

            // File contents
            String fileContents = Utils.getFileContents(filename, FILE_ENCODING);

            // Read
            List<HogeBean> rows = new ArrayList<>();
            try {
                rows = readCsv(filename);
            } catch (Exception e) {
                e.printStackTrace();
            }

            System.out.printf("[%d]%n", i);
            System.out.printf("file = '%s'%n", fileContents);
            if (original.size() == rows.size()) {
                for (int j = 0; j < rows.size(); ++j) {
                    System.out.printf("original = %s%n", original.get(j));
                    System.out.printf("row = %s%n", rows.get(j));
                    System.out.printf("result = %s%n", (rows.get(j).equals(original.get(j)) ? "O" : "X"));
                }
            } else {
                // Row counts differ, so the round trip failed as a whole.
                System.out.printf("original = %s%n", original);
                System.out.printf("row = %s%n", rows);
                System.out.printf("result = %s%n", "X");
            }
        }
    }

    private static void writeCsv(List<HogeBean> rows, String filename) throws IOException {
        CsvConfig cfg = new CsvConfig(',', '"', '"');
        CsvEntityWriter<HogeBean> writer = CsvEntityWriter.newInstance(new CsvWriter(
            new OutputStreamWriter(new FileOutputStream(filename), FILE_ENCODING), cfg), HogeBean.class);
        for (HogeBean hoge : rows) {
            writer.write(hoge);
        }
        writer.flush();
        writer.close();
    }

    private static List<HogeBean> readCsv(String filename) throws IOException {
        CsvConfig cfg = new CsvConfig(',', '"', '"');
        CsvEntityReader<HogeBean> reader = CsvEntityReader.newInstance(new CsvReader(
            new BufferedReader(new InputStreamReader(new FileInputStream(filename), FILE_ENCODING)), cfg), HogeBean.class);
        List<HogeBean> res = new ArrayList<>();
        while (true) {
            HogeBean hoge;
            try {
                if ((hoge = reader.read()) == null) {
                    break;
                }
                res.add(hoge);
            } catch (CsvColumnException e) {
                e.printStackTrace();
                continue;
            } catch (RuntimeException e) {
                // Be as defensive as possible: log it and stop reading.
                e.printStackTrace();
                break;
            }
        }
        reader.close();
        return res;
    }
}
Output.
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [0] file = '"なんてことない文字列","abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123" ' original = [なんてことない文字列][abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123] row = [なんてことない文字列][abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [1] file = '"空文字列","" ' original = [空文字列][] row = [空文字列][] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at 
com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [2] file = '"カンマ","," ' original = [カンマ][,] row = [カンマ][,] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [3] file = '"ダブルクォーテーション","""" ' original = [ダブルクォーテーション]["] row = [ダブルクォーテーション]["] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [4] file = '"バックスラッシュ","\" ' original = [バックスラッシュ][\] row = [バックスラッシュ][\] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [5] file = '"空白文字"," bbb " ' original = [空白文字][ bbb ] row = [空白文字][ bbb ] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at 
java.lang.Thread.run(Thread.java:748) [6] file = '"改行","a b c" ' original = [改行][a b c] row = [改行][a b c] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [7] file = '"テスト1","a""b c,d" ' original = [テスト1][a"b c,d] row = [テスト1][a"b c,d] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [8] file = '"テスト2","a""b""c d,e,f" ' original = [テスト2][a"b"c d,e,f] row = [テスト2][a"b"c d,e,f] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at 
com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [9] file = '"テスト3","a\""b\ c\,d" ' original = [テスト3][a\"b\ c\,d] row = [テスト3][a\"b\ c\,d] result = O java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at com.orangesignal.csv.io.CsvEntityReader.convert(CsvEntityReader.java:300) at com.orangesignal.csv.io.CsvEntityReader.read(CsvEntityReader.java:198) at Main.readCsv(Main.java:82) at Main.main(Main.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) [10] file = '"テスト4","a\""b\""c\ d,e,f" ' original = [テスト4][a\"b\"c\ d,e,f] row = [テスト4][a\"b\"c\ d,e,f] result = O
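Looking only at the results, everything appears to be OK, but the biggest drawback is that an IndexOutOfBoundsException is thrown. What kind of library fails with an error when reading a CSV file that it wrote itself?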
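The message "Index: 1, Size: 1" is what ArrayList.get(1) reports for a list holding a single element, so somewhere a row with only one value was accessed at position 1. Which input line produces that one-value row was not investigated here. Purely as an illustration of that failure mode, the following sketch shows a bounds-checked column lookup; the helper is hypothetical, is not part of OrangeSignal CSV, and makes no claim about how the library works internally.

```java
import java.util.List;

// Hypothetical guard for position-based column access.
final class RowGuard {
    private RowGuard() {}

    // Returns the column at the given position, or null when the row is too short.
    static String columnOrNull(List<String> row, int position) {
        if (position < 0 || position >= row.size()) {
            return null;
        }
        return row.get(position);
    }
}
```

With such a guard, a row that has fewer values than the bean has columns yields null for the missing positions instead of throwing.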
opencsv (Java)
- HogeBean.java
import com.opencsv.bean.CsvBindByPosition;

public class HogeBean {
    @CsvBindByPosition(position=0)
    private String fieldA;

    @CsvBindByPosition(position=1)
    private String fieldB;

    public HogeBean() {
    }

    public String getFieldA() {
        return fieldA;
    }

    public void setFieldA(String fieldA) {
        this.fieldA = fieldA;
    }

    public String getFieldB() {
        return fieldB;
    }

    public void setFieldB(String fieldB) {
        this.fieldB = fieldB;
    }

    // Null-safe comparison of both fields.
    public boolean equals(Object o) {
        if (o instanceof HogeBean) {
            HogeBean that = (HogeBean)o;
            return (this.fieldA == null && that.fieldA == null || this.fieldA != null && this.fieldA.equals(that.fieldA))
                && (this.fieldB == null && that.fieldB == null || this.fieldB != null && this.fieldB.equals(that.fieldB));
        }
        return false;
    }

    public String toString() {
        return "[" + fieldA + "][" + fieldB + "]";
    }
}
- Main.java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

import com.opencsv.bean.CsvToBean;
import com.opencsv.bean.CsvToBeanBuilder;
import com.opencsv.bean.StatefulBeanToCsv;
import com.opencsv.bean.StatefulBeanToCsvBuilder;
import com.opencsv.exceptions.CsvDataTypeMismatchException;
import com.opencsv.exceptions.CsvRequiredFieldEmptyException;

public class Main {
    private static final String FILE_ENCODING = "Windows-31J";

    public static void main(String[] args) throws IOException {
        for (int i = 0; i < CsvTestData.size(); ++i) {
            String filename = "test" + i + ".csv";
            List<List<String>> tmpOriginal = CsvTestData.get(i);
            List<HogeBean> original = new ArrayList<>();
            for (List<String> list : tmpOriginal) {
                HogeBean hoge = new HogeBean();
                hoge.setFieldA(list.get(0));
                hoge.setFieldB(list.get(1));
                original.add(hoge);
            }

            // Write
            writeCsv(original, filename);

            // File contents
            String fileContents = Utils.getFileContents(filename, FILE_ENCODING);

            // Read
            List<HogeBean> rows = new ArrayList<>();
            try {
                rows = readCsv(filename);
            } catch (Exception e) {
                e.printStackTrace();
            }

            System.out.printf("[%d]%n", i);
            System.out.printf("file = '%s'%n", fileContents);
            if (original.size() == rows.size()) {
                for (int j = 0; j < rows.size(); ++j) {
                    System.out.printf("original = %s%n", original.get(j));
                    System.out.printf("row = %s%n", rows.get(j));
                    System.out.printf("result = %s%n", (rows.get(j).equals(original.get(j)) ? "O" : "X"));
                }
            } else {
                // Row counts differ, so the round trip failed as a whole.
                System.out.printf("original = %s%n", original);
                System.out.printf("row = %s%n", rows);
                System.out.printf("result = %s%n", "X");
            }
        }
    }

    private static void writeCsv(List<HogeBean> rows, String filename) throws IOException {
        // Keep the Writer in a variable so that flush() can be called on it.
        Writer w = new OutputStreamWriter(new FileOutputStream(filename), FILE_ENCODING);
        StatefulBeanToCsv<HogeBean> writer = new StatefulBeanToCsvBuilder<HogeBean>(
            w
        ).build();
        for (HogeBean hoge : rows) {
            try {
                writer.write(hoge);
            } catch (CsvDataTypeMismatchException|CsvRequiredFieldEmptyException e) {
                e.printStackTrace();
            }
        }
        w.flush();
        w.close();
    }

    private static List<HogeBean> readCsv(String filename) throws IOException {
        Reader r = new BufferedReader(new InputStreamReader(new FileInputStream(filename), FILE_ENCODING));
        CsvToBean<HogeBean> reader = new CsvToBeanBuilder<HogeBean>(
            r
        ).withType(HogeBean.class).build();
        List<HogeBean> res = new ArrayList<>();
        try {
            for (HogeBean hoge : reader) {
                res.add(hoge);
            }
        } catch (RuntimeException e) {
            e.printStackTrace();
        } finally {
            r.close();
        }
        return res;
    }
}
Output.
[0] file = '"なんてことない文字列","abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123" ' original = [なんてことない文字列][abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123] row = [なんてことない文字列][abc!#$%&'()-=^~@`[]{};+:*,.<>/?_123] result = O [1] file = '"空文字列","" ' original = [空文字列][] row = [空文字列][] result = O [2] file = '"カンマ","," ' original = [カンマ][,] row = [カンマ][,] result = O [3] file = '"ダブルクォーテーション","""" ' original = [ダブルクォーテーション]["] row = [ダブルクォーテーション]["] result = O java.lang.RuntimeException: Error capturing CSV header! at com.opencsv.bean.CsvToBean.prepareToReadInput(CsvToBean.java:304) at com.opencsv.bean.CsvToBean.iterator(CsvToBean.java:322) at Main.readCsv(Main.java:89) at Main.main(Main.java:44) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) Caused by: com.opencsv.exceptions.CsvMalformedLineException: Unterminated quoted field at end of CSV line. Beginning of lost text: [" ] at com.opencsv.CSVReader.primeNextRecord(CSVReader.java:245) at com.opencsv.CSVReader.flexibleRead(CSVReader.java:598) at com.opencsv.CSVReader.peek(CSVReader.java:574) at com.opencsv.bean.ColumnPositionMappingStrategy.captureHeader(ColumnPositionMappingStrategy.java:72) at com.opencsv.bean.CsvToBean.prepareToReadInput(CsvToBean.java:302) ... 
9 more [4] file = '"バックスラッシュ","\" ' original = [[バックスラッシュ][\]] row = [] result = X [5] file = '"空白文字"," bbb " ' original = [空白文字][ bbb ] row = [空白文字][ bbb ] result = O [6] file = '"改行","a b c" ' original = [改行][a b c] row = [改行][a b c] result = X [7] file = '"テスト1","a""b c,d" ' original = [テスト1][a"b c,d] row = [テスト1][a"b c,d] result = O [8] file = '"テスト2","a""b""c d,e,f" ' original = [テスト2][a"b"c d,e,f] row = [テスト2][a"b"c d,e,f] result = O java.lang.RuntimeException: Error capturing CSV header! at com.opencsv.bean.CsvToBean.prepareToReadInput(CsvToBean.java:304) at com.opencsv.bean.CsvToBean.iterator(CsvToBean.java:322) at Main.readCsv(Main.java:89) at Main.main(Main.java:44) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282) at java.lang.Thread.run(Thread.java:748) Caused by: com.opencsv.exceptions.CsvMalformedLineException: Unterminated quoted field at end of CSV line. Beginning of lost text: [a""b c,d ] at com.opencsv.CSVReader.primeNextRecord(CSVReader.java:245) at com.opencsv.CSVReader.flexibleRead(CSVReader.java:598) at com.opencsv.CSVReader.peek(CSVReader.java:574) at com.opencsv.bean.ColumnPositionMappingStrategy.captureHeader(ColumnPositionMappingStrategy.java:72) at com.opencsv.bean.CsvToBean.prepareToReadInput(CsvToBean.java:302) ... 9 more [9] file = '"テスト3","a\""b\ c\,d" ' original = [[テスト3][a\"b\ c\,d]] row = [] result = X [10] file = '"テスト4","a\""b\""c\ d,e,f" ' original = [テスト4][a\"b\"c\ d,e,f] row = [テスト4][a""b""c d,e,f] result = X
With this one, every case involving a backslash (the escape character) fails.
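The file contents above show that the writer doubles quotes and leaves backslashes untouched, while the failures are consistent with the reading side treating a backslash as an escape character. For contrast, here is a minimal sketch of a single-line field splitter that follows the RFC 4180 convention, where only a doubled quote is special inside a quoted field. The class name is hypothetical, it handles one physical line only (no embedded newlines), and it is not opencsv code. As far as I know, opencsv also provides an RFC4180Parser class intended for this kind of input, but it was not tried in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-line splitter: backslash is an ordinary character.
final class Rfc4180Line {
    private Rfc4180Line() {}

    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote -> one literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // includes backslash and comma
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

Under this convention, a quoted field holding a single backslash, or the "a\""b\ c\,d" field from test 3, is read back unchanged apart from the doubled quote collapsing to one.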