Best way to read a large file containing a single line

2018-06-28 01:35:17

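I ran a small experiment on reading files that contain a single-line string and a multi-line string. The single-line file has a length of 198890 and the multi-line file has a length of 208890. I tested both with the six methods below and recorded the read time and the length of the resulting string. The test method, the results, and the implementations follow.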

My actual concern is reading a large file that contains a single line of text. Judging by the results, IOUtils looks better than the others.

So, apart from the methods I implemented below, is there a better approach I could use?

Results (time in seconds; 0 means less than 1 second):

iOTest(). : Single Line Test...
singleStr.txt is deleted!
writeToFile().198890 lenghted String wrote to the file
[ReadWithBufferedReaderByLine] Text length: 198890, Total time: 18
[ReadWithBufferedReaderToCharArray] Text length: 204800, Total time: 8
[ReadWithStreamToByteArray] Text length: 198890, Total time: 8
[ReadWithStreamToByteArrayChunks] Text length: 1950, Total time: 1
[ReadFromApacheFileUtils] Text length: 198890, Total time: 30
[ReadFromApacheIOUtils] Text length: 198890, Total time: 1

iOTest(). : Multi Line Test...
multiStr.txt is deleted!
writeToFile().208890 lenghted String wrote to the file
[ReadWithBufferedReaderByLine] Text length: 198890, Total time: 15
[ReadWithBufferedReaderToCharArray] Text length: 212992, Total time: 2
[ReadWithStreamToByteArray] Text length: 208890, Total time: 1
[ReadWithStreamToByteArrayChunks] Text length: 2040, Total time: 2
[ReadFromApacheFileUtils] Text length: 208890, Total time: 0
[ReadFromApacheIOUtils] Text length: 208890, Total time: 1

Test method:

public void iOTester(){

        System.out.println("\niOTester(). : Single Line Test...");

        // One long line: 10000 tokens, 198890 characters in total.
        String testStr = "";
        for(int i = 0; i < 10000; i++)  testStr += "[Namal"+i+"Fernando] ";

        writeToFile("singleStr.txt", testStr);

        readWithBufferedReaderByLine("singleStr.txt");
        readWithBufferedReaderToCharArray("singleStr.txt");
        readWithStreamToByteArray("singleStr.txt");
        readWithStreamToByteArrayChunks("singleStr.txt");
        readFromApacheFileUtils("singleStr.txt");
        readFromApacheIOUtils("singleStr.txt");

        System.out.println("\niOTester(). : Multi Line Test...");

        // Same tokens, each followed by a newline: 208890 characters in total.
        testStr = "";
        for(int i = 0; i < 10000; i++) testStr += "[Namal"+i+"Fernando] \n";

        writeToFile("multiStr.txt", testStr);

        readWithBufferedReaderByLine("multiStr.txt");
        readWithBufferedReaderToCharArray("multiStr.txt");
        readWithStreamToByteArray("multiStr.txt");
        readWithStreamToByteArrayChunks("multiStr.txt");
        readFromApacheFileUtils("multiStr.txt");
        readFromApacheIOUtils("multiStr.txt");

    }
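
The post does not show how each reader was timed. A minimal sketch of one way to do it is below. The class `ReadTimer`, its `Result` type, and the use of `System.nanoTime()` with millisecond output are assumptions for illustration, not the code used for the results above.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of the original test code.
final class ReadTimer {

    /** Outcome of one timed read: the text length and the elapsed milliseconds. */
    static final class Result {
        final int length;
        final long millis;

        Result(int length, long millis) {
            this.length = length;
            this.millis = millis;
        }
    }

    // Runs the reader once and measures wall-clock time around it.
    static Result time(String label, Supplier<String> reader) {
        long start = System.nanoTime();
        String text = reader.get();
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("[" + label + "] Text length: " + text.length()
                + ", Total time: " + millis);
        return new Result(text.length(), millis);
    }

    public static void main(String[] args) {
        // Any of the six readers could be passed in place of this in-memory stand-in.
        time("InMemory", () -> "[Namal0Fernando] ");
    }
}
```

A single run per method is noisy: the first call to a library usually also pays for class loading, and the OS may serve later reads from its file cache, so repeating each read several times and discarding the first runs would likely give more comparable numbers.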

Implementations:

Method 1: (ReadWithBufferedReaderByLine)

BufferedReader  br          = new BufferedReader(new FileReader(file));
String          line        = null;
StringBuilder   sb          = new StringBuilder();

// readLine() strips line terminators, so newlines are not part of the result.
while ((line = br.readLine()) != null) {
    sb.append(line);
}
br.close();
String          text        = sb.toString();
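
This method reports 198890 characters for the multi-line file even though 208890 were written. That matches `readLine()` dropping the line terminator of each of the 10000 lines. A small sketch that reproduces the effect on an in-memory string (the class and method names are made up for illustration):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the original test code.
final class ReadLineDemo {

    // Concatenates lines the way Method 1 does; readLine() removes "\n", "\r" and "\r\n".
    static String joinLines(String input) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "[Namal0Fernando] \n[Namal1Fernando] \n";
        // The joined text is shorter by one character per line.
        System.out.println(input.length() + " -> " + joinLines(input).length());
    }
}
```

If the newlines matter, they have to be appended back after each line, or one of the other methods should be used.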

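Method 2: (ReadWithBufferedReaderToCharArray)

BufferedReader  br          = new BufferedReader(new FileReader(file));
StringBuilder   sb          = new StringBuilder();
char[]          chars       = new char[8192];

// Append only the len chars that were actually read. Appending the whole
// buffer with String.valueOf(chars) pads the text to a multiple of 8192,
// which is why the results above show 204800 and 212992.
for(int len; (len = br.read(chars)) != -1;) {
    sb.append(chars, 0, len);
}
br.close();
String          text        = sb.toString();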
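
The lengths reported for Method 2 (204800 and 212992) are exact multiples of the 8192-char buffer (25 × 8192 and 26 × 8192), which is what happens when the whole buffer is appended on every read instead of only the chars that `read()` returned. A small sketch comparing the two on an in-memory reader, with illustrative class and method names:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the original test code.
final class CharBufferDemo {

    // Appends only the chars returned by each read().
    static String readAll(Reader reader, int bufferSize) {
        StringBuilder sb = new StringBuilder();
        char[] chars = new char[bufferSize];
        try {
            int len;
            while ((len = reader.read(chars)) != -1) {
                sb.append(chars, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Appends the whole buffer on every read, including stale chars from earlier reads.
    static int lengthWhenAppendingWholeBuffer(Reader reader, int bufferSize) {
        StringBuilder sb = new StringBuilder();
        char[] chars = new char[bufferSize];
        try {
            while (reader.read(chars) != -1) {
                sb.append(String.valueOf(chars));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.length();
    }

    public static void main(String[] args) {
        String input = "abcdefghijklmnopqrst"; // 20 chars
        System.out.println(readAll(new StringReader(input), 8).length());
        System.out.println(lengthWhenAppendingWholeBuffer(new StringReader(input), 8));
    }
}
```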

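Method 3: (ReadWithStreamToByteArray)

InputStream     is          = new FileInputStream(file);
// available() is only an estimate and a single read() may return fewer bytes
// than requested, so this is not guaranteed to load the whole file.
byte[]          b       = new byte[is.available()];
is.read(b);
is.close();
// Decodes with the platform default charset.
String          text        = new String(b);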

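Method 4: (ReadWithStreamToByteArrayChunks)

InputStream     is          = new FileInputStream(file);
byte[]          b       = new byte[1024];
StringBuilder   sb          = new StringBuilder();

// Decode only the bytes actually read. String.valueOf(b) on a byte[] returns
// its identity string (such as "[B@..."), not the content, which is why the
// results above show lengths of 1950 and 2040. Decoding per chunk is safe for
// single-byte text; a multi-byte character split across chunks would break.
int read;
while((read = is.read(b)) != -1){
    sb.append(new String(b, 0, read));
}
is.close();

String          text        = sb.toString();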
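
The lengths of 1950 and 2040 reported for Method 4 are consistent with `String.valueOf` being applied to a `byte[]`. There is no `byte[]` overload, so the call resolves to `String.valueOf(Object)` and yields an identity string rather than the content. At 10 characters per chunk, 195 and 204 chunks of 1024 bytes would give exactly those totals. A short sketch of the difference, with illustrative names and UTF-8 chosen as an assumption:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original test code.
final class ByteChunkDemo {

    // Resolves to String.valueOf(Object): array type plus identity hash, e.g. "[B@1b6d3586".
    static String viaValueOf(byte[] chunk) {
        return String.valueOf(chunk);
    }

    // Decodes only the first `read` bytes of the chunk as UTF-8.
    static String viaDecode(byte[] chunk, int read) {
        return new String(chunk, 0, read, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] chunk = "[Namal0Fernando] ".getBytes(StandardCharsets.UTF_8);
        System.out.println(viaValueOf(chunk));
        System.out.println(viaDecode(chunk, chunk.length));
    }
}
```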

Method 5: (ReadFromApacheFileUtils)

String text  = new String(FileUtils.readFileToByteArray(new File(filePath)));

Method 6: (ReadFromApacheIOUtils)

String text = new String(IOUtils.toByteArray(new FileInputStream(filePath)));

References:

Reading a large string from a text file

How to read a ByteArray from a file in Java?

You could also test this approach:

String text = new String(Files.readAllBytes(Paths.get(path)));
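
A self-contained version of this suggestion might look like the sketch below. The explicit UTF-8 charset, the class name, and the temp-file round trip are assumptions added for illustration; the one-liner above decodes with the platform default charset instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original answer.
final class ReadAllBytesDemo {

    // Loads the whole file into memory and decodes it as UTF-8.
    static String readUtf8(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the text to a temp file, reads it back, then deletes the file.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("singleStr", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return readUtf8(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("[Namal0Fernando] ").length());
    }
}
```

Since the whole file is held as a byte array and then as a String, this suits files that fit comfortably in memory.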

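There is also FileChannel with a direct buffer:

    // path is a java.nio.file.Path here
    FileChannel fc = FileChannel.open(path);
    ByteBuffer buf = ByteBuffer.allocateDirect((int)fc.size());
    // a single read() is not guaranteed to fill the buffer
    fc.read(buf);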
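
The snippet stops once the bytes are in the buffer. One way to complete it into a String is sketched below. The read loop, the UTF-8 decoding, and the class and method names are assumptions for illustration, and the `(int)` cast limits this to files under 2 GB.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the original answer.
final class FileChannelDemo {

    // Reads the whole file into a direct buffer, then decodes it as UTF-8.
    static String readUtf8(Path path) {
        try (FileChannel fc = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocateDirect((int) fc.size());
            // Keep reading until the buffer is full or end of file is reached.
            while (buf.hasRemaining() && fc.read(buf) != -1) {
                // no body needed: read() advances the buffer position
            }
            buf.flip(); // switch the buffer from writing to reading
            return StandardCharsets.UTF_8.decode(buf).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the text to a temp file, reads it back, then deletes the file.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("singleStr", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return readUtf8(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("[Namal0Fernando] ").length());
    }
}
```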
