large-files

How to avoid OutOfMemory exception while reading large files in Java

徘徊边缘 · Submitted on 2019-12-01 12:58:51
I am working on an application that reads large amounts of data from a file. Basically, I have a huge file (around 1.5 to 2 GB) containing different objects (roughly 5 to 10 million of them per file). I need to read all of them and put them into different maps in the app. The problem is that the app runs out of memory at some point while reading the objects. Only when I set it to use -Xmx4096m can it handle the file. But if the file gets any larger, it won't be able to do that anymore. Here's the code snippet: String sampleFileName = "sample.file"; FileInputStream fileInputStream = null;
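The usual remedy is to stream the file instead of buffering it, keeping only what the maps actually need on the heap. Below is a minimal sketch of that pattern; the file name and the one-record-per-line format are assumptions for illustration, not the asker's actual format. If the maps themselves must hold millions of objects, the heap requirement is then driven by how compact those map entries are, not by the read path.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class StreamingReader {
    public static void main(String[] args) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        // Stream the file line by line: only the current line plus the maps
        // live on the heap, never the whole multi-gigabyte file.
        try (BufferedReader reader = Files.newBufferedReader(
                Paths.get("sample.file"), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Hypothetical record format: "key,value" per line.
                int comma = line.indexOf(',');
                if (comma > 0) {
                    counts.merge(line.substring(0, comma), 1L, Long::sum);
                }
            }
        }
        System.out.println("distinct keys: " + counts.size());
    }
}
```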

Allow Multiple Files Upload on Google Apps Script

被刻印的时光 ゝ · Submitted on 2019-12-01 12:51:59
Question: How do I change this script to allow multiple files to be uploaded, or even files bigger than 5 MB? Current script: <!-- Written by Amit Agarwal amit@labnol.org --> <form class="main" id="form" novalidate="novalidate" style="max-width: 480px;margin: 40px auto;"> <div id="forminner"> <div class="row"> <div class="col s12"> <h5 class="center-align teal-text">Submit My Article</h5> <p class="disclaimer">This <a href="http://www.labnol.org/internet/file-upload-google-forms/29170/">File Upload Form</a> (<a href="https://youtu.be/C_YBBupebvE">tutorial</a>) Powered by <a href="https://ctrlq
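On the client side, the two pieces usually involved are an input that accepts several files and a loop that sends each file to the server in its own call, so no single request carries the whole payload. The sketch below is not Amit Agarwal's form, and the server-side uploadFile() function it calls is an assumed name.

```html
<!-- Hypothetical fragment for an Apps Script HTML-service page -->
<input type="file" id="files" multiple>
<button onclick="uploadAll()">Upload</button>

<script>
function uploadAll() {
  var files = document.getElementById('files').files;
  // Send each file in a separate google.script.run call so that no single
  // request exceeds the per-call payload limit.
  Array.prototype.forEach.call(files, function (file) {
    var reader = new FileReader();
    reader.onload = function (e) {
      // uploadFile(dataUrl, name) is an assumed server-side function that
      // would decode the data URL and save the blob (e.g. to Drive).
      google.script.run.uploadFile(e.target.result, file.name);
    };
    reader.readAsDataURL(file);
  });
}
</script>
```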

Large file support not working in C programming

风流意气都作罢 · Submitted on 2019-12-01 11:19:44
I'm trying to compile a shared object (that is eventually used in Python with ctypes). The command line used to build the object is: gcc -Wall -O3 -shared -Wl,-soname,borg_stream -lm -m128bit-long-double -fPIC \ -D_FILE_OFFSET_BITS=64 -o borg_stream.so data_stream.c data_types.c \ file_operations.c float_half.c channels.c statistics.c index_stream.c helpers.c The library builds properly on a 32-bit OS and does what it needs to for small files. However, it fails its unit tests for files larger than 4 GB. In addition, fseek/ftell sets errno to EOVERFLOW. However, if I printf
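As a sanity check (a sketch, not the asker's code): the macro must be in effect before any system header is pulled in, in every translation unit, and the off_t-based fseeko/ftello should then seek past 4 GB without EOVERFLOW. The file name below is a placeholder.

```c
/* Must come before every #include, or be passed as -D_FILE_OFFSET_BITS=64
   on the compiler command line for every translation unit. */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <sys/types.h>

int main(void)
{
    FILE *fp = fopen("huge.bin", "rb");   /* placeholder file */
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    /* fseeko/ftello take off_t, which is 64-bit once the macro is active;
       EOVERFLOW here means large file support did not take effect. */
    if (fseeko(fp, (off_t)5 * 1024 * 1024 * 1024, SEEK_SET) != 0) {
        perror("fseeko");
    } else {
        printf("now at offset %lld\n", (long long)ftello(fp));
    }
    fclose(fp);
    return 0;
}
```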

XML: Process large data

橙三吉。 · Submitted on 2019-12-01 10:39:22
What XML parser do you recommend for the following purpose? The XML file (formatted, containing whitespace) is around 800 MB. It mostly contains three types of tags (let's call them n, w and r). They have an attribute called id which I'd have to search for, as fast as possible. Removing attributes I don't need could save around 30%, maybe a bit more. First part, for optimizing the second part: is there any good tool (command line, Linux and Windows if possible) to easily remove unused attributes in certain tags? I know that XSLT could be used. Or are there any easy alternatives? Also, I could
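For the lookup itself, a streaming parser keeps memory flat regardless of the 800 MB file size. A minimal sketch with Python's standard-library iterparse (the excerpt doesn't commit to a parser; the ids and file name below are placeholders):

```python
import xml.etree.ElementTree as ET

WANTED_IDS = {"n42", "w1001"}          # hypothetical ids to look up
TAGS = {"n", "w", "r"}

# iterparse streams the document element by element instead of building
# the whole 800 MB tree in memory.
for event, elem in ET.iterparse("big.xml", events=("end",)):
    if elem.tag in TAGS and elem.get("id") in WANTED_IDS:
        print(elem.tag, elem.get("id"), elem.text)
    elem.clear()                        # free the subtree we just inspected
```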

Why can't my Perl program create files over 4 GB on Windows?

雨燕双飞 · Submitted on 2019-12-01 10:31:27
Question: Why is the size of files capped at 4 GB when outputting to a file using print? I would expect that with streaming output it should be possible to generate files of arbitrary size. Update: ijw and Chas. Owens were correct. I thought the F: drive was NTFS-formatted, but in fact it used the FAT32 filesystem. I tried it on another drive and I could generate a 20 GB text file. There are no limits in this case. Apologies to all. Details: while researching an answer to a question here on Stack

Parsing Large XML file with Python lxml and Iterparse

倖福魔咒の · Submitted on 2019-12-01 09:54:44
I'm attempting to write a parser using lxml and the iterparse method to step through a very large XML file containing many items. My file is of the format: <item> <title>Item 1</title> <desc>Description 1</desc> <url> <item>http://www.url1.com</item> </url> </item> <item> <title>Item 2</title> <desc>Description 2</desc> <url> <item>http://www.url2.com</item> </url> </item> and so far my solution is: from lxml import etree context = etree.iterparse( MYFILE, tag='item' ) for event, elem in context : print elem.xpath( 'description/text( )' ) elem.clear( ) while elem.getprevious( ) is not None :
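For reference, a complete, runnable sketch of the pattern the question is reaching for, assuming the tag names from the sample above (note the sample uses <desc>, not <description>, and that the nested <item> inside <url> also matches a tag='item' filter):

```python
from lxml import etree

MYFILE = "items.xml"   # placeholder path (assumes the items sit under one root)

context = etree.iterparse(MYFILE, tag="item")
for event, elem in context:
    # Skip the inner <item> elements nested under <url>; only handle the
    # top-level items.
    if elem.getparent() is not None and elem.getparent().tag == "url":
        continue
    print(elem.findtext("title"), elem.xpath("desc/text()"))
    # Free this element and any already-processed siblings that the root
    # element still keeps alive, so memory stays flat on huge files.
    elem.clear()
    while elem.getprevious() is not None:
        del elem.getparent()[0]
del context
```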

Handling very large images in Qt

99封情书 · Submitted on 2019-12-01 09:33:59
I can't get Qt to work on images beyond 10,000 x 10,000. I'm dealing with huge satellite images that are around 2 GB each. I considered using memory mapping, but the image still occupies space in memory. QFile file("c://qt//a.ras"); file.open(QIODevice::ReadOnly); qint64 size = file.size(); uchar *img=file.map(0,size); QImage I(img,w,h,QImage::Format_ARGB32); Can anyone tell me a more efficient way to deal with large images in Qt? Use QGraphicsView and a set of image tiles; the view handles all the scrolling and world coordinates for you. Then you just have to either pre-chop the images into tiles in
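A rough sketch of that tiling idea, assuming the image has already been pre-chopped into files named tiles/tile_<row>_<col>.png (those names and the grid size are placeholders). It only shows the scene/view wiring; loading every tile up front still costs memory, so for truly huge images the tiles would be loaded lazily, for example from a custom QGraphicsItem's paint().

```cpp
#include <QApplication>
#include <QGraphicsScene>
#include <QGraphicsView>
#include <QPixmap>
#include <QString>

int main(int argc, char *argv[])
{
    QApplication app(argc, argv);

    const int tileSize = 1024;          // hypothetical tile edge length
    const int rows = 8, cols = 8;       // hypothetical tile grid

    QGraphicsScene scene;
    // Each tile becomes its own item; the view only paints the items that
    // intersect the visible viewport and handles scrolling for free.
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            QPixmap tile(QString("tiles/tile_%1_%2.png").arg(r).arg(c));
            if (tile.isNull())
                continue;
            scene.addPixmap(tile)->setPos(c * tileSize, r * tileSize);
        }
    }

    QGraphicsView view(&scene);
    view.show();
    return app.exec();
}
```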

How can I portably turn on large file support?

落花浮王杯 · Submitted on 2019-12-01 07:50:03
Question: I am currently writing a C program that reads and writes files that might be over 2 GiB in size. On Linux, feature_test_macros(7) specifies: _LARGEFILE64_SOURCE Expose definitions for the alternative API specified by the LFS (Large File Summit) as a "transitional extension" to the Single UNIX Specification. (See ⟨http://opengroup.org/platform/lfs.html⟩) The alternative API consists of a set of new objects (i.e., functions and types) whose names are suffixed with "64" (e.g., off64_t versus
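The usual portable route is not the transitional _LARGEFILE64_SOURCE API but the transparent _FILE_OFFSET_BITS=64 interface, which widens off_t and the plain fopen/fseeko/ftello names. A small sketch of a build-time check under that assumption, using a C11 _Static_assert:

```c
/* Request the transparent 64-bit file API; the macro must be defined
   before any system header (or via -D_FILE_OFFSET_BITS=64). */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <sys/types.h>

/* Fail the build rather than at runtime if large file support is not
   actually in effect on this platform. */
_Static_assert(sizeof(off_t) == 8, "off_t is not 64-bit: LFS is not enabled");

int main(void)
{
    /* With this in place, the ordinary names (fopen, fseeko, ftello, off_t)
       handle files past 2 GiB, and the "64"-suffixed transitional API is
       not needed in the source. */
    printf("sizeof(off_t) = %zu\n", sizeof(off_t));
    return 0;
}
```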

How do I quickly search through a .csv file in Python

你。 · Submitted on 2019-12-01 07:39:57
I'm reading a 6-million-entry .csv file with Python, and I want to be able to search through this file for a particular entry. Are there any tricks to search the entire file? Should you read the whole thing into a dictionary, or should you perform a search every time? I tried loading it into a dictionary, but that took ages, so I'm currently searching through the whole file every time, which seems wasteful. Could I possibly exploit the fact that the list is alphabetically ordered? (e.g. if the search word starts with "b", I only search from the line that includes the first word beginning with "b" to the
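One common route (an illustration, not taken from the excerpt's answers) is to pay the loading cost once: import the CSV into an on-disk SQLite database with an index on the lookup column, so every later search is an index seek instead of a full scan. The column and file names below are placeholders. Since the file is already alphabetically ordered, a bisect over a list of keys read once would also work, at the cost of keeping those keys in memory.

```python
import csv
import sqlite3

# One-off import: build an indexed table from the 6-million-row CSV.
conn = sqlite3.connect("entries.db")
conn.execute("CREATE TABLE IF NOT EXISTS entries (word TEXT, rest TEXT)")
with open("entries.csv", newline="") as f:
    rows = ((row[0], ",".join(row[1:])) for row in csv.reader(f))
    conn.executemany("INSERT INTO entries VALUES (?, ?)", rows)
conn.execute("CREATE INDEX IF NOT EXISTS idx_word ON entries (word)")
conn.commit()

# Every later lookup hits the index instead of rescanning the whole file.
hit = conn.execute("SELECT rest FROM entries WHERE word = ?", ("banana",)).fetchone()
print(hit)
```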