large-files

Nuking huge file in svn repository

Submitted by 心不动则不痛 on 2020-01-11 21:02:15
Question: As the local Subversion czar, I explain to everyone to keep only source code and non-huge text files in the repository, not huge binary data files. Smaller binary files that are part of tests, maybe. Unfortunately, I work with humans! Someone is likely to someday accidentally commit an 800 MB binary hulk. This slows down repository operations. Last time I checked, you can't delete a file from the repository; you can only make it not part of the latest revision. The repository keeps the monster for all
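The usual remedy is offline history rewriting: dump the repository, exclude the offending path with svndumpfilter, and load the result into a fresh repository. Below is a minimal Python sketch of that pipeline; the repository locations and the excluded path are hypothetical placeholders.

    import subprocess

    OLD_REPO = "/srv/svn/old-repo"        # hypothetical locations
    NEW_REPO = "/srv/svn/new-repo"
    HUGE_PATH = "trunk/data/huge.bin"     # the accidentally committed hulk

    # svnadmin dump OLD | svndumpfilter exclude PATH | svnadmin load NEW
    dump = subprocess.Popen(["svnadmin", "dump", OLD_REPO],
                            stdout=subprocess.PIPE)
    filt = subprocess.Popen(["svndumpfilter", "exclude", HUGE_PATH],
                            stdin=dump.stdout, stdout=subprocess.PIPE)
    dump.stdout.close()

    subprocess.check_call(["svnadmin", "create", NEW_REPO])
    load = subprocess.Popen(["svnadmin", "load", NEW_REPO], stdin=filt.stdout)
    filt.stdout.close()
    load.wait()

Since this rewrites history, everyone must check out fresh working copies from the new repository afterwards.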

Count subsequences in hundreds of GB of data

Submitted by 你说的曾经没有我的故事 on 2020-01-10 20:11:28
Question: I'm trying to process a very large file and tally the frequency of all sequences of a certain length in it. To illustrate, consider a small input file containing the sequence abcdefabcgbacbdebdbbcaebfebfebfeb. Below, the code reads the whole file in, takes the first substring of length n (I set this to 5 below, although I want to be able to change it), and counts its frequency: abcde => 1. Next, it moves one character to the right and does the same: bcdef => 1. It
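A single left-to-right pass with a hash map counts every window without re-reading anything; the excerpt does not show the question's language, so here is an illustrative Python sketch (for data that does not fit in memory, the same loop can be fed from overlapping chunks):

    from collections import Counter

    def count_ngrams(data, n=5):
        # One pass: each position i contributes the window data[i:i+n].
        counts = Counter()
        for i in range(len(data) - n + 1):
            counts[data[i:i + n]] += 1
        return counts

    counts = count_ngrams("abcdefabcgbacbdebdbbcaebfebfebfeb", n=5)
    print(counts.most_common(3))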

Grep multiple strings on large files

Submitted by 你说的曾经没有我的故事 on 2020-01-07 05:41:20
Question: I have a large number of large log files (each around 200 MB, and 200 GB of data in total). Every 10 minutes, the server writes about 10K parameters (with a timestamp) to the log file. Out of each 10K parameters, I want to extract 100 of them to a new file. First I used grep with one parameter; then LC_ALL=C made it a little faster; then fgrep was slightly faster still. Then I used parallel (parallel -j 2 --pipe --block 20M), and finally, for every 200 MB, I was able to
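Rather than one grep pass per parameter, all 100 fixed strings can be matched in a single pass over the data (grep -F -f patterns.txt does exactly this). A rough Python equivalent, with hypothetical file names:

    # params.txt: one wanted parameter name per line (hypothetical files)
    with open("params.txt") as f:
        params = tuple(line.strip() for line in f if line.strip())

    with open("server.log") as src, open("extracted.log", "w") as dst:
        for line in src:
            # Keep the line if it mentions any wanted parameter.
            if any(p in line for p in params):
                dst.write(line)

At 200 GB, a proper multi-pattern matcher (Aho-Corasick, or simply LC_ALL=C grep -F -f) will be substantially faster than the any() loop.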

Downloading large files using PHP

Submitted by 戏子无情 on 2020-01-06 23:46:08
Question: I am using the following code to download files from a remote server with PHP:

    // some PHP page-parsing code
    $url  = 'http://www.domain.com/'.$fn;
    $path = 'myfolder/'.$fn;
    $fp = fopen($path, 'w');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);
    $data = curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    // some more code

But instead of downloading and saving the file in the directory, it shows the file contents (junk characters, since the file is a zip) directly in the browser. I guess it
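A frequent cause of this exact symptom is fopen() returning false (bad path or permissions), so the CURLOPT_FILE option is never set and curl_exec() writes the body to standard output, i.e. the browser; checking the return values of both fopen() and curl_setopt() would confirm it. For comparison, here is a streamed-download sketch in Python using the third-party requests library, with the same hypothetical URL and path:

    import requests  # third-party: pip install requests

    fn = "archive.zip"                    # hypothetical file name
    url = "http://www.domain.com/" + fn
    path = "myfolder/" + fn

    # Stream to disk in chunks so the file is never held fully in
    # memory or echoed to the client.
    with requests.get(url, stream=True, timeout=60) as r:
        r.raise_for_status()
        with open(path, "wb") as out:
            for chunk in r.iter_content(chunk_size=1 << 20):
                out.write(chunk)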

How can I seekg() files over 4GB on Windows?

Submitted by *爱你&永不变心* on 2020-01-04 13:30:53
Question: I am on Windows 7, 64-bit, with NTFS. I am building a DLL that must be 32-bit. I have a very simple routine that I'd like to implement in C++. I'm reading a large file using:

    unsigned long p;
    ifstream source(file);
    streampos pp(p);
    source.seekg(pp);

For files over 4 GB I tried using unsigned long long, but it's not working. What am I doing wrong? I'm using GNU GCC; would it be of any use to try MSVC Express 2008/2010? Update: Something seems to be wrong with my GCC. Right now I'm testing
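Older 32-bit MinGW toolchains are known to ship a 32-bit std::streamoff, which silently truncates positions past 4 GB. A quick way to separate the toolchain from the OS and file system is to seek past 4 GB from a language with native 64-bit offsets; a Python sketch, with a hypothetical file name:

    OFFSET = 5 * 1024 ** 3  # 5 GB, beyond any 32-bit offset

    with open("big.bin", "rb") as f:   # hypothetical >4 GB file
        f.seek(OFFSET)
        print(f.tell())    # 5368709120 if the seek really happened
        print(f.read(16))  # bytes just past the 4 GB barrier

If this works, Windows and NTFS are fine and the fix belongs on the C++ side, e.g. the MSVC CRT's _fseeki64/_ftelli64 or a GCC build whose streamoff is 64-bit.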

programming files of size larger than 2 GB using C#.Net

Submitted by 谁说我不能喝 on 2020-01-04 09:14:12
Question: How can I write large content to disk dynamically using C#? Any advice or reference is appreciated. I am trying to create a file (custom format and extension) and write to it. The user uploads a file, its contents are converted to a byte stream, and that stream is written to the file (filename.hd); the indexing of the uploaded files is done in another file (filename.hi). This works fine until "filename.hd" reaches 2 GB; once it exceeds 2 GB, it will not let me add more content. This is
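A hard stop at 2 GB almost always points at a signed 32-bit offset or length somewhere — an int file position, one giant byte[], or a 32-bit field in the index — since Int32.MaxValue is about 2.1 GB; FileStream itself handles far larger files. The sketch below (Python, as a language-neutral illustration; file names are hypothetical) appends each upload in chunks and stores 64-bit offsets in the index file:

    import struct

    CHUNK = 1 << 20  # 1 MB per write; never buffer the whole upload

    def append_upload(src_path, data_path="store.hd", index_path="store.hi"):
        # Append src_path's bytes to the data file and record
        # (offset, length) as two signed 64-bit integers in the index.
        with open(data_path, "ab") as data:
            offset = data.tell()
            length = 0
            with open(src_path, "rb") as src:
                while True:
                    chunk = src.read(CHUNK)
                    if not chunk:
                        break
                    data.write(chunk)
                    length += len(chunk)
        with open(index_path, "ab") as index:
            index.write(struct.pack("<qq", offset, length))

In C#, the equivalents are FileStream.Position and Seek (both long), with long offsets written to the .hi index instead of int.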

Import csv file with more than 1024 columns into new SQL Server table

Submitted by 偶尔善良 on 2020-01-04 05:42:24
Question: I am trying to upload data from a CSV file into SQL Server; the file is 2 GB in size and has more than 10,000 columns. Please let me know how to load data with more than 1024 columns into SQL Server. I tried the Import/Export wizard, and it threw the error below: Error 0xc002f210: Preparation SQL Task 1: Executing the query "CREATE TABLE [dbo].[Test] ( [ID] varchar(50)..." failed with the following error: "CREATE TABLE failed because column 'B19037Dm38' in table 'Test' exceeds the maximum of 1024
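An ordinary SQL Server table is hard-limited to 1024 columns (wide tables with sparse columns go to 30,000), so a 10,000-column CSV has to be split vertically into several narrower tables sharing a key column. A Python sketch that does the split in one streaming pass, with hypothetical file names and the key assumed to be column 0:

    import csv

    MAX_COLS = 1000  # keep each output under the 1024-column limit

    with open("wide.csv", newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        data_cols = list(range(1, len(header)))
        groups = [data_cols[i:i + MAX_COLS]
                  for i in range(0, len(data_cols), MAX_COLS)]

        outs = [open("part_%d.csv" % g, "w", newline="")
                for g in range(len(groups))]
        writers = [csv.writer(out) for out in outs]
        for w, cols in zip(writers, groups):
            w.writerow([header[0]] + [header[c] for c in cols])

        for row in reader:  # single pass over the 2 GB input
            for w, cols in zip(writers, groups):
                w.writerow([row[0]] + [row[c] for c in cols])

        for out in outs:
            out.close()

Each part_N.csv then loads into its own table, joined on the key column.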

OpenCV will not load a big image (~4GB)

Submitted by 北城以北 on 2020-01-03 08:27:12
Question: I'm working on a program that is to detect colored ground control points in a rather large image. The TIFF image is some 3-4 GB (about 35,000 x 33,000 pixels). I am using Python 2 and OpenCV for the image processing:

    import cv2
    img = 'ortho.tif'
    I = cv2.imread(img, cv2.IMREAD_COLOR)

This part does not (always) produce an error message, while showing the image does:

    cv2.imshow('image', I)

I have also tried showing the image with matplotlib:

    plt.imshow(I[:, :, ::-1])  # Hack to change
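Two limits are involved here: a 35,000 x 33,000 BGR image is roughly 3.5 GB in RAM and can trip 32-bit size assumptions in some OpenCV builds, and cv2.imshow cannot usefully render it even when imread succeeds. If the array does load, downscaling for display and processing in tiles sidesteps both; a sketch:

    import cv2

    I = cv2.imread("ortho.tif", cv2.IMREAD_COLOR)
    if I is None:
        raise IOError("imread failed - image too large for this build?")

    # Show a thumbnail instead of the full-resolution mosaic.
    thumb = cv2.resize(I, None, fx=0.03, fy=0.03,
                       interpolation=cv2.INTER_AREA)
    cv2.imshow("image (3% scale)", thumb)
    cv2.waitKey(0)

    # Run the control-point detection tile by tile.
    TILE = 4096
    for y in range(0, I.shape[0], TILE):
        for x in range(0, I.shape[1], TILE):
            tile = I[y:y + TILE, x:x + TILE]
            # ... detect colored markers in `tile` ...

If imread itself fails, a tile-aware TIFF reader (the third-party tifffile or GDAL) can hand OpenCV one window at a time without materializing the whole image.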

Very large zip file (> 50GB) --> ZipException: invalid CEN header

Submitted by 被刻印的时光 ゝ on 2020-01-03 08:23:13
Question: I'm trying to open a ZIP file in Java. The code below works fine except with some large files, in which case I get the following exception:

    Exception in thread "main" java.util.zip.ZipException: invalid CEN header (bad signature)
        at java.util.zip.ZipFile.open(Native Method)
        at java.util.zip.ZipFile.<init>(ZipFile.java:114)
        at java.util.zip.ZipFile.<init>(ZipFile.java:75)

Is there a known bug? Could it be due to a higher compression level not supported by Java? Note that I cannot use Winzip to
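That exception is the classic signature of a Zip64 archive (required once entry sizes or offsets pass 4 GB) being read by a java.util.zip without Zip64 support, which only arrived in Java 7. The archive itself can be verified with any Zip64-aware reader; Python's standard zipfile module is one, sketched here with a hypothetical file name:

    import zipfile

    path = "huge-archive.zip"  # hypothetical >50 GB archive

    # zipfile reads Zip64 archives; if this works, the file is intact
    # and the Java runtime is the limiting factor.
    with zipfile.ZipFile(path) as zf:
        print("corrupt member:", zf.testzip())  # None means all CRCs pass
        for info in zf.infolist()[:5]:
            print(info.filename, info.file_size)

On the Java side, Java 7+ or Apache Commons Compress's ZipFile (also Zip64-aware) is the usual fix.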