fread

What is the fastest way and fastest format for loading large data sets into R [duplicate]

风流意气都作罢 Submitted on 2020-06-25 07:02:34
Question: This question already has answers here: Quickly reading very large tables as dataframes (11 answers). Closed 4 years ago. I have a large dataset (about 13GB uncompressed) and I need to load it repeatedly. The first load (and save to a different format) can be very slow, but every load after this should be as fast as possible. What is the fastest way and fastest format from which to load a data set? My suspicion is that the optimal choice is something like saveRDS(obj, file = 'bigdata.Rda',

Fast reading and combining several files using data.table (with fread)

好久不见. Submitted on 2020-04-05 07:32:07
Question: I have several different txt files with the same structure. Now I want to read them into R using fread, and then union them into a bigger dataset.

    ## First put all file names into a list
    library(data.table)
    all.files <- list.files(path = "C:/Users",pattern = ".txt")
    ## Read data using fread
    readdata <- function(fn){
      dt_temp <- fread(fn, sep=",")
      keycols <- c("ID", "date")
      setkeyv(dt_temp,keycols) # Notice there's a "v" after setkey with multiple keys
      return(dt_temp)
    }
    # then using
    mylist <-

Usage of fread and fseek

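随声附和 Submitted on 2020-04-04 00:05:47
Source: http://baike.baidu.com/view/656696.htm http://baike.baidu.com/view/656689.htm

fread
Purpose: reads data from a stream.
Prototype: size_t fread(void *buffer, size_t size, size_t count, FILE *stream);
Parameters:
  1. buffer: the address (pointer) of the memory that receives the data
  2. size: the size of a single element, in bytes rather than bits; for example, sizeof(int) when reading an int (commonly 4 bytes, 2 bytes on some older 16-bit platforms)
  3. count: the number of elements to read
  4. stream: the file pointer that supplies the data
Return value: the number of elements successfully read.

Example program:

    #include <stdio.h>
    #include <string.h>  /* for strlen */
    int main(void)
    {
        FILE *stream;
        char msg[] = "this is a test";
        char buf[20];
        if ((stream = fopen("DUMMY.FIL", "w+")) == NULL) {
            fprintf(stderr, "Cannot open output file.\n");
            return 1;
        }
        fwrite(msg, strlen(msg) + 1, 1, stream);
        fseek(stream, 0, SEEK_SET);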
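
The excerpt above is cut off right after the fseek call, so here is a minimal sketch of the write, seek, read-back pattern it appears to be building. It is not the original program: the helper name read_from_offset is hypothetical, and std::tmpfile() is used as an assumed stand-in for DUMMY.FIL so that nothing is left on disk.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: writes `msg` to a temporary file, moves the file
// position to `offset` bytes from the start with fseek, then reads the
// rest back with fread. Returns an empty string on any failure.
std::string read_from_offset(const char* msg, long offset) {
    std::FILE* stream = std::tmpfile();  // assumption: stand-in for DUMMY.FIL
    if (stream == nullptr) return "";

    std::string result;
    const std::size_t len = std::strlen(msg);
    char buf[64];  // at most 64 bytes are read back in this sketch

    if (std::fwrite(msg, 1, len, stream) == len &&
        std::fseek(stream, offset, SEEK_SET) == 0) {
        // size = 1 and count = sizeof buf, so the return value is the
        // number of bytes actually read (it may be less than count).
        const std::size_t got = std::fread(buf, 1, sizeof buf, stream);
        result.assign(buf, got);
    }
    std::fclose(stream);  // a tmpfile is removed automatically when closed
    return result;
}
```

Calling read_from_offset("this is a test", 5) skips the first five bytes and returns the remainder. Passing size = 1 makes the return value of fread a byte count, which is convenient when the amount of data in the file is not known in advance.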

[FBI WARNING] Read optimization (fast input)

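安稳与你 Submitted on 2020-03-15 19:40:19
I think this optimization is quite important. After all, in the "cheese" problem from the NOIP 2017 senior group, the code I wrote seemed to time out on reading the input alone. So what if it passes the samples? In the end it is still TLE.

For some lovely (sadistic) problems with very large input, scanf slows the program down considerably, to say nothing of cin, so we need something fancier: read optimization.

The principle of read optimization is to read the input one character at a time and then assemble the digits into a number. ---Peper (found via Baidu)

Here is a fairly simple read-optimization routine. Of course, for me it is still fairly simple (hard).

    #include <cstdio>  // for getchar
    int read()
    {
        int x=0,f=1;
        char ch=getchar();  // read the first character before testing it
        while(ch<'0'||ch>'9') {if(ch=='-')f=-1;ch=getchar();}
        while(ch>='0'&&ch<='9'){x=x*10+ch-'0';ch=getchar();}
        return f*x;
    }

This is the most basic read optimization: it reads characters one at a time with getchar, accumulates the answer in x, and uses f to track the sign.

P.S. To call it in the main program, just write int n=read();

=========================== Heads up: advanced material ahead ===============================

Here is another template, from an expert:

    template<class T>void read(T &x) { x=0;int f=0;char ch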
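
The expert's template above is truncated, so the following is not a reconstruction of it. It is a separate minimal sketch of the same character-by-character idea in template form. The name read_int, the FILE* parameter, the bool return value and the use of ungetc are all assumptions made for illustration, and overflow is not handled.

```cpp
#include <cstdio>

// Hypothetical sketch: reads one decimal integer (optionally negative)
// from `in`, one character at a time, and stores it in `x`.
// Returns false if the input ends before any digit is found.
template <class T>
bool read_int(std::FILE* in, T& x) {
    int ch = std::getc(in);  // int rather than char, so EOF is detectable
    bool negative = false;

    // Skip non-digits; a '-' directly before the digits marks a negative.
    while (ch != EOF && (ch < '0' || ch > '9')) {
        negative = (ch == '-');
        ch = std::getc(in);
    }
    if (ch == EOF) return false;

    // Assemble the number digit by digit.
    x = 0;
    while (ch >= '0' && ch <= '9') {
        x = x * 10 + (ch - '0');
        ch = std::getc(in);
    }
    if (ch != EOF) std::ungetc(ch, in);  // leave the delimiter unread
    if (negative) x = -x;
    return true;
}
```

With standard input this would be called as read_int(stdin, n). Holding the character in an int and testing for EOF keeps the skip loop from spinning forever when the input runs out, which a plain char-based loop cannot guarantee.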