ff

How to deal with a 50 GB CSV file in R?

送分小仙女 posted on 2020-08-21 06:44:33
Question: I am relatively new to large-data processing in R and am looking for advice on how to deal with a 50 GB CSV file. The current problem is the following. The table looks like this:

ID,Address,City,States,... (50 more fields describing characteristics of a house)
1,1,1st street,Chicago,IL,...
# the first 1 is caused by write.csv, which added a row-index column to the file

I would like to find all rows that belong to San Francisco, CA. It is supposed to be an easy problem, but the CSV is too large. I know

Details make the difference: the things Chrome does well

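送分小仙女 posted on 2020-04-07 06:55:13
1) Built-in Chinese word segmentation. Open any article on OSC and double-click a Chinese character to see what happens. That's right: Chrome can actually recognize Chinese words! IE8 always selects a single character; Firefox does a little better and selects a fragment. Compared with Chrome, though, both fall well short.

2) A concise right-click menu. Is there any browser whose context menu is cleaner than Chrome's? The one drawback is that if you often view the page source, you may click "Translate this page" by accident.

3) A well-designed "Options" page. In terms of user experience, the options settings in IE and Firefox are far behind Chrome's. First, when you change options in IE and FF, the dialog takes over the entire browser process! Once it does, you cannot look at any other web page; think about what problem that causes. Taking over the process means the browser's designers have assumed that you already know the options well and need no web resources or other help. What if you know nothing about the layout and function of the options? Then open Chrome, google it, and keep switching between FF and Chrome while following the web page. There is no other way!

4) A well-designed "Options" feature: search. The search function is really useful! If you need to change the browser's proxy, cookies, and so on, in FF and IE you have to switch back and forth between several tabs. Of course, if you have an exceptional memory and remember exactly where every item is, forget I said anything. In Chrome you do not need to do this at all: click the wrench, open Options, and a large search box is there to guide you! More importantly, the search keywords are quite powerful! Because Chrome's settings interface is itself HTML-based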

MFC - Delete a specified folder

百般思念 posted on 2020-04-02 21:00:15
// Delete the specified folder
void DeleteDirectory(CString strDir)
{
    if (strDir.IsEmpty())
    {
        RemoveDirectory(strDir);
        return;
    }

    // First delete the files and subfolders
    CFileFind ff;
    BOOL bFound = ff.FindFile(strDir + _T("\\*"), 0);
    while (bFound)
    {
        bFound = ff.FindNextFile();
        if (ff.GetFileName() == _T(".") || ff.GetFileName() == _T(".."))
            continue;

        // Clear read-only and similar attributes on the file or folder
        SetFileAttributes(ff.GetFilePath(), FILE_ATTRIBUTE_NORMAL);
        if (ff.IsDirectory())
        {
            // Recursively delete the subfolder
            DeleteDirectory(ff.GetFilePath());
            RemoveDirectory(ff.GetFilePath());
        }
        else
        {
            DeleteFile(ff.GetFilePath())

[AHOI2014/JSOI2014] 支线剧情 (Side Storylines)

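旧时模样 posted on 2020-02-17 00:06:57
Problem

This is a minimum-cost feasible flow with a source and a sink and with lower and upper bounds on the edges.

First, note that the requirement is for every edge to be traversed at least once, so we simply give every edge the bounds \([1,\infty]\).

In addition, the story can end at any node, so every node gets an edge to the sink \(t\) with bounds \([0,\infty]\).

We then build the graph in the usual way for a feasible flow with a source and a sink.

Define \(d_i\) as the sum of the lower bounds of all edges entering node \(i\) minus the sum of the lower bounds of all edges leaving it.

For an edge \((u,v,[b,c],w)\) of the graph, we add an edge from \(u\) to \(v\) with capacity \(c-b\) and cost \(w\).

We also add an edge from the sink to the source with capacity \(\infty\) and cost \(0\).

For a node with \(d_i>0\), we add an edge from the super source \(S\) to that node with capacity \(d_i\) and cost \(0\).

For a node with \(d_i<0\), we add an edge from that node to the super sink with capacity \(-d_i\) and cost \(0\).

We then run a minimum-cost maximum flow on this graph.

Finally, do not forget to add to the answer the sum, over all edges of the original graph, of the lower bound times the cost.

Code

#include<queue>
#include<cstdio>
#include<cstring>
#include<iostream>
#include<algorithm>
#define re register
#define LL long long
#define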
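The construction described in this entry can be sketched as follows. This is a minimal Python illustration, not the post's own C++ code (which is cut off above). The class `MinCostFlow`, the function `min_cost_feasible_flow`, the SPFA-based augmenting-path search, and the small example graph are all assumptions made for illustration.

```python
from collections import deque

INF = 10**18  # stands in for "infinite" capacity (illustrative choice)


class MinCostFlow:
    """Successive shortest paths with SPFA; hypothetical helper class."""

    def __init__(self, n):
        self.n = n
        # Each edge is stored as [to, residual capacity, cost, index of reverse edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def run(self, s, t):
        flow = cost = 0
        while True:
            dist = [INF] * self.n
            prev = [None] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:  # SPFA: cheapest path in the residual graph
                u = queue.popleft()
                in_queue[u] = False
                for i, (v, cap, w, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + w < dist[v]:
                        dist[v] = dist[u] + w
                        prev[v] = (u, i)
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if dist[t] == INF:
                return flow, cost
            push, v = INF, t
            while v != s:  # bottleneck capacity along the path
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:  # update residual capacities
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]


def min_cost_feasible_flow(n, edges, s, t):
    """edges: (u, v, lower, upper, cost). Returns the minimum cost, or None if infeasible."""
    S, T = n, n + 1  # super source and super sink
    net = MinCostFlow(n + 2)
    d = [0] * n      # d[i] = lower bounds flowing in minus lower bounds flowing out
    base = 0         # cost of the mandatory lower-bound flow
    for u, v, lo, hi, w in edges:
        net.add_edge(u, v, hi - lo, w)
        d[v] += lo
        d[u] -= lo
        base += lo * w
    net.add_edge(t, s, INF, 0)  # sink -> source closes the circulation
    need = 0
    for i in range(n):
        if d[i] > 0:
            net.add_edge(S, i, d[i], 0)
            need += d[i]
        elif d[i] < 0:
            net.add_edge(i, T, -d[i], 0)
    flow, cost = net.run(S, T)
    return base + cost if flow == need else None


# Illustrative story graph: node 0 is the start, node 4 is the sink t.
edges = [(0, 1, 1, INF, 2), (1, 2, 1, INF, 1), (1, 3, 1, INF, 4)]
edges += [(i, 4, 0, INF, 0) for i in range(4)]  # the story may end at any node
answer = min_cost_feasible_flow(5, edges, 0, 4)
print(answer)  # 9: edge 0->1 has to be walked twice
```

In the example, both branches out of node 1 must be played, so the edge 0->1 is traversed twice: the lower-bound cost is 2 + 1 + 4 = 7 and the extra flow adds 2.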

Kriging simulation using ff package

空扰寡人 posted on 2020-01-06 04:57:26
Question: I'm trying to understand how I can use the ff package to overcome the error "Error: cannot allocate vector of size 1.1 Mb" while running a kriging / Gaussian simulation. I don't know how to change the input data. Does anyone have an idea that could help me do that? I'm using the gstat package to perform the simulation as follows:

library(sp)
data(meuse)
coordinates(meuse) = ~x+y
data(meuse.grid)
gridded(meuse.grid) = ~x+y
m <- vgm(.59, "Sph", 874, .04)
# ordinary kriging:
x <- krige(log(zinc)~1, meuse, meuse

How to column bind two ffdf

一个人想着一个人 posted on 2019-12-29 09:16:05
Question: Suppose there are two ffdf objects:

library(ff)
ff1 <- as.ffdf(data.frame(matrix(rnorm(10*10),ncol=10)))
ff2 <- ff1
colnames(ff2) <- 1:10

How can I column-bind these without loading them into memory? cbind doesn't work. The same question was asked at http://stackoverflow.com/questions/18355686/columnbind-ff-data-frames-in-r, but it has no MWE and its author abandoned it, so I reposted.

Answer 1: You can use the following construct, cbind.ffdf2, making sure the column names of the two input ffdf's are not

Row limit in read.table.ffdf?

人盡茶涼 posted on 2019-12-25 06:29:58
Question: I'm trying to import a very large dataset (101 GB) from a text file using read.table.ffdf in package ff. The dataset has >285 million records, but I am only able to read in the first 169,457,332 rows. The dataset is tab-separated with 44 variable-width columns. I've searched Stack Overflow and other message boards and have tried many fixes, but I am still consistently able to import only the same number of records. Here's my code:

relFeb2016.test <- read.table.ffdf(x = NULL, file="D:/eBird/ebd
