I am relatively new to "large data processing" in R and am looking for some advice on how to deal with a 50 GB CSV file. The current problem is the following:
Table
This is too long for a comment.
R, in its basic configuration, loads data into memory. Memory is cheap, but 50 GB is still not a typical configuration (and you would need more than that to load the data and work with it). If you are really good with R, you might be able to figure out another mechanism. If you have access to a cluster, you could use some parallel version of R or Spark.
You could also load the data into a database, which is very well suited to the task at hand. R connects easily to almost any database, and you might find a database very useful for whatever else you want to do with the data.
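To make the database route concrete, here is a minimal sketch of streaming a CSV into SQLite in batches so that the whole file never sits in memory. It is written in Python purely for illustration (the same idea applies from R through a database connection), and the function name `load_csv_into_sqlite`, the table name `records`, and the sample columns `id` and `grp` are all hypothetical, since the actual table layout is not shown.

```python
import csv
import sqlite3
import tempfile
from itertools import islice
from pathlib import Path


def load_csv_into_sqlite(csv_path, db_path, table="records", batch_size=50_000):
    """Stream a CSV into a SQLite table in batches; return the number of rows inserted.

    Assumes the first line is a header and that header names are plain identifiers.
    Columns are created without a declared type, so values are stored as text.
    """
    conn = sqlite3.connect(str(db_path))
    total = 0
    try:
        with open(csv_path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            cols = ", ".join(f'"{name}"' for name in header)
            marks = ", ".join("?" for _ in header)
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
            while True:
                # Pull at most batch_size rows; only this batch is held in memory.
                batch = list(islice(reader, batch_size))
                if not batch:
                    break
                conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', batch)
                total += len(batch)
        conn.commit()
    finally:
        conn.close()
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / "sample.csv"  # tiny stand-in for the 50 GB file
        csv_path.write_text("id,grp\n1,a\n2,b\n3,a\n")
        db_path = Path(tmp) / "sample.db"
        print(load_csv_into_sqlite(csv_path, db_path, batch_size=2))
        conn = sqlite3.connect(str(db_path))
        # Placeholder aggregate: the database does the grouping, not the client.
        print(conn.execute("SELECT grp, COUNT(*) FROM records GROUP BY grp ORDER BY grp").fetchall())
        conn.close()
```

Once the data is loaded, the analysis tool only has to pull back query results, which are usually far smaller than the raw file.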
Or, you could just process the text file in situ. Command line tools such as awk, grep, and perl are very suitable for this task. I would recommend this approach for a one-time effort. I would recommend a database if you want to keep the data around for analytic purposes.
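The in-place approach boils down to reading one row at a time and keeping only a small running summary, which is what an awk one-liner does. As a rough sketch of that pattern, here it is in Python, used only for illustration; the function name `count_by_column` and the sample columns `id` and `grp` are hypothetical, and the per-group count stands in for whatever the real task needs.

```python
import csv
import io
from collections import Counter


def count_by_column(lines, column):
    """Count rows per distinct value of `column`, reading one row at a time.

    `lines` is any iterable of text lines with a header row, such as an open file.
    Memory use grows with the number of distinct values, not with the file size.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


if __name__ == "__main__":
    sample = io.StringIO("id,grp\n1,a\n2,b\n3,a\n")  # tiny stand-in for the real file
    print(dict(count_by_column(sample, "grp")))  # prints {'a': 2, 'b': 1}
```

For a real file, pass an open handle instead of the in-memory sample, for example `open(path, newline="")`; the file is then consumed as a stream rather than loaded whole.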