Are there any good workarounds to the GitHub 100MB file size limit for text files?

Posted by 我的梦境 on 2019-11-30 08:09:07

Clean and Smudge

You can use clean and smudge filters to compress your file. Normally this isn't necessary, since Git compresses objects internally, but since GitHub's limit applies to the size of the file as stored in the repository, it may help. The main commands would be:

git config filter.compress.clean gzip
git config filter.compress.smudge "gzip -d"

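GitHub will see this as a compressed file, but on each computer it will appear to be a text file. For the filter to take effect, a .gitattributes entry must assign it to the relevant files (for example, *.csv filter=compress), and because git config settings are not pushed with the repository, the two commands have to be run in every clone.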
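
To make the round trip concrete, here is a minimal Java sketch of what the two filter commands do to a file's bytes. It is only an illustration, not how Git runs the filters: the class name GzipRoundTrip and the method names clean and smudge are made up for this example, and the sample data is invented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: mimics "gzip" (clean) and "gzip -d" (smudge) in memory.
public class GzipRoundTrip {

    // What the clean filter produces: the gzip-compressed bytes that get stored.
    static byte[] clean(byte[] plain) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(plain);
        }
        return sink.toByteArray();
    }

    // What the smudge filter produces: the original text restored on checkout.
    static byte[] smudge(byte[] compressed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Invented sample data: a repetitive CSV, which compresses well.
        StringBuilder csv = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            csv.append("id,name,value\n");
        }
        byte[] plain = csv.toString().getBytes(StandardCharsets.UTF_8);
        byte[] stored = clean(plain);
        byte[] restored = smudge(stored);
        System.out.println("plain bytes:  " + plain.length);
        System.out.println("stored bytes: " + stored.length);
        System.out.println("round trip ok: " + Arrays.equals(plain, restored));
    }
}
```

How much this helps depends entirely on how compressible the file is; repetitive text usually shrinks a lot, while data that is already compressed will not.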

See https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes for more details.

Alternatively, you could have clean post the file to an online pastebin, such as http://pastebin.com/, and have smudge fetch it back from there. Many other combinations are possible with clean and smudge.

A very good solution is to use Git LFS:

https://git-lfs.github.com/

It is an open-source Git extension designed to work with large files.

You can create a script/program in any language to divide or unite files.

Here is an example that divides a file, written in Java (I used Java because I am more comfortable with it than with any other language, but any language would work, and some will be better suited than Java).

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

public class FileSplitter {
    public static void main(String[] args) throws Exception {
        RandomAccessFile raf = new RandomAccessFile("test.csv", "r");
        long numSplits = 10; // from user input, extract it from args
        long sourceSize = raf.length();
        long bytesPerSplit = sourceSize / numSplits;
        long remainingBytes = sourceSize % numSplits;

        int maxReadBufferSize = 8 * 1024; // 8KB
        for (int destIx = 1; destIx <= numSplits; destIx++) {
            BufferedOutputStream bw = new BufferedOutputStream(new FileOutputStream("split." + destIx));
            if (bytesPerSplit > maxReadBufferSize) {
                // Copy the piece in buffer-sized chunks, then the leftover tail.
                long numReads = bytesPerSplit / maxReadBufferSize;
                long numRemainingRead = bytesPerSplit % maxReadBufferSize;
                for (long i = 0; i < numReads; i++) {
                    readWrite(raf, bw, maxReadBufferSize);
                }
                if (numRemainingRead > 0) {
                    readWrite(raf, bw, numRemainingRead);
                }
            } else {
                readWrite(raf, bw, bytesPerSplit);
            }
            bw.close();
        }
        // Bytes left over by the integer division go into one extra, final piece.
        if (remainingBytes > 0) {
            BufferedOutputStream bw = new BufferedOutputStream(new FileOutputStream("split." + (numSplits + 1)));
            readWrite(raf, bw, remainingBytes);
            bw.close();
        }
        raf.close();
    }

    // Copies exactly numBytes from the current position of raf to bw.
    static void readWrite(RandomAccessFile raf, BufferedOutputStream bw, long numBytes) throws IOException {
        byte[] buf = new byte[(int) numBytes];
        raf.readFully(buf); // plain read() may return fewer bytes than requested
        bw.write(buf);
    }
}
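
For the "unite" direction, a counterpart could look roughly like the sketch below. It assumes the pieces are named split.1, split.2, and so on, as produced by the splitter in this answer, and that they sit together in one directory; the class name FileJoiner and the output name joined.csv are made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical counterpart to the splitter: glues split.1, split.2, ... back together.
public class FileJoiner {

    // Appends the pieces found in dir to target, in numeric order, and stops
    // at the first missing index. Returns the number of pieces joined.
    static int join(Path dir, Path target) throws IOException {
        int count = 0;
        try (OutputStream out = Files.newOutputStream(target)) {
            for (int ix = 1; ; ix++) {
                Path part = dir.resolve("split." + ix);
                if (!Files.exists(part)) {
                    break;
                }
                Files.copy(part, out); // streams the whole piece into the output
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "joined.csv" is an assumed output name; pieces are read from the current directory.
        int parts = join(Paths.get("."), Paths.get("joined.csv"));
        System.out.println("Joined " + parts + " parts");
    }
}
```

Because the pieces are cut at byte offsets rather than at line breaks, a single piece may end in the middle of a CSV row; only the rejoined file is guaranteed to be identical to the original.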

This costs almost nothing in either time or money.

Edit: You can build a Java executable and add it to your repository or, even easier, write a script in Python (or any other language) to do this and save it as plain text in your repository.
