Getting MimeType subtype with Apache Tika

爱一瞬间的悲伤 2020-12-29 10:35

I need to get the IANA-registered (iana.org) media type, rather than application/zip or application/x-tika-msoffice, for documents such as odt, ppt, pptx, xlsx, etc.

If you look at mim

4 answers
  •  悲&欢浪女
    2020-12-29 11:01

    You can use a custom Tika mime-types configuration file:

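    // Needs org.apache.tika.mime.MimeTypes and MimeTypesFactory, org.apache.tika.metadata.Metadata,
    // org.apache.tika.io.TikaInputStream, org.apache.tika.detect.DefaultDetector
    MimeTypes mimes = MimeTypesFactory.create(Thread.currentThread()
        .getContextClassLoader().getResource("tika-custom-MimeTypes.xml"));
    Metadata metadata = new Metadata();
    // The file name lets Tika use glob (extension) matching as a hint
    metadata.add(Metadata.RESOURCE_NAME_KEY, file.getName());
    String mimetype;
    try (TikaInputStream tis = TikaInputStream.get(file)) {
        mimetype = new DefaultDetector(mimes).detect(tis, metadata).toString();
    }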
    

    Put "tika-custom-MimeTypes.xml", with your changes, in WEB-INF/classes:

    In my case:

    [XML listing not preserved in the source]
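    For readers unfamiliar with the format, here is a minimal sketch of what a custom mime-types file of this kind might look like. It is not the answerer's file. It assumes Tika's mime-info XML format (mime-type, glob, sub-class-of elements), and the two entries are chosen only for illustration:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Hypothetical tika-custom-MimeTypes.xml; entries are illustrative assumptions -->
    <mime-info>
      <!-- OpenDocument text: map *.odt to the IANA-registered type -->
      <mime-type type="application/vnd.oasis.opendocument.text">
        <glob pattern="*.odt"/>
        <sub-class-of type="application/zip"/>
      </mime-type>
      <!-- PowerPoint (OOXML): map *.pptx to the IANA-registered type -->
      <mime-type type="application/vnd.openxmlformats-officedocument.presentationml.presentation">
        <glob pattern="*.pptx"/>
        <sub-class-of type="application/zip"/>
      </mime-type>
    </mime-info>
    ```

    With glob entries like these, detection relies on the file name passed in the metadata. How this interacts with content-based detection may depend on the Tika version and on which detectors are on the classpath, so treat it as a starting point rather than a drop-in file.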
