Friday, 3 July 2009

File Management in R: two recipes

Remove files with a specific pattern in R:

A quick, basic tip that comes in handy whenever you need to remove files from a directory quickly (note that pattern is a regular expression, not a shell wildcard):

junk <- dir(path="your_path", pattern="your_pattern", full.names=TRUE) # ?dir; full.names=TRUE keeps the path so file.remove() can find the files
file.remove(junk) # ?file.remove
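
To try the recipe without risking real data, here is a minimal sketch that runs it on a scratch directory. The directory name cleanup_demo and the file names are made up for illustration; only list.files() (the same function as dir()) and file.remove() come from base R.

```r
# Hypothetical setup: a scratch directory with a few files to clean up.
scratch <- file.path(tempdir(), "cleanup_demo")
dir.create(scratch, showWarnings = FALSE)
file.create(file.path(scratch, c("a.tmp", "b.tmp", "keep.txt")))

# pattern is a regular expression: escape the dot and anchor the end.
junk <- list.files(path = scratch, pattern = "\\.tmp$", full.names = TRUE)
print(basename(junk))          # inspect the matches before deleting anything

removed <- file.remove(junk)   # one TRUE/FALSE per file
remaining <- list.files(scratch)
```

Printing the matches first is a cheap safeguard, since file.remove() does not ask for confirmation.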

Compress multiple files/folders in separate zip files:

This tip came in handy when I had to compress a vast collection of folders, each containing the scans of a different issue of a comic book series, into separate .cbz files (zip files with a different extension). To create .cbz files instead of .zip files, just replace .zip with .cbz in the following code.

dirs <- basename(list.dirs(recursive=FALSE)) # ?list.dirs; immediate subfolders of the working directory
for (d in dirs) zip(paste0(d, ".zip"), files=d) # ?zip; one archive per folder
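
The same loop can be wrapped in a small function so that the target directory and the extension become arguments. This is only a sketch: zip_each_folder, root and ext are hypothetical names, and it assumes that utils::zip() can find an external zip program on your machine (it relies on one by default).

```r
# Hypothetical helper: zip every immediate subfolder of `root`
# into its own archive, using `ext` as the file extension.
zip_each_folder <- function(root = ".", ext = ".cbz") {
  old <- setwd(root)           # work inside root so archive paths stay relative
  on.exit(setwd(old))          # restore the working directory on exit
  folders <- basename(list.dirs(recursive = FALSE))
  for (f in folders) {
    utils::zip(zipfile = paste0(f, ext), files = f)
  }
  invisible(paste0(folders, ext))  # names of the archives requested
}
```

Calling zip_each_folder("your_path") would then leave one .cbz file next to each subfolder of your_path.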

Clearly, for more advanced needs, you can use system() and any of the Unix tools installed on your machine.

Note: This post was updated on 1/5/2012

8 comments:

  1. Yes, but this solution is likely cross-platform :-) which makes it even nicer.

  2. This comment has been removed by the author.

  3. What a coincidence. I have just written some code that retrieves the file list with the mtime information: books<-file.info(dir("D:\\Readings",full.names=TRUE,recursive=TRUE)). The problem is that there are hundreds of subdirectories and thousands of files, so this method is very time consuming, since it effectively reads the directories twice. I wonder whether there is any function that could retrieve the file names and mtime together.
    Thanks in advance.

  4. I am not sure I have understood your problem properly, but I hope the code below leads you to a solution:

    books<-file.info(dir(getwd(),full.names=TRUE,recursive=TRUE))
    out<-data.frame(filename=basename(rownames(books)),mtime=books$mtime)

  5. Thanks Paolo.
    dir() scans the directories/files, and then file.info() scans them again. So the directories/files are scanned twice, which takes a long time if there are thousands of files. In DOS, dir /a lists the necessary information in a single scan.

  6. It was really helpful!! Thanks Paolo :)

  7. You are welcome! If you are THE Soma I am thinking of, I hope everything is all right! :-)
