Processing a file character by character in Clojure

礼貌的吻别 2020-12-20 21:38

I'm working on writing a function in Clojure that will process a file character by character. I know that Java's BufferedReader class has the read() method that reads one

3 Answers

南方客 (original poster)

2020-12-20 22:12
I'm not familiar with Java or the read() method, so I won't be able to help you out with implementing it.

One first thought is to simplify by using slurp, which returns the text of the entire file as a string with just (slurp filename). However, this reads the whole file into memory, which may not be what you want.
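If loading the whole file is a concern, here is a rough sketch of reading one character at a time through clojure.java.io/reader instead. The helper name each-char, the seen atom, and the temporary file are made up for illustration; the only Java call used is Reader's read(), which returns an int and gives -1 at the end of the stream.

```clojure
(require '[clojure.java.io :as io])

(defn each-char
  "Calls f on every character from rdr, reading one at a time until end of stream.
  (Hypothetical helper, not part of any library.)"
  [f ^java.io.Reader rdr]
  (loop [c (.read rdr)]            ; .read returns an int, or -1 at end of stream
    (when-not (neg? c)
      (f (char c))                 ; convert the int back to a character
      (recur (.read rdr)))))

;; Demo on a temporary file so the snippet runs anywhere.
(def tmp (java.io.File/createTempFile "chars" ".txt"))
(spit tmp "abcd")

(def seen (atom []))
(with-open [rdr (io/reader tmp)]   ; with-open closes the reader afterwards
  (each-char #(swap! seen conj %) rdr))
;; @seen now holds the file's characters in order
```

For a real file you would pass its path to io/reader in place of tmp.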

Once you have a string of the entire file text, you can process any string character by character by simply treating it as though it were a sequence of characters. For example:
```
=> (doseq [c "abcd"]
     (println c))
a
b
c
d
=> nil
```
Or:
```
=> (remove #{\c} "abcd")
=> (\a \b \d)
```
You could use map, reduce, or any other sequence function. Note that once you manipulate the string as a sequence, the result is a sequence rather than a string, but you can wrap the outer expression in (reduce str ...) to turn it back into a string at the end:
```
=> (reduce str (remove #{\c} "abcd"))
=> "abd"
```
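As a side note, two other common ways to turn a sequence of characters back into a string are apply with str and clojure.string/join. This is only a small sketch; the names kept, via-apply, and via-join are made up for illustration.

```clojure
(require '[clojure.string :as string])

;; A lazy sequence of characters, as in the example above.
(def kept (remove #{\c} "abcd"))

;; str called once with all the characters as arguments.
(def via-apply (apply str kept))

;; join with no separator concatenates the elements.
(def via-join (string/join kept))
;; both via-apply and via-join are "abd"
```

Both avoid building an intermediate string at every step the way (reduce str ...) does, which may matter on long inputs.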
As for the problem with your specific code, I think it lies in what words is: a vector of strings. When you print words, you are printing the whole vector. If you replaced the line (println words) at the end with (doseq [w words] (println w)), it should work.

Also, based on what you say you want your output to look like (a vector of all the different words in the file), you wouldn't want (println w) as the innermost expression, because println prints its argument and returns nil. You would simply want w. You would also want to replace your doseqs with fors, again to avoid returning nil.
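To make the doseq versus for difference concrete, here is a small sketch. The lines vector is a made-up stand-in for the lines of your file, and I'm assuming split means clojure.string/split.

```clojure
(require '[clojure.string :as string])

;; Hypothetical input: each element stands for one line of the file.
(def lines ["the quick" "brown fox"])

;; doseq is for side effects: it prints each word and returns nil.
(def printed
  (doseq [line lines
          w (string/split line #"\s")]
    (println w)))

;; for builds a lazy sequence of the body's values; vec makes it a vector.
(def words
  (vec (for [line lines
             w (string/split line #"\s")]
         w)))
;; printed is nil; words is ["the" "quick" "brown" "fox"]
```

If you only want each different word once, wrapping the for in distinct before vec would be one way to do it.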

As for improving your code, it looks generally great to me. Building on the first change I suggested above (but not the others, since I don't want to write everything out explicitly), you could shorten it with a fun little trick:
```
;; split is assumed to be clojure.string/split, referred in by your code.
(doseq [item seq]
  (let [words (split item #"\s")]
    (doseq [w words]
      (println w))))

;; Could be rewritten as...

(doseq [item seq
        :let [words (split item #"\s")]
        w words]
  (println w))
```