REBOL
Verbose, perhaps, so definitely not a winner, but gets the job done.
min-length: 0
min-count: 0
common-words: [ "a" "an" "as" "and" "are" "by" "for" "from" "in" "is" "it" "its" "the" "of" "or" "to" "until" ]
add-word: func [
word [string!]
/local
count
letter
non-letter
temp
rules
match
][
; Strip out punctuation
temp: copy {}
letter: charset [ #"a" - #"z" #"A" - #"Z" #" " ]
non-letter: complement letter
rules: [
some [
copy match letter (append temp match)
|
non-letter
]
]
parse/all word rules
word: temp
; If we end up with nothing, bail
if 0 == length? word [
exit
]
; Check length
if min-length > length? word [
exit
]
; Ignore common words
ignore:
if find common-words word [
exit
]
; OK, its good. Add it.
either found? count: select words word [
words/(word): count + 1
][
repend words [word 1]
]
]
rules: [
some [
{"}
copy word to {"} (add-word word)
{"}
|
copy word to { } (add-word word)
{ }
]
end
]
words: copy []
parse/all read %c.txt rules
result: copy []
foreach word words [
if string? word [
count: words/:word
if count >= min-count [
append result word
]
]
]
sort result
foreach word result [ print word ]
The output is:
act
actions
all
allows
also
any
appear
arbitrary
arguments
assign
assigned
based
be
because
been
before
below
between
braces
branches
break
builtin
but
C
C like any other language has its blemishes Some of the operators have the wrong precedence some parts of the syntax could be better
call
called
calls
can
care
case
char
code
columnbased
comma
Comments
common
compiler
conditional
consisting
contain
contains
continue
control
controlflow
criticized
Cs
curly brackets
declarations
define
definitions
degree
delimiters
designated
directly
dowhile
each
effect
effects
either
enclosed
enclosing
end
entry
enum
evaluated
evaluation
evaluations
even
example
executed
execution
exert
expression
expressionExpressions
expressions
familiarity
file
followed
following
format
FORTRAN
freeform
function
functions
goto
has
high
However
identified
ifelse
imperative
include
including
initialization
innermost
int
integer
interleaved
Introduction
iterative
Kernighan
keywords
label
language
languages
languagesAlthough
leave
limit
lineEach
loop
looping
many
may
mimicked
modify
more
most
name
needed
new
next
nonstructured
normal
object
obtain
occur
often
omitted
on
operands
operator
operators
optimization
order
other
perhaps
permits
points
programmers
programming
provides
rather
reinitialization
reliable
requires
reserve
reserved
restrictions
results
return
Ritchie
say
scope
Sections
see
selects
semicolon
separate
sequence
sequence point
sequential
several
side
single
skip
sometimes
source
specify
statement
statements
storage
struct
Structured
structuresAs
such
supported
switch
syntax
testing
textlinebased
than
There
This
turn
type
types
union
Unlike
unspecified
use
used
uses
using
usually
value
values
variable
variables
variety
which
while
whitespace
widespread
will
within
writing